I am taking a programming course, and since we have not been introduced to string functions (puts(), gets(), etc) I need to take some input from the user using scanf.

The assignment is to read a DNA sequence, storing it in a char array. The array must be 512 characters, and accept only the characters 'G', 'A', 'T', and 'C'. Space, tab, and enter should be ignored. Additionally, the user enters 'X' to end the DNA sequence. If the user enters any character other than those allowed, the program must return -1 on exit (an error).

The input from the user would look something like: "GATCACTACX"

My problem is that I cannot seem to assign any values to my array. My code so far (just for reading the string) looks like this:

for ( x = 0; x < 512; x++ ) {

        scanf("%c", &sequencetemp);

        if ((sequencetemp=='A') || (sequencetemp=='T') || (sequencetemp=='G') || (sequencetemp=='C')){
            sequence[x] = sequencetemp;
        } else if (sequencetemp=='X'){
            break;
        }else{}

    }

Upon testing, my char array is just full of random junk characters. Is there something I am doing wrong? Does anyone have a better solution? Can I do this assignment without using string functions?

Any help would be appreciated.

Upon reading some of our textbook "C For Dummies", it really looks like this assignment is asking us for something which is really outside our knowledge. So I guess a solution using string functions is probably fine, if not essential.

Using "scanf" to read character input is a recipe for disaster, however small the purpose may be. Post your entire code and I will show you an alternative. PS: Also declare your array as char sequence[ BUFSIZ ] = {'\0'};
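To illustrate why the initializer helps, here is a minimal sketch. It assumes a 512-element array as in the assignment rather than BUFSIZ, and the names all_zero and demo_zero_init are made up for the example. Giving even one initializer value makes C zero-fill the rest of the array, so there is no junk and the string is always terminated.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if every byte of buf[0..len-1] is '\0'. */
int all_zero(const char *buf, size_t len)
{
    size_t i;

    assert(buf != NULL);
    for (i = 0; i < len; i++) {
        if (buf[i] != '\0')
            return 0;
    }
    return 1;
}

/* Hypothetical demo: stores two bases and reports whether the rest of
   the array is still zero-filled (1 means yes). */
int demo_zero_init(void)
{
    /* Only the first element is listed; the remaining 511 are zeroed too. */
    char sequence[512] = {'\0'};

    sequence[0] = 'G';
    sequence[1] = 'A';
    /* No manual terminator is needed: sequence[2] is already '\0'. */
    return all_zero(sequence + 2, sizeof sequence - 2);
}
```

Without the initializer, a local array starts with indeterminate contents, which would explain the junk characters.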

I'm afraid that IS my entire code (other than headers and such). I haven't progressed beyond actually reading the user's input at all, because I can't get anything to work.

Oh, and the professor really mustn't be teaching us very good C practices.. we didn't even learn about the null character. heh.

Can you show me the alternative to using scanf? It would be greatly appreciated.

To read a single char you can use getchar, which returns an int. You can read this and this for obtaining the whole string.

hmm... fgets seems useful, but how would I modify the method in the second "this" to read a sequence of up to 512 characters, adding only G,A,T, and C to the array, and stopping the input before the X at the end?

Could this be done in a loop?

Sorry, I didn't read your post carefully. I assume that you must use scanf, so then you must. OK, the problem is your loop: don't increment x every time.
Check this:

for ( x = 0; x < 511; ) { /* x < 511 leaves space for '\0' */

   /* I don't like this part, but if the teacher said so... The leading
      space in " %c" makes scanf skip spaces, tabs and newlines. */
   if (scanf(" %c", &sequencetemp) != 1) {
      printf("ERROR\n"); /* end of input before 'X' */
      return 1;
   }

   if ((sequencetemp=='A') || (sequencetemp=='T') || (sequencetemp=='G') || (sequencetemp=='C')){
      sequence[x++] = sequencetemp;
   }
   else if (sequencetemp=='X'){
      sequence[x++] = sequencetemp;
      sequence[x] = '\0';
      break;
   }
   else {
      printf("ERROR\n");
      return 1;
   }
}
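For comparison, here is a minimal sketch of the same idea wrapped in a function, so the index only advances when a base is actually stored. The name scan_sequence and the choice to read from a FILE * (pass stdin for keyboard input) are assumptions made for the example, and unlike the loop above it does not store the 'X'.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper, not from the thread: fills `out` (capacity `cap`,
   counting the final '\0') from `in`. Returns 0 once 'X' is read and
   -1 on a bad character, a full buffer, or input that ends early. */
int scan_sequence(FILE *in, char *out, size_t cap)
{
    size_t n = 0;
    char c;

    assert(in != NULL && out != NULL && cap > 0);
    out[0] = '\0';

    /* The space before %c tells fscanf to skip spaces, tabs and newlines.
       fscanf returns 1 only when it actually stored a character. */
    while (fscanf(in, " %c", &c) == 1) {
        if (c == 'X')
            return 0;                 /* end marker, not stored */
        if (c != 'G' && c != 'A' && c != 'T' && c != 'C')
            return -1;                /* invalid character */
        if (n + 1 >= cap)
            return -1;                /* would overflow the array */
        out[n++] = c;                 /* index moves only on a stored base */
        out[n] = '\0';                /* keep the string terminated */
    }
    return -1;                        /* input ended before 'X' */
}
```

From main this would be called as scan_sequence(stdin, sequence, sizeof sequence), returning -1 from main when it reports an error.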

This still doesn't give any input to the sequence array... when I try printing it, all of the values are still null.

I think I am just going to abandon scanf. Could you show me how to read the string (ending before X, GATC only, all others error) with the fgets() function?
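One way the fgets() route could look, as a rough sketch: read the input in chunks with fgets and filter each chunk character by character. The function name read_sequence_fgets, the 128-byte chunk size, and the FILE * parameter are assumptions for the example, not part of the assignment.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: stores the bases read from `in` in `out`
   (capacity `cap`, counting the final '\0'). Returns 0 when 'X' ends
   the sequence, -1 on a bad character, a full buffer, or missing 'X'. */
int read_sequence_fgets(FILE *in, char *out, size_t cap)
{
    char line[128];                   /* arbitrary chunk size */
    size_t n = 0;
    size_t i;

    assert(in != NULL && out != NULL && cap > 0);
    out[0] = '\0';

    /* fgets stops at a newline or when `line` is full, so a long
       sequence simply arrives in several chunks. */
    while (fgets(line, sizeof line, in) != NULL) {
        for (i = 0; line[i] != '\0'; i++) {
            unsigned char c = (unsigned char)line[i];

            if (isspace(c))
                continue;             /* space, tab and enter are ignored */
            if (c == 'X')
                return 0;             /* stop before the end marker */
            if (c != 'G' && c != 'A' && c != 'T' && c != 'C')
                return -1;            /* anything else is an error */
            if (n + 1 >= cap)
                return -1;            /* no room for another base */
            out[n++] = (char)c;
            out[n] = '\0';
        }
    }
    return -1;                        /* input ended without an 'X' */
}
```

Because fgets is given the buffer size, it cannot write past the end of line the way gets() can.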

What should happen if the user enters 'X' without filling the entire 512-element char array? What if he enters 'X' as the third character: what should the remaining array contain?

If you are using scanf, then:
input: G (hit enter) X (hit enter), your output: GX
If using getchar:
input: GX, your output: GX

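#include <stdio.h>
#include <ctype.h>

int main(void)
{
   int sequencetemp;          /* int, because getchar can return EOF */
   char sequence[512];
   unsigned short x;

   for ( x = 0; x < 511; ) { /* x < 511 leaves space for '\0' */

      sequencetemp = getchar();

      if ((sequencetemp=='A') || (sequencetemp=='T') || (sequencetemp=='G') || (sequencetemp=='C')){
         sequence[x++] = sequencetemp;
      }
      else if (sequencetemp=='X'){
         sequence[x++] = sequencetemp;
         break;
      }
      else if (isspace(sequencetemp)){
         continue; /* space, tab and enter are ignored */
      }
      else { /* invalid character or end of input */
         printf("ERROR\n");
         return 1;
      }
   }
   sequence[x] = '\0'; /* terminate even if the loop ran to the limit */
   puts("The string is: ");
   puts(sequence);

   return 0;
}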

I have actually finished the program using gets() and a bunch of other string/char functions.
There's only one more problem. We're not allowed to use global variables.
The assignment is really starting to piss me off.

Is there a way to pass strings between functions without using global variables? If so, can it be used here, without drastic modification of my existing functions?

Here's the finished code w/ global variables, for reference:

#include <stdio.h>
#include <string.h>
#include <ctype.h>

char initial_sequence[513] = {'\0'};
char fix_sequence[513] = {'\0'};

int transfer_sequence (void)
/*   
    This function takes no arguments and returns 1 if it finds an 
    unacceptable character in the sequence, 0 if the sequence 
    is a valid one. Its function is to take the user's input, discount
    spaces and tabs and other blank characters, search for 
    unacceptable characters, and transfer it into a new array,
    stopping the sequence before the 'X' used to denote the end.
*/
{

    int x; /*generic counters*/
    int y;
    int unacceptable_char;

    unacceptable_char=0;
    x = 0;
    y = 0;

    for ( x = 0; x <512; x++ ){
        if (!(isspace(initial_sequence[x]))){
            if (initial_sequence[x] != 'X')
                fix_sequence[y++]=initial_sequence[x];
        }
        /* y > 0 guards against reading fix_sequence[-1] before anything is stored */
        if((y > 0) && (fix_sequence[y - 1] != 'G') && (fix_sequence[y - 1] != 'A') && (fix_sequence[y - 1] != 'T') && (fix_sequence[y - 1] != 'C') && (fix_sequence[y - 1] != 'X')){
            unacceptable_char = 1;
        }
        if (initial_sequence[x] == 'X'){
            break;
        }
    }    

    return unacceptable_char;    
}

int check_gene (void)
/*
    This function takes no arguments and returns 1 if it finds a 
    specific gene in the sequence, 0 if the sequence does not
    contain the gene. Its function is to examine the sequence
    in its entirety, looking for the letters "TTGACA" and "TATAAT"
    occurring while separated by exactly 19 other characters.
*/    
{
    int gene_found;
    int x; /* generic counter */

    x = 0;
    gene_found = 0;

    for (x=0; x + 30 < 513; x++){ /* x + 30 must stay inside the 513-char array */
        if (((fix_sequence[x]) == 'T') && ((fix_sequence[x + 1]) == 'T') && ((fix_sequence[x + 2]) == 'G') && ((fix_sequence[x + 3]) == 'A') && ((fix_sequence[x + 4]) == 'C') && ((fix_sequence[x + 5]) == 'A')){
            if (((fix_sequence[x+25]) == 'T') && ((fix_sequence[x + 26]) == 'A') && ((fix_sequence[x + 27]) == 'T') && ((fix_sequence[x + 28]) == 'A') && ((fix_sequence[x + 29]) == 'A') && ((fix_sequence[x + 30]) == 'T')){
                gene_found = 1;
                break;
            }
        }
    }

    return gene_found;
}

int main (void)

/*  This function takes no arguments, and returns 0 or -1 to the OS,
    depending upon successful completion. It reads a string of input from
    the user, converts it into a usable DNA sequence, and then checks
    for invalid characters. Then, if a specific gene is found, it notifies the
    user. When finding an invalid character in the DNA sequence, it
    returns a value of -1. If not, it returns 0. */

{
    int unacceptable_char;
    int gene_found;
    
    gene_found = 0;
    unacceptable_char = 0;
    
    printf("Please enter your DNA Sequence (X to mark the end of the sequence).\n");
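    /* fgets instead of gets: gets cannot limit the input length and was removed in C11 */
    fgets(initial_sequence, sizeof initial_sequence, stdin);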

    unacceptable_char = transfer_sequence(); /*check for bad chars*/

    gene_found = check_gene(); /* check for gene */

    printf("\n");

    if ((gene_found) && !(unacceptable_char)){
        printf("\nGene found!\n\n");
    } else if (!(unacceptable_char)) {
        printf("\nNo gene found!\n\n");
    } else {
        printf("\nInvalid character found in sequence.\n\n");
    }

    if (!(unacceptable_char)){
        return (0);
    }

    return -1;
}
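The gene test described in the check_gene comment ("TTGACA" and "TATAAT" separated by exactly 19 other characters) can also be written so that it never looks past the end of the string. This is only a sketch under that reading of the comment; the name has_promoter is made up, and it takes the sequence as a parameter instead of using the global.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if `seq` contains "TTGACA", then exactly
   19 other characters, then "TATAAT"; returns 0 otherwise. */
int has_promoter(const char *seq)
{
    static const char first[]  = "TTGACA";
    static const char second[] = "TATAAT";
    const size_t motif = 6;                    /* length of each pattern */
    const size_t gap = 19;                     /* characters in between */
    const size_t span = motif + gap + motif;   /* 31 characters in total */
    size_t len;
    size_t i;

    assert(seq != NULL);
    len = strlen(seq);

    /* i + span <= len keeps both comparisons inside the string. */
    for (i = 0; i + span <= len; i++) {
        if (memcmp(seq + i, first, motif) == 0 &&
            memcmp(seq + i + motif + gap, second, motif) == 0)
            return 1;
    }
    return 0;
}
```

The second pattern starts at offset 6 + 19 = 25 from the first, which matches the x + 25 index used in check_gene.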

Declare both globals as locals inside main and pass them to the functions as arguments.

int transfer_sequence (char * initial_sequence, char * fix_sequence);
int check_gene (char * fix_sequence);
int main()
{
   char initial_sequence[513] = {'\0'};
   char fix_sequence[513] = {'\0'};
   /* stuff */
   unacceptable_char = transfer_sequence(initial_sequence,    fix_sequence); /*check for bad chars*/

   gene_found = check_gene(fix_sequence); /* check for gene */
   /* rest of stuff */
   return 0;
}
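Why this works: when an array is passed to a function, the function receives a pointer to the array's first element, so it reads and writes the caller's array directly. A minimal sketch, assuming a made-up helper called strip_blanks that also takes the destination size so it cannot overflow:

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: copies `src` into `dst` without whitespace.
   `dst_cap` counts the final '\0'. Returns the number of characters
   copied; extra input is dropped if `dst` is too small. */
size_t strip_blanks(const char *src, char *dst, size_t dst_cap)
{
    size_t n = 0;

    assert(src != NULL && dst != NULL && dst_cap > 0);

    for (; *src != '\0' && n + 1 < dst_cap; src++) {
        if (!isspace((unsigned char)*src))
            dst[n++] = *src;          /* writes into the caller's array */
    }
    dst[n] = '\0';
    return n;
}
```

In main, with char raw[513] and char clean[513] declared as locals, the call would be strip_blanks(raw, clean, sizeof clean); afterwards clean holds the result with no global variables involved.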

wow. I didn't think that would work, but it totally did! Thanks a bunch, Daniweb guys.

Well, character comparisons in C are case-sensitive, so 'A' and 'a' are treated differently. If you enter a instead of A, nothing is assigned to the array but the subscript is still incremented, so a junk value is left in that location. Maybe this is what is happening in your program.

I have just assumed this possibility.

Thanks for the... err.. "insightful" comment, but the initial problem was solved long ago, and trust me, I am not foolish enough to consistently enter lower-case letters in my own program.
