Here is what I have so far. I am stuck on this assignment: read whitespace-separated words from a file, convert all words to a single case, and remove any non-alphanumeric characters from both ends of each word. The program has to count the occurrences of each word, then write a list of word/count pairs, sorted by word, to a file. I don't know what to do next. Can someone give me a head start?

#include <stdio.h>

/* counts words in the file */
int main(void) /* C89 ANSI */
{
    FILE *fp;
    int r; /* a variable for result of a function, returning int */
    size_t n; /* the words counter */
    const char *filename = "test.txt"; /* name of the file to open for reading */
    char word[100]; /* buffer used to check whether a non-empty word was read */

    if ((fp = fopen(filename, "r")) == NULL) {
        fprintf(stderr, "error: file" "\n");
        return 1;
    }
    /* if can't open the file for read
     then print an error message and return failure to the environment */

    n = 0; /* set the counter of words to zero */
    word[0] = '\0'; /* set the word array to an empty state */
    while ((r = fscanf(fp, "\t%99[^ \n,]%*c", word)) != EOF) {
        if (r == 1 && word[0] != '\0')
            n++;
        /* if the word array got something,
         then it was a word, count it */
        else
            fgetc(fp);
        /* nothing matched, so the next character is a ',':
         skip it, otherwise the loop would never advance */

        word[0] = '\0'; /* set the word back to an empty state */
    }
    /* skipping words delimited by ' ' or '\n' or ','
     while file fp can be read, continue skipping
     and count every skip */

    if (ferror(fp) != 0) { /* check the file for read error if EOF occurred */
        fprintf(stderr, "error: read file" "\n");
        fclose(fp);
        return 1;
    }
    /* if there was an error while reading the file
     then print error, close the file (because it was opened though)
     and return failure to the environment */

    if (n == 1) /* control "to be" and endings for word or words */
        printf("there is %lu word" "\n", (unsigned long)n);
    else
        printf("there are %lu words" "\n", (unsigned long)n);

    fclose(fp); /* close the file */

    return 0; /* return success to the environment */
}

Write out the algorithm first in pseudo-code. If words can only contain alpha characters (no numbers, special characters, etc), then incorporate that into the pseudo-code, skipping past the next white-space character(s). Myself, I would read the file one character at a time, and use a finite-state machine algorithm to accumulate characters into a word until either a disallowed character, or whitespace (including carriage returns and linefeeds) is encountered. If whitespace is encountered, then the word is looked up in the dictionary and the instance count is incremented, or if not found, then added to the dictionary with an instance count of 1.
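To make that concrete, here is a rough sketch of the character-at-a-time approach in C89. It is only one way to do it, and it makes several assumptions that are not in the assignment: the "dictionary" is a fixed-size array searched linearly, words longer than MAX_WORD - 1 characters are truncated, and all the names (MAX_WORD, MAX_WORDS, add_word, trim, count_words, write_counts) are made up for illustration. Following the original post, a token ends only at whitespace, and non-alphanumeric characters are stripped from both ends afterwards.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_WORD 100   /* assumed limit on word length (hypothetical) */
#define MAX_WORDS 1000 /* assumed limit on distinct words (hypothetical) */

struct entry {
    char word[MAX_WORD];
    unsigned long count;
};

static struct entry dict[MAX_WORDS]; /* the "dictionary": a plain array */
static size_t dict_size = 0;

/* Look the word up; increment its count, or add it with a count of 1. */
static void add_word(const char *w)
{
    size_t i;

    for (i = 0; i < dict_size; i++) {
        if (strcmp(dict[i].word, w) == 0) {
            dict[i].count++;
            return;
        }
    }
    if (dict_size < MAX_WORDS) { /* silently ignore words past the limit */
        strcpy(dict[dict_size].word, w);
        dict[dict_size].count = 1;
        dict_size++;
    }
}

/* Strip non-alphanumeric characters from both ends of buf, in place. */
static void trim(char *buf)
{
    size_t start = 0, end = strlen(buf);

    while (start < end && !isalnum((unsigned char)buf[start]))
        start++;
    while (end > start && !isalnum((unsigned char)buf[end - 1]))
        end--;
    memmove(buf, buf + start, end - start);
    buf[end - start] = '\0';
}

/* qsort comparison function: order entries by word. */
static int by_word(const void *a, const void *b)
{
    return strcmp(((const struct entry *)a)->word,
                  ((const struct entry *)b)->word);
}

/* Two states: between tokens (len == 0) or inside a token (len > 0). */
void count_words(FILE *in)
{
    char buf[MAX_WORD];
    size_t len = 0;
    int c;

    assert(in != NULL);
    do {
        c = fgetc(in);
        if (c == EOF || isspace(c)) {
            if (len > 0) { /* a token just ended */
                buf[len] = '\0';
                trim(buf);
                if (buf[0] != '\0')
                    add_word(buf);
                len = 0;
            }
        } else if (len < MAX_WORD - 1) {
            buf[len++] = (char)tolower(c); /* convert to a single case */
        }
    } while (c != EOF);
    qsort(dict, dict_size, sizeof dict[0], by_word);
}

/* Write one "word count" pair per line, already sorted by word. */
void write_counts(FILE *out)
{
    size_t i;

    for (i = 0; i < dict_size; i++)
        fprintf(out, "%s %lu\n", dict[i].word, dict[i].count);
}
```

To use it, a main function would open the input with fopen, call count_words on it, open the output file, and call write_counts. A linear search is fine for small files; for large inputs you would probably want a hash table or a sorted structure instead.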

Once the pseudo-code is satisfactory, then you can start coding. This is a good example of why starting with code is not a good idea. It just obfuscates the issues you need to deal with.
