Hello everyone, I need a little help from you guys.

I need to find the positions of some strings, which are stored in a file named "queryfile", within another file named "datafile".
However, my program only works for a single word; it can't find the position of a phrase or sentence.
Thank you so much!

my program

#include <stdio.h>
#include <string.h>


int main()
{
    FILE *queryfile;
    queryfile = fopen("op2query.txt","r");
    FILE *datafile;
    datafile = fopen("op2data.txt","r" );
    int i = 1;
    char word[99];
    char search[99];

    if(queryfile==NULL) {
        printf("Error in reading Query File");
    }

    if(datafile==NULL) {
        printf("Error in reading Data File");
    }

    while (!feof(queryfile)) {
        fscanf(queryfile,"%s", &search);
        while(!feof(datafile)){
            fscanf(datafile,"%s", &word);
            if (strcmp(word,search)==0){
                printf("\n %i %s ", i, search);
                rewind(datafile);
                i=1;
                break;
            }
            else  i++;
        }
    }

    fclose(datafile);
    fclose(queryfile);

    return 0;
}

op2query.txt
wisdom
season
age of foolishness

op2data.txt
it was the best of times it was the worst of times it was the age
of wisdom it was the age of foolishness it was the epoch of belief
it was the epoch of incredulity it was the season of light it was
the season of darkness it was the spring of hope it was the winter
of despair

result
18 wisdom
40 season
16 age
5 of
24 foolishness

THANK YOU!!!!!

I don't think you should use fscanf to read a string that may contain spaces, because fscanf with "%s" stops at the first whitespace character. Use fgets instead:

fgets(search,sizeof(search),queryfile);
fgets(word,sizeof(word),datafile);

Also note that the name of a char array already converts to a pointer to its first character, so when you pass it to scanf, or to any function that expects a pointer to char, you should drop the '&'.
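
To make the two points above concrete, here is a minimal sketch. The helper name read_line is made up for this example (it is not a standard function); it wraps fgets and cuts off the newline that fgets leaves in the buffer, and the array is passed by name, without '&'.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of the standard library):
   read one line from f into buf (capacity n) and drop the
   newline that fgets keeps at the end of the line.
   Returns 1 on success, 0 at end of file or on a read error. */
int read_line(char *buf, size_t n, FILE *f)
{
    if (fgets(buf, (int)n, f) == NULL)
        return 0;
    buf[strcspn(buf, "\n")] = '\0';   /* cut at the first '\n', if any */
    return 1;
}
```

With the variables from the original program, a call would look like read_line(search, sizeof search, queryfile). Note that search is passed as is, with no '&' in front of it.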

Edited 1 Year Ago by Gà_1

Thanks, but it's not working...
I don't get any results after I changed it.

Comments
Tomorrow I will fix it for you. When you use fgets on the query file, the string will contain a '\n', and you cannot just search word by word for a multi-word string.

The problem is not as simple as you think. You cannot just read word by word, because a string in the query file may contain space characters.

Here is the result when I run your test case:
op2query.txt
wisdom
season
age of foolishness

op2data.txt
it was the best of times it was the worst of times it was the age
of wisdom it was the age of foolishness it was the epoch of belief
it was the epoch of incredulity it was the season of light it was
the season of darkness it was the spring of hope it was the winter
of despair

Result:
18 wisdom
40 season
22 age of foolishness

Attention! Both of your files should end with an empty line (that is, the last line of text should be followed by a newline).

Here is my code for your problem; it works well with all kinds of query files. I think it is self-explanatory, but if you find something unclear, please ask me:

#include<stdio.h>
#include<string.h>

static const char *queryLink="d:\\file\\text\\op2query.txt";
static const char *dataLink="d:\\file\\text\\op2data.txt";

int freads(char *,int,FILE *);
int countWord(const char *);

int main(void) {
  FILE *queryFile=fopen(queryLink,"r");
  FILE *dataFile=fopen(dataLink,"r");
  if(!queryFile||!dataFile) {
    puts("Error on reading file(s).");
    getchar();
    return -1;
  }
  do {
    char key[500];
    if(freads(key,sizeof(key),queryFile)==-1)
      strcpy(key,"");
    int loc=0;
    do {
      loc++;
      /* temp starts empty so the strcat below is safe if fscanf reads nothing */
      char scan[500]="",temp[500]="";
      fscanf(dataFile,"%s",temp);
      strcat(scan,temp);
      int pos=(int)ftell(dataFile);
      for(int i=1;i<countWord(key);i++) {
        if(fscanf(dataFile,"%s",temp)==EOF)
          strcpy(temp,"");
        strcat(scan," ");
        strcat(scan,temp);
      }
      if(!strcmp(scan,key)) {
        printf("%d %s\n",loc,scan);
        break;
      }
      scan[0]='\0';
      if(fgetc(dataFile)!=EOF)
        fseek(dataFile,pos,SEEK_SET);
    }
    while(!feof(dataFile));
    rewind(dataFile);
  }
  while(!feof(queryFile));
  fclose(queryFile);
  fclose(dataFile);
  return 0;
}

int freads(char *s,int n,FILE *f) {
  if(!fgets(s,n,f))
    return -1;
  s[strcspn(s,"\n")]='\0';  /* drop the trailing newline, if there is one */
  return 0;
}
int countWord(const char *s) {
  int r=1;
  for(int i=0;i<(int)strlen(s);i++)
    if(s[i]==' ')
      r++;
  return r;
}

My idea is quite simple. First, count how many words are in "key" (a string from queryFile). Then, starting at each word of dataFile in turn, from the beginning to the end of the file, build a string of that many words in "scan" and compare it with "key". After the first word is read, the file position is saved in "pos"; if the two strings are not equal, "scan" is reset and the file is sought back to "pos", so the next attempt starts at the next word. If a match is found, break out of the inner loop and rewind dataFile for the next "key".

Those are the main ideas; the other details in the code above are just error handling. Sorry for my terrible grammar.

Edited 1 Year Ago by Gà_1: more details

  1. Read the data file into a single string, replacing new lines with spaces.
  2. For each string in the query file, use strstr() to find the position in the string from the data file.
  3. Done.
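
The steps above can be sketched roughly as follows. This is only an illustration: read_flat and word_position are names invented here, the caller is assumed to supply a buffer big enough for the whole data file, and the text is assumed not to start with a space. Because strstr() matches substrings, a query such as "age" would also match inside a longer word such as "page".

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read the whole stream into buf (capacity n, n >= 1),
   turning every newline into a space. Input beyond n-1 characters is ignored.
   Returns the number of characters stored. */
size_t read_flat(FILE *f, char *buf, size_t n)
{
    size_t len = 0;
    int c;
    while (len + 1 < n && (c = fgetc(f)) != EOF)
        buf[len++] = (c == '\n') ? ' ' : (char)c;
    buf[len] = '\0';
    return len;
}

/* Hypothetical helper: 1-based word number at which phrase first occurs
   in text, or 0 if it does not occur. Words are separated by spaces. */
int word_position(const char *text, const char *phrase)
{
    const char *hit = strstr(text, phrase);
    if (hit == NULL)
        return 0;
    int words = 1;
    /* every space that is followed by a non-space starts a new word */
    for (const char *p = text; p < hit; p++)
        if (*p == ' ' && p[1] != ' ')
            words++;
    return words;
}
```

Typical use would be to call read_flat once on the data file and then call word_position for each line read from the query file (with its trailing newline removed). On the sample data from this thread, that gives 18 for "wisdom", 40 for "season" and 22 for "age of foolishness".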
Comments
Much better than my idea. But can you please explain further how we can find the "position by number" using strstr()?

The strstr() function returns a pointer to the start of the first match of the target string within the source string. Since both are pointers into the same string, you simply subtract the source string's address from the result pointer (pointer arithmetic) and you get the offset of the target within the source, in bytes; i.e., this is the position of your target. This is an efficient method, and one I use regularly with log file data in order to extract the bits I am interested in. See the man page for the strstr() function. If you need a copy, let me know and I'll post it here.

Here is an example:

const char* pStart = source_string_address;
const char* pTarget = target_string_address;
const char* pResults = strstr(pStart, pTarget);
// Assuming that pResults is not null
ptrdiff_t results_location = pResults - pStart;  // ptrdiff_t is declared in <stddef.h>
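
The snippet above assumes that the target is found. A small sketch that also covers the "not found" case might look like this; find_offset is a made-up name, and returning -1 for "not found" is just one possible convention.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: byte offset of the first occurrence of target
   in source, or -1 if target does not occur. */
ptrdiff_t find_offset(const char *source, const char *target)
{
    const char *hit = strstr(source, target);
    return hit ? hit - source : -1;
}
```

For instance, find_offset("it was the age of wisdom", "age") is 11, because "age" starts 11 bytes after the beginning of the string.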

Edited 1 Year Ago by rubberman

Comments
It seems that your method finds the position in characters. Can you please explain how this method works for counting in words (as in the OP's problem)?

Keep a count with the target_strings (terms) in a map. If you get results from a search term, you take pResults + 1 and search again, adding to the count for that term, looping with this until strstr() returns null. Then you go to the next term, and start again. So, this can be done in a dual nested loop, the outer iterating through the map of terms, and the inner searching for that term until null is returned by strstr(). Try this, and show what you have done for that algorithm.
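
As a rough sketch of the inner loop described above (the name count_occurrences is invented for this example; in plain C the "map" could simply be an array of terms with a parallel array of counts):

```c
#include <string.h>

/* Hypothetical helper: count how many times term occurs in text.
   After each hit the search restarts one character past the start of
   that hit, so overlapping occurrences are counted as well. */
int count_occurrences(const char *text, const char *term)
{
    if (*term == '\0')
        return 0;               /* an empty term would match everywhere */
    int count = 0;
    for (const char *p = strstr(text, term); p != NULL; p = strstr(p + 1, term))
        count++;
    return count;
}
```

The outer loop would then call this once for every term in the query file and store the result next to that term.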
