Jan 7th, 2009

searching for keywords in multiple text files

Hello,
I'm working on code for a college project. The goal is to find keywords, and extract the sentences that contain them, in a set of text files that I have already downloaded from the Internet with another program. These text files are the page sources of various websites, and the keywords to search for are stored in another text file called "keywords.txt". The program should search for every keyword in every text file, so I tried to do it with nested while loops. I do get some results, but my code only searches for the first keyword, the one on the top line of "keywords.txt"; the other keywords are never searched. That one keyword is searched for in all the text files, whose locations and names I read from a file called "addresses.txt". Could you please look at my code and tell me what is wrong with it, and what I should change so that it searches for all the keywords in "keywords.txt"?

I would also appreciate some hints on how to manage this with the vector class. Arrays are not really appropriate here, because I have to change their sizes by hand every time I add new addresses or keywords. I have some knowledge of vectors, but I couldn't work them into my code. Here is the code I have written:
--------------------------------------------------------------------------------
#include <iostream>
#include <fstream>
#include <string>
#include <sstream> // string stream class
#include <vector>
using namespace std;

void Search_Keyword(void)
{
int j = 0;
int i = 0;
size_t position = 0;
string line;
string keyword;
string keyword_Array[10];
string name_Array[13];
string link_Array[13];
string link;
string name;
ifstream addresses("addresses.txt");
ifstream keywords("keywords.txt");
keyword = "";

while(keywords>>keyword)
{
keyword_Array[j] = keyword;
string search_Str = keyword_Array[j];

while(addresses>>link>>name)
{
name_Array[i] = "URLS/";
name_Array[i] += name;
name_Array[i] += ".txt";
link_Array[i] = link;
ifstream url_txt(name_Array[i].c_str());
while(getline(url_txt,line))
{
string word;
istringstream myObj(line);
while(myObj>>word)
{
if ((word == search_Str))
{
cout<<"found "<<search_Str<<" in "<<name_Array[i]<<endl;
}
}
}
url_txt.close();
if(i < 12)
i++;
}

j++;
}
addresses.close();
keywords.close();
}

int main()
{
Search_Keyword();
return 0;
}
--------------------------------------------------------------------------------

Thanks for your help
Posted by serhannn

Jan 7th, 2009

Re: searching for keywords in multiple text files

I'm not sure if this will work, but try

    while(keywords>>keyword != NULL)
Posted by MatEpp

Jan 9th, 2009

Re: searching for keywords in multiple text files

Thanks for the reply, but it doesn't work.
Posted by serhannn

Jan 9th, 2009

Re: searching for keywords in multiple text files

> while(addresses>>link>>name)
Having got to the end of the addresses file once, where do you think you'll start off from with the second word from the keywords file?

Even for a short program, your indentation is misleading and needs work.
Posted by Salem

Jan 9th, 2009

Re: searching for keywords in multiple text files

Quote originally posted by Salem:
> > while(addresses>>link>>name)
> Having got to the end of the addresses file once, where do you think you'll start off from with the second word from the keywords file?
>
> Even for a short program, your indentation is misleading and needs work.

If you are using VS2005+, press Ctrl+A to select everything, then Ctrl+K followed by Ctrl+F. This runs the Format Selection command, which fixes all of your indentation =)
Posted by skatamatic

Jan 9th, 2009

Re: searching for keywords in multiple text files

As for the actual problem, I would go with a simpler method. Push all the keywords into a vector of strings, then load the words of the file to check (a .txt file) into another vector of strings. Then filter out the keywords manually, or get a little more elegant with a function such as std::set_intersection() (from <algorithm>; both vectors must be sorted first) to find the words the two have in common, pushed into another vector. You can then use the resulting vector to do whatever parsing you wish (add the URL info, etc.).

This will cut your code down by about 70%, and will get you brownie points with your teacher =)
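
As a rough illustration of the set-intersection idea, assuming a C++11 compiler and assuming the file has already been split into words: `std::set_intersection` from `<algorithm>` requires both ranges to be sorted, so the hypothetical helper `common_words` below sorts and de-duplicates its own copies first. It only reports which keywords occur, not where, so on its own it would not recover the sentences that contain them.

```cpp
#include <algorithm>
#include <iterator>
#include <string>
#include <vector>

// Return the keywords that also appear among the words of a file.
// Both vectors are taken by value because they are sorted in place.
std::vector<std::string> common_words(std::vector<std::string> keywords,
                                      std::vector<std::string> file_words) {
    // std::set_intersection requires sorted input ranges.
    std::sort(keywords.begin(), keywords.end());
    std::sort(file_words.begin(), file_words.end());

    // Remove duplicates so that each matching keyword is reported once.
    keywords.erase(std::unique(keywords.begin(), keywords.end()),
                   keywords.end());
    file_words.erase(std::unique(file_words.begin(), file_words.end()),
                     file_words.end());

    std::vector<std::string> found;
    std::set_intersection(keywords.begin(), keywords.end(),
                          file_words.begin(), file_words.end(),
                          std::back_inserter(found));
    return found;
}
```

The result comes back in sorted order rather than in the order of "keywords.txt", which is a side effect of the sorting step.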
Last edited by skatamatic; Jan 9th, 2009 at 2:37 pm.
Posted by skatamatic

Jan 10th, 2009

Re: searching for keywords in multiple text files

Quote originally posted by skatamatic:
> As for the actual problem, I would go with a simpler method. Push all the keywords into a vector of strings, then load the words of the file to check (a .txt file) into another vector of strings. Then filter out the keywords manually, or get a little more elegant with a function such as std::set_intersection() (from <algorithm>; both vectors must be sorted first) to find the words the two have in common, pushed into another vector. You can then use the resulting vector to do whatever parsing you wish (add the URL info, etc.).
>
> This will cut your code down by about 70%, and will get you brownie points with your teacher =)

Thanks for the explanation, but I still don't understand how to load the text files I need to search into a string vector. Should I read them line by line with "getline" and push each line into a vector, or is there another way?
Posted by serhannn

Jan 10th, 2009

Re: searching for keywords in multiple text files

Quote originally posted by serhannn:
> Thanks for the explanation, but I still don't understand how to load the text files I need to search into a string vector. Should I read them line by line with "getline" and push each line into a vector, or is there another way?

You can load the files using an ifstream.

    ifstream inFile;
    vector<string> data;
    inFile.open("File.txt");
    while (!inFile.eof())
    {
        string sString = inFile.getline();
        data.push_back(sString);
    }

Something like that should do the trick for filling a vector from a file. It might not be syntactically correct, since I didn't try to compile it.
Posted by skatamatic

Jan 10th, 2009

Re: searching for keywords in multiple text files

Quote originally posted by skatamatic:
>     ifstream inFile;
>     vector<string> data;
>     inFile.open("File.txt");
>     while (!inFile.eof())
>     {
>         string sString = inFile.getline();
>         data.push_back(sString);
>     }

Don't bother with eof(), and don't put declarations in loops.
Simplified:

    ifstream file("name");
    vector<string> lines;
    string str;

    if(file.is_open())
    {
        // getline() returns the stream, which tests false once reading fails,
        // so the loop ends cleanly at end-of-file.
        while(getline(file, str))
        {
            lines.push_back(str);
        }

        file.close();
    }
    else
    {
        // Failed to open the file: set error events, write logs, etc.
    }
Posted by MosaicFuneral

Jan 10th, 2009

Re: searching for keywords in multiple text files

Thanks, everyone. I have solved the problem =) It was a basic loop error.
Posted by serhannn

This thread is solved
