searching for keywords in multiple text files
Thread Solved |
•
•
Join Date: Jan 2009
Posts: 7
Reputation:
Solved Threads: 0
Hello,
I'm working on a program for my college project. The goal is to find and extract keywords, along with the sentences that contain them, from many text files that I have already downloaded from the Internet using another program. These text files are actually the source code of different websites, and my program needs to search them for certain keywords, which I store in another text file called "keywords.txt". It should search for all keywords in all text files, so I tried to do this with some while loops. Although I got some results, my code only searches for the first keyword, the one on the top line of "keywords.txt"; the other keywords are not searched for at all. I can search for only this one keyword in all the text files, whose locations and names I read from a file called "addresses.txt". Could you please look at my code and tell me what is wrong with it, and what I should do so that it searches for all the keywords in "keywords.txt"?
I would also appreciate some hints on how to manage this with the vector class. Arrays are not really appropriate, because I have to change their sizes manually every time I add new addresses or keywords. I have some knowledge of vectors, but I couldn't work them into my code. Here is the code I have written:
```cpp
#include <iostream>
#include <fstream>
#include <string>
#include <sstream> // string stream class
#include <vector>
using namespace std;

void Search_Keyword(void)
{
    int j = 0;
    int i = 0;
    size_t position = 0;
    string line;
    string keyword;
    string keyword_Array[10];
    string name_Array[13];
    string link_Array[13];
    string link;
    string name;
    ifstream addresses("addresses.txt");
    ifstream keywords("keywords.txt");
    keyword = "";
    while(keywords>>keyword)
    {
        keyword_Array[j] = keyword;
        string search_Str = keyword_Array[j];
        while(addresses>>link>>name)
        {
            name_Array[i] = "URLS/";
            name_Array[i] += name;
            name_Array[i] += ".txt";
            link_Array[i] = link;
            ifstream url_txt(name_Array[i].c_str());
            while(getline(url_txt,line))
            {
                string word;
                istringstream myObj(line);
                while(myObj>>word)
                {
                    if ((word == search_Str))
                    {
                        cout<<"found "<<search_Str<<" in "<<name_Array[i]<<endl;
                    }
                }
            }
            url_txt.close();
            if(i < 12)
                i++;
        }
        j++;
    }
    addresses.close();
    keywords.close();
}

int main()
{
    Search_Keyword();
    return 0;
}
```
Thanks for your help
•
•
Join Date: Jan 2009
Posts: 79
Reputation:
Solved Threads: 12
I'm not sure if this will work, but try
```cpp
while(keywords>>keyword != NULL)
```
•
•
Join Date: Nov 2007
Posts: 390
Reputation:
Solved Threads: 39
If you are using VS2005+, press Ctrl+A to select everything, then Ctrl+K followed by Ctrl+F. This runs the Format Selection command and fixes all of your indentation =)
•
•
Join Date: Nov 2007
Posts: 390
Reputation:
Solved Threads: 39
As for the actual problem, I would go with a simpler method. Push all the keywords into a vector of strings, then load the text file you want to check into another vector of strings. Then filter out the keywords manually, or get a little more elegant and use a function such as std::set_intersection() (which needs both ranges sorted) to collect the entries common to the two into a third vector. You can then use the resulting vector for whatever further processing you wish (adding the URL info, etc.).
This will cut your code down by about 70%, and will get you brownie points with your teacher =)
Last edited by skatamatic; Jan 9th, 2009 at 2:37 pm.
•
•
Join Date: Jan 2009
Posts: 7
Reputation:
Solved Threads: 0
Thanks for the explanation, but I still don't understand how I can load text files, which I need to search, into a string vector. Should I get them line by line and load into a vector using "getline" or is there another thing to do?
•
•
Join Date: Nov 2007
Posts: 390
Reputation:
Solved Threads: 39
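```cpp
#include <fstream>
#include <string>
#include <vector>
using namespace std;

int main()
{
    ifstream inFile;
    vector<string> data;
    inFile.open("File.txt");
    string sString;
    // getline() is a free function that takes the stream and the target string.
    // Testing its result also avoids the extra pass that while (!inFile.eof()) makes.
    while (getline(inFile, sString))
        data.push_back(sString);
    return 0;
}
```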
Something like that should do the trick for filling a vector from a file. It might not be syntactically correct, since I didn't try to compile it.
Simplified:
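```cpp
#include <fstream>
#include <iostream>
#include <string>
#include <vector>
using namespace std;

int main()
{
    ifstream file("name");
    vector<string> lines;
    string str;
    if (file.is_open())
    {
        // getline returns the stream, which tests false at end of file or on error.
        while (getline(file, str))
        {
            lines.push_back(str);
        }
        file.close();
    }
    else
    {
        // Failed to open: set error events, write logs, etc.
        cerr << "could not open file" << endl;
    }
    return 0;
}
```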
"In any case, I am convinced that the Old One does not play dice."
"I became very sensitive to what will happen to all this and all of us." -Two geniuses named Albert