Aug 7th, 2009

I ran out of vector space?

Here is my code:

    #include <algorithm>
    #include <cstdlib>   // exit() (originally included for atoi)
    #include <fstream>
    #include <iostream>
    #include <string>
    #include <vector>

    using namespace std;

    void usage()
    {
        cout << "Usage: <input1> <input2> <output>\n";
        cout << "\n see README for more details.\n";
        exit(1);
    }

    int main(int argc, char *argv[])
    {
        cout << "\nshmoosh - concatenates and uniques wordlists into one\n";

        if (argc != 4)
            usage();

        vector<string> vec_wordlist_compilation;

        ///////////////////////////input1//////////////////////////////

        ifstream wordlistfile(argv[1]);
        if (!wordlistfile.is_open())
        {
            cout << "\nError opening file \'" << argv[1] << "\'\n";
            exit(1);
        }
        int x = 0;
        string word;
        while (getline(wordlistfile, word)) {
            vec_wordlist_compilation.push_back(word);
            x++;
        }
        cout << x << " words loaded from file \'" << argv[1] << "\'\n";

        wordlistfile.close();

        ///////////////////////////input2//////////////////////////////

        ifstream wordlistfiletwo(argv[2]);
        if (!wordlistfiletwo.is_open())
        {
            cout << "\nError opening file \'" << argv[2] << "\'\n";
            exit(1);
        }
        int v = 0;
        while (getline(wordlistfiletwo, word)) {
            vec_wordlist_compilation.push_back(word);
            v++;
        }
        cout << v << " words loaded from file \'" << argv[2] << "\'\n";

        wordlistfiletwo.close();

        ////////////////////////////sort//////////////////////////////
        cout << "\nsorting " << v + x << " words, removing duplicates...\n";

        // sort vector (ascending)...
        sort(vec_wordlist_compilation.begin(), vec_wordlist_compilation.end());

        // remove duplicates: unique() moves them to the end, erase() drops them...
        vec_wordlist_compilation.erase(
            unique(vec_wordlist_compilation.begin(), vec_wordlist_compilation.end()),
            vec_wordlist_compilation.end());

        /*for (size_t c = 0; c < vec_wordlist_compilation.size(); c++)
            cout << vec_wordlist_compilation[c] << "\n";*/
        cout << vec_wordlist_compilation.size() << " unique words remain.\n";

        ////////////////////////////output//////////////////////////////

        ofstream output(argv[3]);
        if (!output.is_open())
        {
            cout << "\nError opening file \'" << argv[3] << "\'\n";
            exit(1);
        }
        for (size_t c = 0; c < vec_wordlist_compilation.size(); c++)
            output << vec_wordlist_compilation[c] << "\n";

        return 0;
    }

I made it to put two wordlists together and remove duplicates. The problem is that it crashes on large wordlists. I cannot say exactly how many words it takes without crashing, but somewhere around 200 megs, it crashes when loading the wordlists. I have 8 gigs of ram, so I know it's not running out of space. Is there a limitation (in MB) that a C++ vector can hold? Is there a way around this? If not, does anybody know of some library which will let me do this?

I thought about making a version that writes a temporary file to the hard drive and scans the file for every new word to make sure it is not in there already, but I figured this would be waaaaay too slow.

Can anybody help?
Reputation Points: 10
Solved Threads: 0
Light Poster
dzhugashvili is offline Offline
35 posts
since Jun 2009
Aug 7th, 2009

Re: I ran out of vector space?

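A std::vector is guaranteed to maintain contiguous storage, so a very large one needs a single free block of address space big enough for every element, and that request can fail well before memory is actually exhausted.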

Use a std::deque instead. It looks much the same, but its data need not be stored contiguously, so it can handle a good deal more data (because it can work with the OS/compiler's memory management more flexibly).

BTW, you shouldn't be using atoi(). Use a stringstream instead...
    #include <sstream>
    #include <stdexcept>
    #include <string>

    int myatoi( const std::string& s )
    {
        int result = 0;
        std::istringstream ss( s );
        // Fail if nothing could be extracted ("", "abc") or if
        // characters remain after the number ("12abc").
        if (!(ss >> result) || !ss.eof())
            throw std::runtime_error( "not an integer" );
        return result;
    }
Untested!

Hope this helps.
Featured Poster
Reputation Points: 1140
Solved Threads: 229
Postaholic
Duoas is offline Offline
2,039 posts
since Oct 2007
Aug 7th, 2009

Re: I ran out of vector space?

>>I have 8 gigs of ram, so I know it's not running out of space

A 32-bit program cannot access all of that memory at one time. Under Windows, each 32-bit process is limited to about 2 GB of address space by default, regardless of how much RAM is installed.
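
To check which situation a given build is in, something like the following sketch may help. Both function names are hypothetical. `min_string_bytes` counts only the `std::string` objects themselves, so it is a lower bound that ignores heap blocks for longer strings, allocator overhead, and the spare capacity a growing container keeps.

```cpp
#include <climits>
#include <string>

// Width of a pointer in bits for this build (typically 32 or 64).
unsigned pointer_bits()
{
    return static_cast<unsigned>(sizeof(void*) * CHAR_BIT);
}

// Lower bound, in bytes, on the memory needed to hold `count` std::string
// objects in a container. Real usage will be higher (see the note above).
unsigned long long min_string_bytes(unsigned long long count)
{
    return count * sizeof(std::string);
}
```

Printing `pointer_bits()` shows whether the executable was built as 32-bit, and comparing `min_string_bytes(n)` for the expected word count against roughly 2 GB gives a quick feel for how close a wordlist is to the limit.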
Sponsor
Team Colleague
Featured Poster
Reputation Points: 5608
Solved Threads: 2282
Retired and Enjoying Life
Ancient Dragon is offline Offline
21,950 posts
since Aug 2005
Aug 7th, 2009

Re: I ran out of vector space?

I created a 1.2 GB text file that contained 10-character words (generated randomly), then tried to read it into a std::list. The program crashed after reading just over 24 million words. I changed the program to use deque instead of list, and it read even fewer words before crashing. (My computer runs Vista Home and has 5 GB of RAM; I used the VC++ 2008 Express compiler/IDE.)

Of course, it would have been easier to check by calling the list's max_size() method. For deque it reports:

maxsize = 134217727
Press any key to continue . . .

Changed the program to use try/catch and got this:

23020000
23030000
Out of memory

// final size of the deque
size = 23031567
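
The test program itself was not posted, so the following is only a guess at its general shape: keep appending to a deque, print progress every 10,000 items, and catch `std::bad_alloc` when allocation fails. The function name and the `limit` parameter are assumptions added so the sketch can be run safely. Whether `std::bad_alloc` is actually thrown on exhaustion depends on the platform and its memory settings.

```cpp
#include <cstddef>
#include <deque>
#include <iostream>
#include <new>
#include <string>

// Appends copies of `word` to `words` until `limit` items are stored or an
// allocation fails. Returns true if the limit was reached, false if
// std::bad_alloc was caught along the way.
bool fill_until_failure(std::deque<std::string>& words,
                        const std::string& word,
                        std::size_t limit)
{
    try {
        while (words.size() < limit) {
            words.push_back(word);
            if (words.size() % 10000 == 0)
                std::cout << words.size() << '\n';   // progress report
        }
    } catch (const std::bad_alloc&) {
        std::cout << "Out of memory\n";
        std::cout << "size = " << words.size() << '\n';
        return false;
    }
    return true;
}
```

Calling `words.max_size()` beforehand gives the theoretical ceiling for the container, which is usually far above what the process can really allocate.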
Last edited by Ancient Dragon; Aug 7th, 2009 at 11:16 am.
Ancient Dragon
Aug 8th, 2009

Re: I ran out of vector space?

Curses! I forgot about the 2GB 32-bit limitation. I don't suppose there is any simple way to reconfigure my compiler (Microsoft Visual C++ 2008 Express Edition) to compile this code in 64-bit mode and thus enable it to access the necessary RAM? Would I have to re-write the code?
dzhugashvili
Aug 8th, 2009

Re: I ran out of vector space?

Anybody? Is there a way to compile this code in 64-bit?
dzhugashvili
Aug 10th, 2009

Re: I ran out of vector space?

I will start another more appropriately labeled thread.
dzhugashvili
Aug 10th, 2009

Re: I ran out of vector space?

>>Curses! I forgot about the 2GB 32-bit limitation. I don't suppose there is any simple way to reconfigure my compiler (Microsoft Visual C++ 2008 Express Edition) to compile this code in 64-bit mode and thus enable it to access the necessary RAM? Would I have to re-write the code?

You cannot configure the Express edition to do that. You will have to buy a Pro edition (or maybe Standard). You might also want to check out GNU g++, because I think it will compile 64-bit programs.
Ancient Dragon

This thread is solved


© 2011 DaniWeb® LLC