954,499 Members — Technology Publication meets Social Media
Username:
Password:
Lost login information?
Have something to say? Contribute New Article Reply to this Article

How to use VERY Long STL Vectors?

How can I use very long STL vectors? I need to write a code where I use the vector container to store a very large number of values, and I need to access the elements using [i]. I am currently using [i], where i is an integer. How can do the same if I have more elements than the largest integers (which in my system is 2^31)? For example in the sample "code" below, how can access myvec[i] if "i" is supposed to be larger than 2^31?

Thanks,

vector<double> myvec;

  while (Not Full)  // add elements until decides to stop
  {
      double x = some value;  // here there is a function to compute x
      myvec.push_back(x);
  }

  int size = myvec.size();
  for (int i=0; i < size; ++i)
  {
      cout << myvec[i] << endl;
  }
maru2
Newbie Poster
5 posts since Mar 2009
Reputation Points: 10
Solved Threads: 0
 

I think you're asking the wrong question. The right question would be (assuming 8 byte doubles), why are you trying to eat over 16 gigabytes for a single vector? The naive approach isn't likely to work in this case, you need to consider alternative algorithms that don't waste so much memory.

Narue
Bad Cop
Administrator
15,460 posts since Sep 2004
Reputation Points: 6,464
Solved Threads: 1,401
 

I wonder if you really need to store that much data? Do you expect to populate just a few values within that range? It's unlikely you'd have the memory capacity to support a fully populated vector that large, so perhaps you should consider other ways to store the data.
Maps, database...

MrSpigot
Junior Poster
158 posts since Mar 2009
Reputation Points: 76
Solved Threads: 40
 

I really need to store all that data. I need to compute and store all those values. Memory is not a problem, I have access to computers with 256GB of memory or more.

I could use different data structures to store the data, but it would make my algorithms much slower since I take advantage of the fact that I can access the i-th element in constant time. I think I really want a vector. Is there any other data structure that allows me to access the i-th element in constant time?

Thanks,

maru2
Newbie Poster
5 posts since Mar 2009
Reputation Points: 10
Solved Threads: 0
 

Give us a clue - just how big do you want to go?
I wonder what platform you're using - it seems that you need access to 64 bit indexes.

MrSpigot
Junior Poster
158 posts since Mar 2009
Reputation Points: 76
Solved Threads: 40
 

>I really need to store all that data.
Every time I hear somebody say that, they turn out to be wrong. :icon_rolleyes:

>I think I really want a vector.
No, you want multiple vectors. If you're hitting the memory limit of a single vector, it's only a minor inconvenience to extend that limit with a vector of vectors.

Narue
Bad Cop
Administrator
15,460 posts since Sep 2004
Reputation Points: 6,464
Solved Threads: 1,401
 

you obviously are not using MS-Windows or *nix operating systems, because the max amount of ram supported by 64-bit MS-Windows is 128 gig.

one way to do what you want is to store the data in two or more arrays and use one of the arrays depending on the value of i.

Ancient Dragon
Retired & Loving It
Team Colleague
30,049 posts since Aug 2005
Reputation Points: 5,662
Solved Threads: 2,343
 

I run my code on linux machines (64bits), but and int is still 32bits. long int is sometimes still 32 bits. So just using long int won't do any good.

For the moment 64bits indices would solve the problem, but I want a solution that would work even if I want to go past 64bits.

Is there and integer type that is guaranteed to be 64bits?

I thought about using multiple vectors instead of one, but them I still need an index to know which element I need to access. When I talk about the i-th element, what would i be?

Thanks

maru2
Newbie Poster
5 posts since Mar 2009
Reputation Points: 10
Solved Threads: 0
 

>When I talk about the i-th element, what would i be?
I'd probably use two indices where a simple calculation represents your value's index. So if you divide your vectors into 100 element blocks and there are ten of them, the item at "index" 356 would be located at v[3][56] in the actual vector (conceptually 3 * 100 + 56).

Narue
Bad Cop
Administrator
15,460 posts since Sep 2004
Reputation Points: 6,464
Solved Threads: 1,401
 

If you are using 64 linux,
then your long ints are 8 bytes.

Not sometimes 4,5,6,7 but always 8.
The max mapped memory for 32bit is 2^32 around 4 gig
The max mapped memory for 64bit is 2^64 around 1.6 million terabytes (this means alot!).

The unsigned integer range for 32 bit is 0 to +4,294,967,295
The unsigned integer range for 64 bit is 0 to +18,446,744,073,709,551,615

This roughly means that
if you can have it in memory, then you can have it in a vector.

Now what kind of algorithm are you using, to solve what kind of problem.

I've been using Vector to store 36 gig of memory, in some gene expression statiscal problem.
But in the end I ended up using standard **char, since I couldn't .reserve() because I didn't have knowlegde at the beginning of the program.

monkey_king
Junior Poster
160 posts since Aug 2008
Reputation Points: 70
Solved Threads: 9
 

>>if you can have it in memory, then you can have it in a vector.

No it doesn't. If he is still using 32-bit compiler than the size of the long int is not changed from what it is on a 32-bit operating system, and the STL libraries will still be 32-bit libraries. He will need to use a 64-bit compiler to make any use of the 64-bit os.

Ancient Dragon
Retired & Loving It
Team Colleague
30,049 posts since Aug 2005
Reputation Points: 5,662
Solved Threads: 2,343
 

>>if you can have it in memory, then you can have it in a vector.

No it doesn't. If he is still using 32-bit compiler than the size of the long int is not changed from what it is on a 32-bit operating system, and the STL libraries will still be 32-bit libraries. He will need to use a 64-bit compiler to make any use of the 64-bit os.


Well the poster says he is using a 64bit linux,
so I found it safe to assume he is using a 64bit toolchain.

But your argument is valid. But for sake of completeness, the same can be said about a 16 bit compiler :)

monkey_king
Junior Poster
160 posts since Aug 2008
Reputation Points: 70
Solved Threads: 9
 

If I use a 64-bit compiler than a [icode]long int[\icode] is always 8 bytes? Then this basically solves my problem. I didn't know this was always true. Sometimes I see 4 bytes long ints, then maybe they have been compiled with a 32-bits compiler.

So I will just use long ints for my indices, and this will make it possible to address all the memory I have. I will also use multiple vectors for speed. It takes a long time to resize the vectors once they are big.

Thanks for all you replies.

One more question: If v is vector, what is the type of v.size()? Is it an long int?

Thanks.

maru2
Newbie Poster
5 posts since Mar 2009
Reputation Points: 10
Solved Threads: 0
 

It's so simple (no need in DaniWeb;)), ask your compiler:

typedef std::vector<double> DblVector;
    typedef std::vector<int>    IntVector;
    DblVector dv;
    IntVector iv;
    cout << sizeof(DblVector::size_type) << '\n';
    cout << dv.max_size() << '\n';
    cout << iv.max_size() << endl;
ArkM
Postaholic
2,001 posts since Jul 2008
Reputation Points: 1,234
Solved Threads: 348
 

If you are using that much memory, you will be much better off using the boost array/multi_array classes. Then you will be better off using Blitz.

Word of warning: write the code using boost/stl untill it works with smaller data sets. THEN use Blitz. Otherwize the error messages and other problems are incomprehensible.

StuXYZ
Practically a Master Poster
680 posts since Nov 2008
Reputation Points: 760
Solved Threads: 138
 

This article has been dead for over three months

Post: Markdown Syntax: Formatting Help
You