I want to generate 1billion random integer with no duplication, I dont know how do I allocate memory for such a huge amount of data.
jgehlot09
- 3 Contributors
- forum3 Replies
- 5 Views
- 8 Years Discussion Span
- comment Latest Post by Adak
Adak 419
I want to generate 1billion random integer with no duplication, I dont know how do I allocate memory for such a huge amount of data.
This is a trick question, isn't it?
You've got a list of a billion unique random int's, stuffed in your back pocket, don'tcha?
You're trying to make us look silly, aren't you?
Well, as long as we're clear about that! ;)
If your random numbers are all positive (you didn't think of that, I bet!), then check your compiler for it's largest value of a long (or long long) unsigned int.
It may be in <limits.h> or <ctype.h>, but you can usually print it out with the macro MAX_LONGU, MAX_ULONG, or something like that.
That quantity will be the maximum amount you can (hope) to send out at a time, with this logic. You can do more using big char arrays of digits or big Number libraries.
I will stick to using neither, here.
So you've got your biggest integer. Now on your system, experiment a bit, and find out what is the largest array of TWO, of those big int's, that you can create.
One array will be an index, and the second will be the actual numbers to be sent out.
After generating the actual array full of random numbers, then initialize the index array, like so:
for(i = 0; i < MAX; i++)
index[i] = i;
Then we'll sort the index, by comparing numbers in the actual numbers array, with Quicksort. No actual numbers will be moved, but the index values will be adjusted so the actual numbers can be looked at, using the index values, in sorted order.
Don't get glassy eyed on me, it's really slick! ;)
Duplicates can then be detected, and set to a negative 1, so they won't be sent out, by the output function.
Then the output function sends them out, and counts how many actually were sent. Those numbers are also written to a data file, using the index values, so the numbers are in sorted order.
Repeat, but after the first send out, you also have to compare all the number in the data file, as well, to test for duplicates.
Perhaps a simpler way:
Generate a file which will hold just sequential numbers, from your lowest random number possible, to the highest possible for your range.
Now choose a random number position from that file, and send it. Say the first number was #100. Send the 100.
Now move all the numbers up one row in the file, so space #100 becomes #101, see?
Now generate another random number, and go to that row number in the file, and send it. Move all the higher valued numbers up one row.
a[0] = 0
a[1] = 1
a[2] = 2
one is randomly chosen, so a[] is now:
a[0] = 0
a[1] = 2
a[2] = -1 //a sentinel value of some sort
This is very simple, but the run time for a billion will not be charitable :eek: , with all that shuffling of data. It may be tolerable with a RAM disk type set up, but a hard drive (or a Thumb drive!), will really have to work hard and slow things down, especially a Thumb drive).
The largest array value, should be shifted to this spot, instead of shuffling all the values, of course!
I believe that makes this simpler method, the method of choice.
I see another way to do this with an array and using 1's and 0's as "pseudo bit" values, but it doesn't seem able to handle the non-repeating and random feature, any better.
Maybe out there, though.
What tips did your instructor offer, (must have been something), and is it true that he hides a pair of horns and has a pointed tail?
Edited
by Adak: n/a
brotherdeez
#include <cstdlib>
#include <ctime>
long double count;
unsigned seed;
srand(seed);
count = 1 + rand() % 1000000000;
Adak 419
Well, there is an easier and more efficient way to generate non-repeating random numbers. The algorithm is:
generate all the (non random) sequential numbers you need, and put them into an array (any data structure would perhaps do, but an array is easy).
Then shuffle the numbers in the array, randomly. Aha!
Then take the randomized numbers, in order, by their index: num[index].
Since all the numbers were unique to start with, and you've shuffled them around the array, in a random fashion, you're guaranteed a unique (pseudo) random number.
Most random numbers generated by non-professionals, are skewed in some way (which may not matter a bit to you and your program - you may not even notice it). There is a real art to generating good (pseudo) random numbers, however - not for beginners.