Sorting huge amount of numbers

Please support our C++ advertiser: Intel Parallel Studio Home
Reply

Join Date: Jun 2008
Posts: 3
Reputation: cecyl is an unknown quantity at this point 
Solved Threads: 0
cecyl cecyl is offline Offline
Newbie Poster

Sorting huge amount of numbers

 
0
  #1
Jun 15th, 2008
Hello All,

Please, advice some as quick as posible algorithm to sort file containing about 500 000 integers. Roughly 30% of them have value 1, 20% are 2, 10% 3......0,1% are 1 000

Thanks much,
C
Reply With Quote Quick reply to this message  
Join Date: Dec 2005
Posts: 5,850
Reputation: Salem has a reputation beyond repute Salem has a reputation beyond repute Salem has a reputation beyond repute Salem has a reputation beyond repute Salem has a reputation beyond repute Salem has a reputation beyond repute Salem has a reputation beyond repute Salem has a reputation beyond repute Salem has a reputation beyond repute Salem has a reputation beyond repute Salem has a reputation beyond repute 
Solved Threads: 751
Team Colleague
Salem's Avatar
Salem Salem is offline Offline
Void main'ers are DOOMed

Re: Sorting huge amount of numbers

 
0
  #2
Jun 15th, 2008
Well there is std::sort , which will work on a std::vector of integers.

If you know what you're doing, it would take about 10 minutes to write - is that quick enough?

Or how about using your command line sort program
sort -k1n -o result.txt input.txt
Reply With Quote Quick reply to this message  
Join Date: Jun 2008
Posts: 973
Reputation: Alex Edwards is a jewel in the rough Alex Edwards is a jewel in the rough Alex Edwards is a jewel in the rough Alex Edwards is a jewel in the rough 
Solved Threads: 107
Alex Edwards's Avatar
Alex Edwards Alex Edwards is offline Offline
Posting Shark

Re: Sorting huge amount of numbers

 
0
  #3
Jun 15th, 2008
Gotta love the SWAP macro too...

  1.  
  2. #define SWAP(a, b) __typeof__(a) temp; temp = a; a = b; b = temp
Reply With Quote Quick reply to this message  
Join Date: May 2008
Posts: 351
Reputation: Radical Edward has a spectacular aura about Radical Edward has a spectacular aura about Radical Edward has a spectacular aura about 
Solved Threads: 62
Radical Edward's Avatar
Radical Edward Radical Edward is offline Offline
Posting Whiz

Re: Sorting huge amount of numbers

 
0
  #4
Jun 16th, 2008
> Gotta love the SWAP macro too...
Or better yet, the STL swap function from <algorithm>.
If at first you don't succeed, keep on sucking until you do succeed.
Reply With Quote Quick reply to this message  
Join Date: Oct 2007
Posts: 1,953
Reputation: Duoas has much to be proud of Duoas has much to be proud of Duoas has much to be proud of Duoas has much to be proud of Duoas has much to be proud of Duoas has much to be proud of Duoas has much to be proud of Duoas has much to be proud of 
Solved Threads: 214
Featured Poster
Duoas's Avatar
Duoas Duoas is offline Offline
Posting Virtuoso

Re: Sorting huge amount of numbers

 
0
  #5
Jun 16th, 2008
That macro invites death without warning.

If you want to swap something, #include <algorithm> and use std::swap().

@cecyl
Generic algorithms will work just fine, but if you know something about the probable distribution of your data then you can reduce sort time considerably.

If you know that you have a relatively small set of values, then you can reduce your sort time to constant or near-constant time by simply counting how many times each value appears, then writing them back out in order.

Good luck.

[edit] Man, get up to visit the bathroom and someone smart answers first...
Last edited by Duoas; Jun 16th, 2008 at 12:07 pm.
Reply With Quote Quick reply to this message  
Join Date: Jun 2008
Posts: 3
Reputation: cecyl is an unknown quantity at this point 
Solved Threads: 0
cecyl cecyl is offline Offline
Newbie Poster

Re: Sorting huge amount of numbers

 
0
  #6
Jun 16th, 2008
Thank you all for advices. I have to sort it much, much more quicker, are 10 seconds possible?. Can it be quicker using some low-level code like assembler,....?
Reply With Quote Quick reply to this message  
Join Date: Jun 2008
Posts: 3
Reputation: cecyl is an unknown quantity at this point 
Solved Threads: 0
cecyl cecyl is offline Offline
Newbie Poster

Re: Sorting huge amount of numbers

 
0
  #7
Jun 16th, 2008
ooops, I have understood one advice wrong, I thougt it should take 10 minutes to sort, ...but it takes 10 min to write the code (not for me).
Going to bathroom is good idea as well
Reply With Quote Quick reply to this message  
Join Date: Dec 2006
Posts: 1,089
Reputation: vijayan121 is a name known to all vijayan121 is a name known to all vijayan121 is a name known to all vijayan121 is a name known to all vijayan121 is a name known to all vijayan121 is a name known to all 
Solved Threads: 164
vijayan121 vijayan121 is offline Offline
Veteran Poster

Re: Sorting huge amount of numbers

 
0
  #8
Jun 16th, 2008
> I have to sort it much, much more quicker, are 10 seconds possible?.
with 10 seconds at your disposal, you can sort it any which way you like.
the fastest way would perhaps be to use std::partition to move the most frequently occurring values (which are also the smallest values) to the front of the sequence. sort the remaining 40% or so of the sequence using a std::sort .
it doesn't really matter if you are sorting a file; i/o would dominate.

here are some measurements of a sample run on a sequence of 500000 integers (roughly 30% of them have value 1, 20% are 2, 10% 3, remainder are [3,1000] ):
a. std::sort on the sequence: 0.0390625 seconds
b. std::partition for 1, 2 and 3, std::sort for the rest: 0.0234375 seconds
c. same as b, but read from a file sort and write to another file: 0.335938 seconds
  1. >dmesg | grep CPU:
  2. CPU: Intel(R) Pentium(R) 4 CPU 2.80GHz (2800.10-MHz 686-class CPU)
perhaps, Salem's suggestion of using a command line sort program is all that you need.
Reply With Quote Quick reply to this message  
Reply

This thread is more than three months old.
Perhaps start a new thread instead?
Message:



Other Threads in the C++ Forum


Views: 698 | Replies: 7
Thread Tools Search this Thread



Tag cloud for C++
About Us | Contact Us | Advertise | DaniWeb | Acceptable Use Policy | RSS Feed

©2003 - 2009 DaniWeb® LLC