954,499 Members — Technology Publication meets Social Media
Username:
Password:
Lost login information?
Have something to say? Contribute New Article Reply to this Article

Quicksort/Insertion sort Hyrbid?

Recently we have been asked to create a hybrid sort based upon quicksorting down to a certain point and insertion sorting from then on. We are to calculate the efficency based upon previous tests. I had very little problems implementing a quicksort algorithm, but for the life of me, I can't get anything to work on the hybrid version. Can anyone help, provide tips or pointers?

Jpowers22
Newbie Poster
5 posts since Oct 2004
Reputation Points: 12
Solved Threads: 0
 
Recently we have been asked to create a hybrid sort based upon quicksorting down to a certain point and insertion sorting from then on. We are to calculate the efficency based upon previous tests. I had very little problems implementing a quicksort algorithm, but for the life of me, I can't get anything to work on the hybrid version. Can anyone help, provide tips or pointers?

I'd love to help but could you please be more informative? What do you mean by "I can't get anything to work"? How many passes of quicksort do you do before you do insertion sort?

data := "5 2 4 7 3 1 6 8"
left := 0
right := 7
pivot := data[0] = 5

partition(data, left, right, pivot)

data := "2 4 3 1 5 7 6 8"
left := 0
right := 7
pivot : = data[4] = 5

quick_sort(data, left, right, pivot) {
  if left < right { 
    if(threshold(...)) {
      insertion_sort(data)  
    } else {
      pivot := partition(data, left, right, pivot)
      quicksort(data, left, pivot - 1)
      quicksort(data, pivot + 1, right)
    }
  }
}

insertion_sort(reference pointer to data) {
 ....
}

Sorry, I have to get back to work. If you could please post code, I can take a look later and help you out more.

subtronic
Junior Poster
117 posts since Aug 2003
Reputation Points: 44
Solved Threads: 1
 

I always find it instructive to see what others have done, and then apply my own style and ideas to the problem. In the case of a Quicksort/Insertion sort hybrid, you could look in Google for some other folk's source code, or if you have VC++, you can look at the source code for their 'qsort()' function. It is not recursive and therefore a bit ugly, but it works great. They've got tuning variables in there for figuring out at what point the insertion sort kicks in.

Like I say, don't copy their code but learn from it.

Chainsaw
Posting Pro in Training
436 posts since Jun 2004
Reputation Points: 36
Solved Threads: 11
 

void quickSort(int numbers[], int array_size)
{
q_sort(numbers, 0, array_size - 1);
}


void q_sort(int numbers[], int left, int right)
{
int pivot, l_hold, r_hold;

l_hold = left;
r_hold = right;
pivot = numbers[left];
while (left < right)
{
while ((numbers[right] >= pivot) && (left < right))
right--;
if (left != right)
{
numbers[left] = numbers[right];
left++;
}
while ((numbers[left] <= pivot) && (left < right))
left++;
if (left != right)
{
numbers[right] = numbers[left];
right--;
}
}
numbers[left] = pivot;
pivot = left;
left = l_hold;
right = r_hold;
if (left < pivot)
q_sort(numbers, left, pivot-1);
if (right > pivot)
q_sort(numbers, pivot+1, right);
}

This is code I used for Left most sorting an array of randomly generated data integer elements.

Basically I want, the program to recurse to the point where insertion sort is more optimal than quicksort...and I honestly don't even have a clue where to start.

Jpowers22
Newbie Poster
5 posts since Oct 2004
Reputation Points: 12
Solved Threads: 0
 

Assuming you have an insertion sort routine, try this modification to your code:

void q_sort(int numbers[], int left, int right)
{
    int pivot, l_hold, r_hold;

        // If there are few records left to look at, use an insertion sort
        //   because it is faster.
    if ((right - left) < INSERTION_SORT_THRESHOLD)
    {    
        InsertionSort( numbers, left, right );
        return;
    }
    l_hold = left;
    r_hold = right;
. . .


and then just tinker with INSERTION_SORT_THRESHOLD's value to see what happens. For example, make it zero to disable the insertion sort, make it huge to ONLY do an insertion sort, and then try something like 5 or 10.

On Windows, you can use GetTickCount() to return the time since boot in miliseconds (mod 32 bits), and that can make for a simple elapsed time counter:

void quickSort(int numbers[], int array_size)
{
    DWORD startTime = GetTickCount();
    q_sort(numbers, 0, array_size - 1);
    printf( "Elapsed time:%d ms\n", GetTickCount() - startTime );
}


This isn't perfect for production code because the tick count will flip over eventually, but for your testing on where to put the insertion sort threshold this should do fine.

Chainsaw
Posting Pro in Training
436 posts since Jun 2004
Reputation Points: 36
Solved Threads: 11
 

>I can't get anything to work on the hybrid version. Can anyone help, provide tips or pointers?
Oddly enough, I posted the full implementation for an optimized quicksort in the code snippets on this site. One of the optimizations is terminating recursion on small subfiles and then calling insertion sort on the "almost" sorted file. An alternative is to use insertion sort during recursion, but that typically has less desirable properties than waiting until after the recursive steps. Because you're testing performance, it couldn't hurt to implement both and see what you get. :)

I also give examples of two other improvements to the basic algorithm, one common and the other not so common because it's rather difficult and not well known.

>to recurse to the point where insertion sort is more optimal than quicksort
This really needs to be tweaked by the implementor, but a cutoff somewhere between 5 and 25 will work nicely in general and that's where you usually find yourself setting it.

Narue
Bad Cop
Administrator
15,460 posts since Sep 2004
Reputation Points: 6,464
Solved Threads: 1,401
 

One interesting way to figure out optimizations is through experimentation. You can time each pass for each implementation, record results and then plot them side by side on a graph. You can make adjustments accordingly to the insertion sort version's threshold to see if there's improvements.

subtronic
Junior Poster
117 posts since Aug 2003
Reputation Points: 44
Solved Threads: 1
 
One interesting way to figure out optimizations is through experimentation. You can time each pass for each implementation, record results and then plot them side by side on a graph. You can make adjustments accordingly to the insertion sort version's threshold to see if there's improvements.


Thats what I did, I found that the threshold value for the size of arrays I was dealing with would show increased efficiency at around 25 and lower.

Jpowers22
Newbie Poster
5 posts since Oct 2004
Reputation Points: 12
Solved Threads: 0
 

>I can't get anything to work on the hybrid version. Can anyone help, provide tips or pointers? Oddly enough, I posted the full implementation for an optimized quicksort in the code snippets on this site. One of the optimizations is terminating recursion on small subfiles and then calling insertion sort on the "almost" sorted file. An alternative is to use insertion sort during recursion, but that typically has less desirable properties than waiting until after the recursive steps. Because you're testing performance, it couldn't hurt to implement both and see what you get. :)

I also give examples of two other improvements to the basic algorithm, one common and the other not so common because it's rather difficult and not well known.

>to recurse to the point where insertion sort is more optimal than quicksort This really needs to be tweaked by the implementor, but a cutoff somewhere between 5 and 25 will work nicely in general and that's where you usually find yourself setting it.

Narue,
I checked your excellent sorting snippet, and have a question. Is there a way to time this, milliseconds is much to slow? I know one could increase the size of the array, but that's not challenging. Something like a microsecond or nanosecond timer would help.

vegaseat
DaniWeb's Hypocrite
Moderator
5,989 posts since Oct 2004
Reputation Points: 1,345
Solved Threads: 1,417
 
Narue, I checked your excellent sorting snippet, and have a question. Is there a way to time this, milliseconds is much to slow? I know one could increase the size of the array, but that's not challenging. Something like a microsecond or nanosecond timer would help.

struct timeval

subtronic
Junior Poster
117 posts since Aug 2003
Reputation Points: 44
Solved Threads: 1
 

>I know one could increase the size of the array, but that's not challenging
Not challenging, but certainly informative when comparing methods. Because quicksort's speed isn't noticeable until the array gets very large, you won't see much difference between sorting methods no matter what kind resolution your timer is. It's best to use an array size that would be realistic for your application and span the timing across multiple calls so that you can get an average time distribution for the algorithm.

And if you're using my quicksort for 1000 items or less, that's what we in the field call "overkill". Try shellsort instead. :D

Narue
Bad Cop
Administrator
15,460 posts since Sep 2004
Reputation Points: 6,464
Solved Threads: 1,401
 

Since you asked, Windows and other OS' may have a higher resolution timer. On Windows, check out:

EngQueryPerformanceCounter()
and
EngQueryPerformanceFrequency()

The other issue about performance is: how much work do you do in your compare proc? I.e. if the compare is trivial, like an int compare, then what you are timing is the speed of your algorithm, and that's where simpler sorts can shine.

If your compare is complicated, or slow (read two records and compare some field, or do a international string compare on similar strings, ...), then the complexity of the sort algorithm is dwarfed by the compare time, and THAT's where all this fancy hybrid stuff will pay off.

Chainsaw
Posting Pro in Training
436 posts since Jun 2004
Reputation Points: 36
Solved Threads: 11
 

>I know one could increase the size of the array, but that's not challenging Not challenging, but certainly informative when comparing methods. Because quicksort's speed isn't noticeable until the array gets very large, you won't see much difference between sorting methods no matter what kind resolution your timer is. It's best to use an array size that would be realistic for your application and span the timing across multiple calls so that you can get an average time distribution for the algorithm.

And if you're using my quicksort for 1000 items or less, that's what we in the field call "overkill". Try shellsort instead. :D

Also, it's good to input different values of N (a scale of values from low to high) to prove the running time is correct. The improvement would be be seen over a log / N graph. You should also do the best and worse cases for the quicksort as well (with both implementations). If you're using *NIX like any respectable code monkey (hehe, jk),

#include and check out struct timeval

subtronic
Junior Poster
117 posts since Aug 2003
Reputation Points: 44
Solved Threads: 1
 

This article has been dead for over three months

Post: Markdown Syntax: Formatting Help
You