I am trying to write several different sorting algorithms in order to time the differences between them when reading a file of a half million integers. For testing purposes, I am only using a few hundred integers, but I am getting a stack overflow error. Also, for testing reasons, I have an output file being created during this run so that I can see if the code is working correctly. Originally, I had 3 nested for loops that I believed were causing the errors (and horrible effeciencey), but I cleaned those up and I am still having the error. Can someone look this over and see if there is a glaring issue that I am overlooking. Thanks for any help you can give. I will also upload the .txt file of integers, in case anyone feels up to compiling this to recreate the error. I am using VS2012 Ultimate for my editing/compiling.

#include <iostream>
#include <fstream>
#include <ctime>
#include <string>

void quickSort(int a[], int first, int last);
int pivot(int a[], int first, int last);
void swap(int& a, int& b);

using namespace std;


int main()
{
    const int SIZE = 500000;
    string s[SIZE];
    int test[SIZE];
    int i = 0;
    int N = sizeof(test)/sizeof(int);
    ifstream readFile;
    ofstream myOutputFile;
    int start_s=clock(); //start timer
    readFile.open("inputData.txt");

    if (readFile.is_open()) 
    {
        while (!readFile.eof())
            {
                if (i < SIZE)  //for all values in the inputData read file
                {
                    getline(readFile, s[i]); //store values in a string array
                    test[i] = atoi(s[i].c_str()); //convert string array to integer array
                    i++;
                }

            }
    }

    quickSort(test, 0, SIZE-1);

    readFile.close(); //close file
    int stop_s=clock();//stop timer
    cout <<"time: " << (stop_s-start_s)/double(CLOCKS_PER_SEC)*1000 << endl;//output timer
    myOutputFile.open("sortedValues.txt");
    for (int i =0; i < SIZE; i++)
        myOutputFile << test[i] << endl;
    myOutputFile.close(); 


    return 0;
    system("pause");
}

/*
* Recursive Quicksort algorithm.
*/
void quickSort( int a[], int first, int last )
{
    int pivotElement;

    if(first < last)
    {
        pivotElement = pivot(a, first, last);
        quickSort(a, first, pivotElement-1);
        quickSort(a, pivotElement+1, last);
    }
}

/*
* Find and return the index of pivot element.
*/
int pivot(int a[], int first, int last)
{
    int  p = first;
    int pivotElement = a[first];

    for(int i = first+1 ; i <= last ; i++)
    {
        if(a[i] <= pivotElement)
        {
            p++;
            swap(a[i], a[p]);
        }
    }

    swap(a[p], a[first]);

    return p;
}


/*
* Swap the parameters.
*/
void swap(int& a, int& b)
{
    int temp = a;
    a = b;
    b = temp;
}
Attachments
8809397
5937712
9169212
3467863
5730702
748737
6035700
577496
3601486
4490826
1749210
5058906
8252221
607331
5100676
1061913
3978612
2824658
5741484
9680424
3365318
9972234
416681
6772530
3784927
5887778
3505834
2680408
6751628
8987774
6973525
7067887
3236514
768898
8954683
1557704
2982735
8272984
7046362
1230465
120065
3143176
1359116
9960420
5594205
7364781
7246816
2372009
5828540
1516460
2943665
3329122
9390221
1708526
6331576
1322166
1978493
2860094
117501
8800057
8421458
6486156
6419859
2531200
5422043
5870831
4674182
7945660
845343
6665478
710914
7871180
1321150
7762109
870544
2242657
3030922
3067015
8812653
2358910
1742088
5798765
1380548
975019
7578419
7495995
4291943
1117745
3791120
1767637
4146650
3680590
216646
7904564
773076
7411029
6883510
1891941
8673627
7891139
4760297
9553439
3220248
2868526
1925104
5440249
4755700
9841803
1717189
486849
21454
9483298
8620896
5647129
3715334
3645460
7449117
9226373
3969103
3093610
8075304

Just throwing this out there, this may not be the solution.

In this line you set the following interger to be a random number, where do you get this number from?

const int SIZE = 500000;

But, where you call your quick sort you call the following:

quickSort(test, 0, SIZE-1);

This would therefore assume that the last value is at position 500000 - 1 what if the array contains only 100 values? Would there be a position 499999? This is probably where your code is segmenting. You should also probably set the size of the array, i.e. how many values are inside the array and have this value to be the total number of elements that the array can handle. So for example, I entered your 100 values that you supplied and did the following:

quickSort(test, 0, 99);

Where we can assume that 0 is the first value inside the array and 99 is the last value, this printed out a result. Please see my attatched file.

You should debug your code, find out where it is going wrong.

Good luck :)

Attachments
117501
120065
216646
416681
577496
607331
710914
748737
768898
773076
845343
870544
975019
1061913
1117745
1230465
1321150
1322166
1359116
1380548
1516460
1557704
1708526
1742088
1749210
1767637
1891941
1978493
2242657
2358910
2372009
2531200
2680408
2824658
2860094
2943665
2982735
3030922
3067015
3143176
3236514
3329122
3365318
3467863
3505834
3601486
3680590
3784927
3791120
3978612
4146650
4291943
4490826
4674182
5058906
5100676
5422043
5594205
5730702
5741484
5798765
5828540
5870831
5887778
5937712
6035700
6331576
6419859
6486156
6665478
6751628
6772530
6883510
6973525
7046362
7067887
7246816
7364781
7411029
7495995
7578419
7762109
7871180
7891139
7904564
7945660
8252221
8272984
8421458
8673627
8800057
8809397
8812653
8954683
8987774
9169212
9390221
9680424
9960420

I got the number 500000 because the full file has that many entries in it. I am running it with a much smaller file to ensure that it would work. I adjusted the code to reflect

const int SIZE = 100;

and

quickSort(test, 0, 99);

Then I checked the file and reduced it to exactly 100 integers. The stack overflow is gone, but the program has been running for well over 5 minutes and isn't completed. This seems excessive for 100 numbers.

How long did it take when you were able to successfully run the code?
Also, you said I should debug my code and figure out where it is going wrong. Any suggestions on where I should start?

Could you please provide me with the ACTUAL data values that you have? You can post them onto pastebin

If it's taking a massive amount of time to run, you should look at optimising your algorithm so that it best fits to match your needs. C++ is very quick, however, it does depend on the system and the compiler that you are using and whether or not you are having any memory leaks and/or you are handling things in memory. You may find it a lot easier to read these values inside a 2D array and do your sorting that way.

Because your question was not broad and you were not seeking someone to do the work for you, post your values and I will take a look to try and improve the performance of this for you.

NOTE: Your program should not be running for more than 5 minutes with 100 values. There is something wrong here.

These lines will certainly cause a stack overflow:

const int SIZE = 500000;
string s[SIZE];
int test[SIZE];

This creates (stack-based) static arrays of integers and strings, of 500000 elements. Assuming you run this on a 32bit platform, this will probably take about 10Mb of memory (and probably twice that on a 64bit platform). The stack is a static and limited amount of memory to be used for local variables of functions as they are needed, it is not meant to be used to allocate large arrays like that. Typically, the limit of the stack is between 2Mb and 8Mb, depending on the platform and compiler (the stack size can usually be configured via a compiler option, but the default is usually about 2-8Mb).

For any kind of large arrays, you should allocate them dynamically, even if you know their size at compile-time. So, you can use either these C-style arrays:

int main() {
  const int SIZE = 500000;
  string* s = new string[SIZE];
  int* test = new int[SIZE];

  // the rest of the main() function's code here.

  delete[] s;
  delete[] test;

  system("pause");
  return 0;
};

Or, you can use the following C++-style vectors:

#include <vector>

int main() {
  const int SIZE = 500000;
  vector<string> s(SIZE);
  vector<int> test(SIZE);

  // the rest of the main() function's code here.

  system("pause");
  return 0;
};

The stack overflow is gone, but the program has been running for well over 5 minutes and isn't completed. This seems excessive for 100 numbers.

This indeed very extreme. I made a little program a while ago for fun that tests out a whole suite of classic sorting algorithms on arrays of integers. For 100 integers, the time required was always less than 10 micro-seconds (i.e., 0.00001 seconds), and even InsertionSort (O(N^2)) is below 10 micro-seconds. So, 5 minutes is more than extreme, probably an infinite loop.

Also, if you want to time the algorithm, you should not time the reading and writing to and from the files, because that's going to be far more costly in time than the actual sorting. So, you should have:

readFile.open("inputData.txt");
if (readFile.is_open()) 
{
    while (!readFile.eof())
        {
            if (i < SIZE)  //for all values in the inputData read file
            {
                getline(readFile, s[i]); //store values in a string array
                test[i] = atoi(s[i].c_str()); //convert string array to integer array
                i++;
            }
        }
    readFile.close(); //close file
}

int start_s=clock(); //start timer
quickSort(test, 0, SIZE-1);
int stop_s=clock();//stop timer

I took your code and created this program to test it:

#include <iostream>
#include <cstdlib>
#include <iomanip>
#include <ctime>
#include <vector>

int pivot(int* a, int first, int last)
{
    int  p = first;
    int pivotElement = a[first];
    for(int i = first+1 ; i <= last ; i++)
    {
        if(a[i] <= pivotElement)
        {
            p++;
            std::swap(a[i], a[p]);
        }
    }
    std::swap(a[p], a[first]);
    return p;
}

void quickSort(int* a, int first, int last )
{
    int pivotElement;
    if(first < last)
    {
        pivotElement = pivot(a, first, last);
        quickSort(a, first, pivotElement-1);
        quickSort(a, pivotElement+1, last);
    }
}


int main() {
  std::srand((unsigned int)time(0));

  std::vector<int> arr(100);

  for(int i = 0; i < 100; ++i)
    arr[i] = std::rand() % 1000; // 100 random integers in [ 0, 1000 ) range.

  int start_s = std::clock(); //start timer
  for(int i = 0; i < 1000000; ++i) {  // make 1 million runs.
    std::vector<int> temp = arr;  // copy the array.
    quickSort(&temp[0], 0, 99);   // run quick-sort.
  };
  int stop_s = std::clock();//stop timer
  std::cout <<"time: " << (stop_s-start_s)/double(CLOCKS_PER_SEC) << std::endl;//output timer

  std::cout << "Finished!" << std::endl;
  return 0;
};

With highest optimizations, this program runs in 1 second, without optimizations, it runs in 6 seconds. In other words, the time for sorting a single 100-element random array is a few micro-seconds. Even in the worst case performance O(N^2), the code should still run in the low micro-seconds range.

The problem is that you have an infinite loop in the reading of the data. I would try the following instead:

readFile.open("inputData.txt");
if (readFile.is_open()) 
{
    while ( (!readFile.eof()) && (i < SIZE))   //<--- NOTICE the extra condition here.
        {
            getline(readFile, s[i]);      // store values in a string array
            test[i] = atoi(s[i].c_str()); // convert string array to integer array
            i++;
        }
    readFile.close(); //close file
}

You had an infinite loop when you reached the maximum SIZE before you reached the end of the file. Because of your if-statement, you would never read anything more from the file, but because the end-of-file condition was the condition to end the loop, the loop would never end. So, your 5 minutes running time just means that you've left the program running on an infinite loop for 5 minutes.

This article has been dead for over six months. Start a new discussion instead.