Hi, I have C# code to sort a text file using the quicksort algorithm, but the problem is that I can only sort files up to about 300 MB; after that the system hangs. I want to sort text files of 1 GB to 5 GB. Can anyone please help me? Here is the code:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Collections;
 

namespace quik
{
    class Quicksort
    {
        public static void qsort(string[] items)
        {
            qs(items, 0, items.Length - 1);
        }

       
        static void qs(string[] items, int left, int right)
        {
            int i, j;
            string x, y;

            i = left; j = right;
            x = items[(left + right) / 2];

            do
            { 
                while (items[i].CompareTo(x) < 0 && i < right) i++;
                while (x.CompareTo(items[j]) < 0 && j > left) j--;

                if (i <= j)
                {
                    y = items[i];
                    items[i] = items[j];
                    items[j] = y;
                    i++; j--;
                }
            } while (i <= j);

            if (left < j) qs(items, left, j);
            if (i < right) qs(items, i, right);
        }
    }

    public class QS
    {
        public static void Main()
        {
           // char[] a = { 'v', 'i', 'r', 'e', 'n', 'd', 'r' ,'a'};
            string[] a = System.IO.File.ReadAllLines(@"C:\3.txt"); // verbatim string: a single backslash is correct here
            int i;

            
            //for (i = 0; i < a.Length; i++)
            //    Console.Write(a[i]);

            //Console.WriteLine();

          
            Quicksort.qsort(a);


            using (System.IO.StreamWriter abc = new System.IO.StreamWriter(@"C:\write11.txt"))
            {
                for ( i = 0; i < a.Length; i++)
                {
                    abc.WriteLine(a[i]);
                }

            }
           
            //for (i = 0; i < a.Length; i++)
            //    Console.Write(a[i]);
        }
    }
}

If there is really a lot of data, sorting will be a time-consuming action, so you can run it on a new thread (or use a BackgroundWorker, which is a new thread too) and use the built-in Sort method:

string[] array = { "some values in array" };
Array.Sort(array);
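A minimal sketch of that suggestion using Task.Run (the file paths and method name are just placeholders for illustration). Note that this only keeps the calling thread responsive; it does not reduce memory use, since the whole file is still loaded at once:

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

class BackgroundSortSketch
{
    // Hypothetical helper: sorts the lines of inputPath on a worker
    // thread so the UI thread stays responsive. This does NOT reduce
    // memory use -- the entire file is still read into memory.
    public static Task SortFileAsync(string inputPath, string outputPath)
    {
        return Task.Run(() =>
        {
            string[] lines = File.ReadAllLines(inputPath);
            Array.Sort(lines, StringComparer.Ordinal);
            File.WriteAllLines(outputPath, lines);
        });
    }

    static void Main()
    {
        File.WriteAllLines("in.txt", new[] { "banana", "apple", "cherry" });
        SortFileAsync("in.txt", "out.txt").Wait();
        Console.WriteLine(string.Join(",", File.ReadAllLines("out.txt")));
        // prints "apple,banana,cherry"
    }
}
```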

Hi,

The Array.Sort method takes just as long as the code I wrote, and it also hangs the system when I sort a file larger than 300 MB. Is there any other way to sort a large file?

Code like the Sort() method is completely bound by how fast you can get the data off the disk. A file that size can never fit in the file system cache, so you're always waiting on the disk to supply the data. You're doing fairly well at 10 MB/sec; optimizing the code is never going to have a discernible effect.

Get a faster disk. Defragment the one you've got as an intermediate step.
And it's your system's fault if it hangs. It shouldn't.

-------------------------------------------------

There is another solution:
Short answer - load the data into a relational database, e.g. SQL Express, create an index on the sort column, and use a cursor-based solution, e.g. a DataReader, to read each record off in order and write it to disk.
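A hedged sketch of that database approach with SqlDataReader (the connection string, table name, and column name are all hypothetical, and the sketch assumes the file's lines have already been bulk-loaded, e.g. with SqlBulkCopy, and indexed):

```csharp
using System;
using System.Data.SqlClient;
using System.IO;

class DatabaseSortSketch
{
    // Sketch only: assumes a table Lines(Line NVARCHAR(4000)) already
    // loaded with the file's rows and an index on the Line column, so
    // the ORDER BY can stream straight off the index.
    public static void WriteSorted(string connectionString, string outputPath)
    {
        using (var conn = new SqlConnection(connectionString))
        using (var writer = new StreamWriter(outputPath))
        {
            conn.Open();
            var cmd = new SqlCommand("SELECT Line FROM Lines ORDER BY Line", conn);
            using (SqlDataReader reader = cmd.ExecuteReader())
            {
                // The reader streams one row at a time, so memory use
                // stays flat no matter how big the table is.
                while (reader.Read())
                    writer.WriteLine(reader.GetString(0));
            }
        }
    }
}
```

The point is that the database does the sorting on disk via its index, and the DataReader is forward-only, so you never hold more than one row in memory at a time.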

-------------------------------------------------

If this would help you out, you can go ahead and write your own custom sorting algorithm. For info you can go HERE. But remember, you have to be good at math ;)

Hi,

Did you try splitting the data into smaller parts, sorting each part, and then merging them? The smaller data chunks will fit into memory and will be faster to process.

Just give it a try.

Tony
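The split suggestion above is essentially the first pass of an external merge sort: read a fixed number of lines at a time, sort each batch in memory, and write each batch to its own temporary file. A minimal sketch of that pass (the chunk size and helper names are just illustrative):

```csharp
using System;
using System.Collections.Generic;
using System.IO;

class ChunkSorter
{
    // First pass of an external sort: read chunkSize lines at a time,
    // sort each chunk in memory, and write it to its own temp file.
    // Returns the paths of the sorted chunk files.
    public static List<string> SplitIntoSortedChunks(string inputPath, int chunkSize)
    {
        var chunkFiles = new List<string>();
        using (var reader = new StreamReader(inputPath))
        {
            var chunk = new List<string>(chunkSize);
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                chunk.Add(line);
                if (chunk.Count == chunkSize)
                {
                    chunkFiles.Add(WriteSortedChunk(chunk));
                    chunk.Clear();
                }
            }
            if (chunk.Count > 0)                 // leftover partial chunk
                chunkFiles.Add(WriteSortedChunk(chunk));
        }
        return chunkFiles;
    }

    static string WriteSortedChunk(List<string> chunk)
    {
        chunk.Sort(StringComparer.Ordinal);
        string path = Path.GetTempFileName();
        File.WriteAllLines(path, chunk);
        return path;
    }

    static void Main()
    {
        File.WriteAllLines("chunks_in.txt", new[] { "d", "b", "a", "c", "e" });
        var files = SplitIntoSortedChunks("chunks_in.txt", 2);
        Console.WriteLine(files.Count);          // 3 chunks: {b,d}, {a,c}, {e}
    }
}
```

Only chunkSize lines are ever in memory at once, so the chunk size can be tuned to whatever the machine can comfortably hold.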

Hi ,

Thanks for your suggestion, but the problem is that if I divide the data into chunks, in the end I still have to merge all the chunks and then sort again, so it will not work. :(

What sort of data are you attempting to sort? You mentioned 5 GB of data, and it seems you are reading all lines from a text file. With a file of that size you will consume way too much memory, as the whole file has to be stored in memory. Reading all the data, sorting, and writing will consume too much time and too many resources.

If the data allows, can't you store it in a database and use simple SQL to sort? Databases sort on their physical storage via indexes and so on. What you are doing is reading everything into memory, which works fine with a few hundred MBs but becomes cumbersome with GBs of data.

There are plenty of data structures useful for sorting, but the problem is that you are trying to store the entire file in memory. Obviously a solution is to not store the entire file in memory. It can be done.
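One standard way to do it, once the file has been split into individually sorted chunk files, is a k-way merge: keep one reader per chunk and repeatedly write out the smallest current line. Contrary to the worry above, no re-sorting is needed after the merge. A sketch, assuming the chunk files are already sorted (a linear scan picks the minimum here; a priority queue would scale better with many chunks):

```csharp
using System;
using System.IO;
using System.Linq;

class ChunkMerger
{
    // K-way merge of already-sorted chunk files: hold one "current"
    // line per chunk and repeatedly emit the smallest one. Memory use
    // is one line per chunk, regardless of total file size.
    public static void MergeSortedChunks(string[] chunkPaths, string outputPath)
    {
        var readers = chunkPaths.Select(p => new StreamReader(p)).ToArray();
        var current = readers.Select(r => r.ReadLine()).ToArray();
        using (var writer = new StreamWriter(outputPath))
        {
            while (true)
            {
                int min = -1;
                for (int i = 0; i < current.Length; i++)
                {
                    if (current[i] == null) continue;    // chunk exhausted
                    if (min < 0 || string.CompareOrdinal(current[i], current[min]) < 0)
                        min = i;
                }
                if (min < 0) break;                      // all chunks exhausted
                writer.WriteLine(current[min]);
                current[min] = readers[min].ReadLine();
            }
        }
        foreach (var r in readers) r.Dispose();
    }

    static void Main()
    {
        File.WriteAllLines("chunk0.txt", new[] { "apple", "cherry" });
        File.WriteAllLines("chunk1.txt", new[] { "banana", "date" });
        MergeSortedChunks(new[] { "chunk0.txt", "chunk1.txt" }, "merged.txt");
        Console.WriteLine(string.Join(",", File.ReadAllLines("merged.txt")));
        // prints "apple,banana,cherry,date"
    }
}
```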
