I have to divide a dataset into a training dataset (80%) and a test dataset (20%) using random sampling. I have to implement it in C#. If anyone knows how, please help me.

Use C# forum for C# questions.

I have to divide a dataset into a training dataset (80%) and a test dataset (20%) using random sampling.

What do you mean by dividing?

For example, if there are 100 rows in a data set, then we have to divide it into separate sets of 80 and 20 rows by using random sampling,
i.e. partition it into 2 separate data sets which are subsets of the original data set.

First of all, a DataSet does NOT have any rows. A DataSet is only a collection of DataTables.
A DataTable is the object that has rows, just to make that clear.

Now to your issue: use the Random class to get a random number, then use that number to take that many rows from the DataTable and put them into a new DataTable (deleting the selected rows from the original). This way you get two DataTables, and you can put them both into a DataSet.

Understood?

First of all, a DataSet does NOT have any rows. A DataSet is only a collection of DataTables.

I believe he's using 'data set' in the mathematical sense, not DataSet the class :)

Yes, this is the dataset I mean: not the ASP.NET DataSet but a mathematical data set, i.e. simply a collection of rows holding some information.

I think I can load this dataset into an ASP.NET DataTable and make a class for generating random numbers, but I have no idea how to create that class. I am only a beginner. Can you please explain how to do it? It would be very kind of you.

I'm talking about the .NET class System.Data.DataTable.

I don't know what kind of collection you use, but you should use one of the collections C# provides, or at least convert to one. And you don't need to create any class.

So, can you show us the code? Maybe we can figure something out.

I am taking my data from Excel. After uploading, I bind it to an ASP.NET DataTable, i.e. System.Data.DataTable as you said.

I haven't written any code for this partition. What I have now is a DataTable with my uploaded data. I have to partition it into two, as I said earlier: 80% and 20%.

I have no idea where to start. Please guide me.

OK, because I'm so nice and still willing to help, I wrote the code for you:

        Random r = new Random();
        private void Method()
        {
            DataTable table1 = new DataTable("table1");
            DataTable table2 = new DataTable("table2");

            //I will use only one column, but the same is if you use more!
            table1.Columns.Add("column 1", typeof(int));
            
            //create same column names as table1
            table2 = table1.Clone(); 

            //lets add some rows:
            for (int i = 0; i < 100; i++)
                table1.Rows.Add(i);

            //generate a random number:
            int myRandom = r.Next(table1.Rows.Count);

           
            //now copy rows 0..myRandom from table1 to table2 (top down), then remove them from table1.
            //collect the rows in a List<T> first (because you cannot remove rows from a collection while looping over it)!
            List<DataRow> rowsToRemove = new List<DataRow>();
            for (int i = 0; i < table1.Rows.Count; i++)
            {
                if (i <= myRandom)
                {
                    rowsToRemove.Add(table1.Rows[i]);
                    table2.Rows.Add();
                    foreach (DataColumn col in table1.Columns)
                    {
                        table2.Rows[i][col.ColumnName] = table1.Rows[i][col.ColumnName];
                    }
                }
                else
                    break;
            }

            //remove rows from table1:
            foreach (var dr in rowsToRemove)
                table1.Rows.Remove(dr);

            // NOW YOU HAVE:
            // In table1: the rows after index myRandom (up to row 99)
            // In table2: rows 0 to myRandom (including it)!
        }

This works perfectly well; I have tested it!

Yes sir, it is working perfectly. You are really helpful, thank you so much. Can you please suggest one more thing?

Here we are generating a random number less than the DataTable row count. I want its value to be such that I get 80% of the rows in one table and 20% in the other.

When assigning the value of myRandom, can I assign it like this...

Change the line int myRandom = r.Next(table1.Rows.Count); so that it holds the number of rows you need in one table, and you should end up with one table holding that percentage and the other holding the rest. For example, you'd change the line to int myRandom = (int)(table1.Rows.Count * .80); // Multiply by 80%
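
To make the arithmetic concrete, here is a minimal sketch of the split-size calculation. The class and method names (SplitSize, FirstPartCount) are hypothetical and not part of the thread's code. It uses integer arithmetic, in the same spirit as Rows.Count * 20 / 100, so that floating-point rounding cannot push the result one row low. Note also that the earlier loop tests i <= myRandom, so with myRandom set to 80 it would copy 81 rows (indexes 0 through 80); testing i < myRandom would copy exactly 80.

```csharp
using System;

public static class SplitSize
{
    // Hypothetical helper: how many rows belong in the first table
    // when it should receive `percent` percent of `totalRows`.
    // Integer division truncates, so 7 rows at 80% gives 5.
    public static int FirstPartCount(int totalRows, int percent)
    {
        if (totalRows < 0)
            throw new ArgumentOutOfRangeException(nameof(totalRows));
        if (percent < 0 || percent > 100)
            throw new ArgumentOutOfRangeException(nameof(percent));

        // Widen to long so totalRows * percent cannot overflow int.
        return (int)((long)totalRows * percent / 100);
    }
}
```

With 100 rows and 80 percent this returns 80, leaving 20 rows for the second table.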

Sir, I have an idea. Suppose there are 100 rows in a table. I will choose 20 random numbers less than 100 using a for loop, and the rows at those 20 random positions can be added to the second table. Later we can remove all those 20 rows from the first table.

I tried coding it, but it is not removing any rows from the first table.
Below is the code I tried:

List<DataRow> rowsToRemove = new List<DataRow>();
int max = samples.Rows.Count * 20 / 100;

for (int d = 0; d < max; d++)
{
    int myRandom = r.Next(samples.Rows.Count);

    for (int i = 0; i < samples.Rows.Count; i++)
    {
        if (i == myRandom)
        {
            rowsToRemove.Add(samples.Rows[i]);
            table2.Rows.Add();
            foreach (DataColumn col in samples.Columns)
            {
                table2.Rows[i][col.ColumnName] = samples.Rows[i][col.ColumnName];
            }
        }
        else
            break;
    }

   
}
foreach (var dr in rowsToRemove)
    samples.Rows.Remove(dr);

You mean you want to get 20% of the numbers (in case we have 100 numbers, this means 20 numbers)?

But not from 1 to 20, rather randomly picked from 1 to 100 (20 numbers somewhere in between)?

Yes sir, you got it. That is what I want.

You can do it this way:

Random r = new Random();

//in your class:

// 1. Generate those 20 numbers (or however many you want):

int allRows = table1.Rows.Count; //total rows of table one
int myPick = 20; // how many random row numbers to get!

List<int> myRandoms = GenerateRandomNumbers(allRows, myPick);
List<DataRow> rowsToRemove = new List<DataRow>();
for (int i = 0; i < table1.Rows.Count; i++)
{
      if (myRandoms.Contains(i)) //this will check for the correct row to select into table 2!
      {
            rowsToRemove.Add(table1.Rows[i]);
            table2.Rows.Add();
            foreach (DataColumn col in table1.Columns)
            {
                   table2.Rows[i][col.ColumnName] = table1.Rows[i][col.ColumnName];
             }
        }
         else
               break;
}


//new method to generate random numbers:
private List<int> GenerateRandomNumbers(int maxValue, int NumbersToGet)
{
      List<int> list = new List<int>();
      int number;
      for(int i = 0; i <= NumbersToGet; i++)
     {
           do
           {
                  number = r.Next(maxValue);
           }
           while(!list.Contains(number));
           list.Add(number);
      }
      return list;
}

Combine this with my earlier code; this is only the main part that was changed.

One thing to mention: the code I pasted above is written from memory, so there may be some errors, but I hope there are not :)!

Thank you sir, let me try it and I will surely let you know the result.

One thing to mention: the code I pasted above is written from memory, so there may be some errors, but I hope there are not :)!

It is not working. I bound the two resulting tables to two GridViews, but the page does not load (and no errors are shown). I waited about 20 minutes for the page to load, but it never did.

So I tried the earlier version once again, and it works. The problem is with the modified one.

??
Do you know how to use the debugger with breakpoints? I hope so, so use it. Step through the code line by line to see where the problem is.

Ok, here we go...
I did the whole project for you. I will post the solution here, but you can also download the full project HERE, so you can test it out. I hope I included everything, or at least most of the things I thought were important.

Code:

using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Drawing;
using System.Linq;
using System.Text;
using System.Windows.Forms;
using System.IO;

namespace Jul25_1
{
    public partial class Form1 : Form
    {
        Random r = new Random();
        DataTable table1;
        DataTable table2;
        List<DataRow> rowsToRemove;

        public Form1()
        {
            InitializeComponent();            
        }

        private void buttonNewShuffle_Click(object sender, EventArgs e)
        {
            Get_100_RandomNumbes();
        }

        private void buttonGet20_Click(object sender, EventArgs e)
        {
            Get_20_SelectedNumbers();
        }

        private void buttonDelete_Click(object sender, EventArgs e)
        {
            DeleteRowsFromTable1();
        }
       
        private void Get_100_RandomNumbes()
        {
            table1 = new DataTable("table1");
            listBox2.DataSource = null;
            listBox2.Items.Clear();
            table2 = null;

            //I will use only one column, but the same is if you use more!
            table1.Columns.Add("Column 1", typeof(int));          

            //lets add some rows:
            for (int i = 0; i < 100; i++)
                table1.Rows.Add(i);
            
            //set data to listBox1:
            listBox1.DataSource = new BindingSource(table1, null);
            listBox1.DisplayMember = "Column 1";
        }

        private void Get_20_SelectedNumbers()
        {
            if (table1 != null)
            {
                if (table1.Rows.Count == 100)
                {
                    table2 = new DataTable("table2");

                    //create same column names as table1
                    table2 = table1.Clone();

                    int allRows = table1.Rows.Count; //total rows of table one
                    int myPick = 20;//You can use : r.Next(table1.Rows.Count); // this is your variable of how many rows random numbers to get!

                    List<int> myRandoms = GenerateRandomNumbers(allRows, myPick);

                    rowsToRemove = new List<DataRow>();
                    for (int i = 0; i < table1.Rows.Count; i++)
                    {
                        if (myRandoms.Contains(i)) //this will check for the correct row to select into table 2!
                        {
                            rowsToRemove.Add(table1.Rows[i]);
                            table2.Rows.Add();
                            foreach (DataColumn col in table1.Columns)
                            {
                                table2.Rows[table2.Rows.Count - 1][col.ColumnName] = table1.Rows[i][col.ColumnName];
                            }
                        }
                    }

                    //set data to listBox2:
                    listBox2.DataSource = new BindingSource(table2, null);
                    listBox2.DisplayMember = "Column 1";
                }
                else
                    MessageBox.Show("You have already selected the 20 numbers. Now you can shuffle again.");
            }
            else
                MessageBox.Show("Please shuffle new numbers 1st.");
        }

        private List<int> GenerateRandomNumbers(int maxValue, int NumbersToGet)
        {
            List<int> list = new List<int>();
            int number = 0;
            for (int i = 0; i < NumbersToGet; i++)
            {
                do
                {
                    number = r.Next(maxValue);
                }
                while (list.Contains(number));
                list.Add(number);
            }
            return list;
        }

        private void DeleteRowsFromTable1()
        {
            //remove rows from table1:
            //IF YOU DO NOT WANT TO DELETE THEM NOW - create DataTable 1 as class variable (now its in this method)
            //so you can access to it and remove rows later (on some button click)!
            if (table1 != null)
            {
                if (table2 != null)
                {
                    if (table1.Rows.Count == 100)
                    {
                        foreach (var dr in rowsToRemove)
                            table1.Rows.Remove(dr);
                    }
                    else
                        MessageBox.Show("Rows have already been deleted. Create a new shuffle.");
                }
                else
                    MessageBox.Show("There are no numbers to delete yet. Please get 20 random numbers first.");
            }
            else
                MessageBox.Show("Please shuffle new numbers 1st.");
        }
    }
}

This is it.

commented: Don't spoonfeed solutions. -3

pyTony: Why not? I did nothing wrong. I pasted the solution here, and because I thought he might have difficulty running the whole thing, I gave him a link to the full project!


And please remove your downvote. I worked hard, just for you to ruin it all?

You come here, make one short (pointless) post, and then judge us? You could have told me directly if you had something to say.
Shame on you then...

My post was not pointless: the thread was originally posted in the Computer Science forum, and it does not carry any scientific interest. I think you should know the rules of DaniWeb by now, as you have quite a few posts under your belt. You might also read the post in my signature, but pay special attention to the smilies, or read the classic by goddess Narue ;) http://www.daniweb.com/software-development/c/threads/78060:

Don't give away code!

You might feel inclined to help someone else. That's great! But don't solve the problem for them. Our goal is to help people to learn, and giving away answers doesn't achieve that goal. Naturally giving away the answer is a subjective thing, so just make sure that the person you help can't just take your code and turn it in for a grade. We want the people we help to do enough work to learn something meaningful.

You know what... I was only trying to help him, because I knew he would have come back asking me for more info: how to do this, how to do that, and so on.
That's why I decided to give him the code, so he could see exactly what he needs.

But OK, from now on I will refrain from giving people code in this manner. Still, there was no need to downvote my post.

I have found that code snippets are a good medium for helping. Just make sure they are different enough from the request that there is work left for the OP in applying the principle.

You got it :)

I found where the error is, sir.
I corrected it as below, and now it is working correctly.

private List<int> GenerateRandomNumbers(int maxValue, int NumbersToGet)
{
    List<int> list = new List<int>();
    int number;
    for (int i = 0; i <= NumbersToGet; i++)
    {
        
            number = r.Next(maxValue);

            while (!list.Contains(number))
            {
                list.Add(number);
            }
    }
    return list;
}

Thank you so much sir. I asked for this help in so many forums, and none of them replied. Only you helped me get an idea of how I could do it.

You were really helpful to me. Earlier I was very confused, but because of your suggestion, even though I did not know how to debug using breakpoints, I learned it on my own. Thank you for increasing my knowledge.


Really sir, I was able to find the error by myself and correct it. Now it is working. Thank you sir.

hehe, np.
About that method which generates random numbers, it has to be like this:

private List<int> GenerateRandomNumbers(int maxValue, int NumbersToGet)
        {
            List<int> list = new List<int>();
            int number = 0;
            for (int i = 0; i < NumbersToGet; i++)
            {
                do
                {
                    number = r.Next(maxValue);
                }
                while (list.Contains(number));
                list.Add(number);
            }
            return list;
        }

Correct it. And up there you have a full solution!
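
For comparison, here is a hedged variant of the same retry-until-unique idea. The names DistinctRandom and Pick are hypothetical and not from the thread. It adds two things, both as assumptions about what a caller might want: a HashSet<int> so the duplicate check does not scan the whole list each time, and a guard for the case where more numbers are requested than exist below maxValue, since in that case the do/while loop above could never finish.

```csharp
using System;
using System.Collections.Generic;

public static class DistinctRandom
{
    // Hypothetical helper: returns `count` distinct integers in [0, maxValue).
    public static List<int> Pick(Random rng, int maxValue, int count)
    {
        if (rng == null)
            throw new ArgumentNullException(nameof(rng));
        // There are only maxValue distinct values available, so asking
        // for more than that would make the retry loop run forever.
        if (count < 0 || count > maxValue)
            throw new ArgumentOutOfRangeException(nameof(count));

        var seen = new HashSet<int>();
        var result = new List<int>(count);
        while (result.Count < count)
        {
            int candidate = rng.Next(maxValue);
            // HashSet<T>.Add returns false when the value is already present,
            // so duplicates are simply drawn again.
            if (seen.Add(candidate))
                result.Add(candidate);
        }
        return result;
    }
}
```

Called as DistinctRandom.Pick(r, table1.Rows.Count, 20), it would play the same role as GenerateRandomNumbers(allRows, myPick) in the code above.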

It was not while, sir; I mistakenly typed 'while' when copying it here. I had actually used if in my code.

private List<int> GenerateRandomNumbers(int maxValue, int NumbersToGet)
{
    List<int> list = new List<int>();
    int number;
    for (int i = 0; i <= NumbersToGet; i++)
    {
 
            number = r.Next(maxValue);
 
            if(!list.Contains(number))
            {
                list.Add(number);
            }
    }
    return list;
}
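
One caveat with the if version above: when r.Next returns a number that is already in the list, that iteration adds nothing, so the list can end up holding fewer numbers than requested (and i <= NumbersToGet runs one iteration more than NumbersToGet). A different technique that avoids retries altogether is to shuffle the row indexes and cut the shuffled order at the split point. The following is only a sketch under stated assumptions: RandomSplit and Split are hypothetical names, it uses a Fisher-Yates shuffle rather than the thread's retry loop, and it copies rows with DataTable.ImportRow instead of deleting them from the source table.

```csharp
using System;
using System.Data;

public static class RandomSplit
{
    // Hypothetical helper: copies a random `testPercent` percent of the rows
    // of `source` into `test` and the remaining rows into `train`.
    // The source table itself is left unchanged.
    public static void Split(DataTable source, int testPercent, Random rng,
                             out DataTable train, out DataTable test)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (rng == null) throw new ArgumentNullException(nameof(rng));
        if (testPercent < 0 || testPercent > 100)
            throw new ArgumentOutOfRangeException(nameof(testPercent));

        int n = source.Rows.Count;

        // Start with the row indexes in their original order: 0, 1, ..., n-1.
        int[] order = new int[n];
        for (int i = 0; i < n; i++)
            order[i] = i;

        // Fisher-Yates shuffle: swap each position with a random earlier
        // (or the same) position, walking from the end to the start.
        for (int i = n - 1; i > 0; i--)
        {
            int j = rng.Next(i + 1); // 0 <= j <= i
            int tmp = order[i];
            order[i] = order[j];
            order[j] = tmp;
        }

        int testCount = (int)((long)n * testPercent / 100);

        // Clone copies the column structure but no rows.
        train = source.Clone();
        test = source.Clone();

        // The first testCount shuffled indexes go to test, the rest to train.
        for (int k = 0; k < n; k++)
        {
            DataTable target = k < testCount ? test : train;
            target.ImportRow(source.Rows[order[k]]);
        }
    }
}
```

For the 100-row table in this thread, RandomSplit.Split(table1, 20, r, out train, out test) would leave 80 rows in train and 20 in test, with every original row appearing in exactly one of the two.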