text file help

Question

winnitbaker 0 Newbie Poster

12 Years Ago

Hi Guys,
My task is a console application that is meant to read 2 text files full of words and to output one text file from both of these. The outputted text file must include data from both text files but remove any repeats. The code below is just not outputting any data and I can't see for the life of me why not. Appreciate any help.

int i = 0;
                    while(sr.ReadLine() != null)
                    {
                   
                    strList1.Add(sr.ReadLine());
                     i++;
                    }
                    int j = 0;
                    foreach(int count in strList1[j])
                    {
                    j++;
                    while(sr.ReadLine() != null)
                    {
                    sw.WriteLine(strList1[j]);
                    }
                    }
                    int k = 0;
                    while(sr2.ReadLine() != null)
                    {
                    k++;
                    strList2.Add(sr2.ReadLine());
                    }
                    int l = 0;
                    foreach(int count2 in strList2[l])
                    {
                    l++;
                    while(sr2.ReadLine() != null)
                    {
                    if (strList2.Contains(sr2.ReadLine()))
                    {
                    //do nothing
                    }
                    else
                    {
                    sw.WriteLine(strList1[j]);
                    }
                    }
                    }
                
                }

thines01 401 Postaholic

12 Years Ago

Are you allowed to use Linq?
You could use File.ReadAllLines() (twice)
...putting the data into an array of strings.

You could then merge the two arrays and call .Distinct() (if linq is allowed).
If it is, please let me know and I will make an example.

thines01 401 Postaholic

12 Years Ago

Well, to be safe:
I would first suggest a method that can load either text file into a common repository:

public static bool LoadByTechnique0(List<string> lst_strData, string strFileName, ref string strError)
      {
         bool blnRetVal = true;

         try
         {
            // The using block disposes (and closes) the reader, so no explicit Close() is needed.
            using (StreamReader fileIn = new StreamReader(strFileName))
            {
               while (!fileIn.EndOfStream)
               {
                  // Append each trimmed line to the caller's list.
                  lst_strData.Add(fileIn.ReadLine().Trim());
               }
            }
         }
         catch (Exception exc)
         {
            // Report the failure to the caller instead of throwing.
            blnRetVal = false;
            strError = exc.Message;
         }

         return blnRetVal;
      }

...when called will add the contents of a text file to a List<string>

I suggest loading the data in one action, parsing it in another and exporting in another.

Since you won't know what to keep or dump until everything has been read, you should just load both files into the same list (calling the method twice, once per file).
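
To make the load / parse / export split concrete, here is a minimal sketch without LINQ. The class and method names (MergeSketch, Load, RemoveRepeats, Export) are made up for illustration, and the file paths are placeholders that assume the text files sit two folders above the executable.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class MergeSketch
{
    // Step 1: load. Appends every (trimmed) line of one file to the shared list.
    public static void Load(List<string> lines, string path)
    {
        using (StreamReader reader = new StreamReader(path))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                lines.Add(line.Trim());
            }
        }
    }

    // Step 2: parse. Returns the lines with repeats removed,
    // keeping the first occurrence of each line in its original order.
    public static List<string> RemoveRepeats(List<string> lines)
    {
        List<string> unique = new List<string>();
        foreach (string line in lines)
        {
            if (!unique.Contains(line))
            {
                unique.Add(line);
            }
        }
        return unique;
    }

    // Step 3: export. Writes the finished list in one go.
    public static void Export(List<string> lines, string path)
    {
        File.WriteAllLines(path, lines.ToArray());
    }

    public static void Main()
    {
        // Placeholder paths (assumption): adjust to wherever the files live.
        List<string> all = new List<string>();
        Load(all, "../../TextFile1.txt");
        Load(all, "../../TextFile2.txt");
        Export(RemoveRepeats(all), "../../TextFileOut.txt");
    }
}
```

Nothing is written until both files have been read, so the duplicate check always sees the complete data.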

Edited 12 Years Ago by thines01 because: clarity

thines01 401 Postaholic

12 Years Ago

The way I would REALLY prefer to do it would be (without exception handling):

public static void Technique1()
      {
         List<string> lst_strData1 =
            File.ReadAllLines("../../TextFile1.txt").ToList()
            .Union(File.ReadAllLines("../../TextFile2.txt")).ToList();

         File.WriteAllLines("../../TextFileOut.txt", lst_strData1.OrderBy(s => s)
            .Distinct().ToArray());
      }

skatamatic commented: Good LINQ solution +8

winnitbaker 0 Newbie Poster · Answer 1 · 2012-01-20T00:47:32+00:00

Are you allowed to use Linq?
You could use File.ReadAllLines() (twice)
...putting the data into an array of strings.
You could then merge the two arrays and call .Distinct() (if linq is allowed).
If it is, please let me know and I will make an example.

I suppose there is no reason why I can't, but I would like to be able to do it this way if possible.

skatamatic 371 Practically a Posting Shark · Answer 2 · 2012-01-20T01:06:17+00:00

Without using Linq, this should work:

List<string> outputStrings = new List<string>();

using (StreamReader sr = new StreamReader(filepath))
{
    //file 1, read everything
    while (!sr.EndOfStream)
        outputStrings.Add(sr.ReadLine());

    sr.Close();
}
using (StreamReader sr = new StreamReader(filepath2))
{
    while (!sr.EndOfStream)
    {   
        //file2, read a temp string in, if the outputstrings doesn't contain it, then add it
        string temp = sr.ReadLine();
        if (!outputStrings.Contains(temp))
            outputStrings.Add(temp);
    }
    sr.Close();
}
using (StreamWriter sw = new StreamWriter(outputpath))
{
    //write the output file; the using block closes the writer
    foreach (string s in outputStrings)
        sw.WriteLine(s);
}
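
One possible variation on the code above, not part of the original suggestion: List<string>.Contains scans the whole list on every call, which can get slow for large word files. The sketch below swaps in a HashSet<string> for the "have I seen this line?" check while still keeping the output in first-seen order. The names HashSetMerge and MergeDistinct are invented for this example. Note one behavioural difference: this version also drops repeats that occur inside the first file, which the code above does not.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class HashSetMerge
{
    // Reads each file in turn and keeps a line only the first time it is seen.
    public static List<string> MergeDistinct(IEnumerable<string> paths)
    {
        HashSet<string> seen = new HashSet<string>();
        List<string> output = new List<string>();

        foreach (string path in paths)
        {
            using (StreamReader reader = new StreamReader(path))
            {
                string line;
                while ((line = reader.ReadLine()) != null)
                {
                    // HashSet<T>.Add returns false when the item is already present.
                    if (seen.Add(line))
                    {
                        output.Add(line);
                    }
                }
            }
        }
        return output;
    }
}
```

With the variables from the code above, it could be called as File.WriteAllLines(outputpath, HashSetMerge.MergeDistinct(new[] { filepath, filepath2 }).ToArray());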

thines01 401 Postaholic Team Colleague Featured Poster · Answer 3 · 2012-01-20T01:30:52+00:00

@skatamatic: You wouldn't recommend reading the two files with the same function (for the purposes of reuse)?

I'm thinking of a technique that does not require the data to be "processed" until after it is read.

string strError = "";
         List<string> lst_strData = new List<string>();
         List<string> lst_strFilesToLoad =
            new List<string> { "../../TextFile1.txt", "../../TextFile2.txt"};

         foreach(string strFile in lst_strFilesToLoad)
         {
            if (!LoadByTechnique0(lst_strData, strFile, ref strError))
            {
               Console.WriteLine("Could not load file: {0} : {1}", strFile, strError);
               break;
            }
         }

If the files are small (less than a couple of GB), it's not an issue, and the framework I use can sort the list and remove the duplicates.

...back in the linq world:

List<string> lst_strNew = lst_strData.OrderBy(s => s).Distinct().ToList();
File.WriteAllLines("../../TextFileOut.txt", lst_strNew.ToArray());

winnitbaker 0 Newbie Poster · Answer 4 · 2012-01-20T01:36:07+00:00

Yeah, that LINQ way looks really good, but I would like to know what is wrong with mine so I can learn from my mistakes. Below is my edited code; it writes the data from both files to the text file but does not remove the replicated data.

while (!sr.EndOfStream)
                    {
                        strList1.Add(sr.ReadLine());

                    }
                    foreach (string s in strList1)
                    {
                       sw.WriteLine(s);
                    }
                   
                    while (!sr2.EndOfStream)
                    {
                        strList2.Add(sr2.ReadLine());
                    }

                    foreach (string st in strList2) 
                    {
                        if (sr.ReadToEnd().Contains(st)) //WHERE I AM TRYING TO DELETE REPLICATS
                        {
                            
                        }
                        else
                        {
                            sw.WriteLine(st);
                        }

thines01 401 Postaholic Team Colleague Featured Poster · Answer 5 · 2012-01-20T01:46:53+00:00

In the first 10 lines, you're writing to the output file without getting all of the input data.

You can't eliminate duplicates or sort if the data is already written.

So (short answer): Don't do it that way.

Do the sorting and filtering in RAM and then export the data.
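
As a rough sketch of what "filter in RAM, then export" could look like while staying close to the posted code: it keeps the names sr, sr2, sw, strList1 and strList2 from the post, while the wrapper class InMemoryFilter and the method Merge are made up here. It also sidesteps a second problem in the posted check: by the time the foreach runs, sr has already been read to the end, so sr.ReadToEnd() returns an empty string and Contains(st) is false for every non-empty line. The parameters are typed as TextReader/TextWriter (an assumption, so that StreamReader and StreamWriter can be passed in unchanged), which is why the loops test ReadLine() for null instead of using EndOfStream.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class InMemoryFilter
{
    // Reads both inputs completely, filters in memory, then writes once.
    public static void Merge(TextReader sr, TextReader sr2, TextWriter sw)
    {
        List<string> strList1 = new List<string>();
        List<string> strList2 = new List<string>();
        string line;

        // Read each line exactly once per loop iteration.
        while ((line = sr.ReadLine()) != null) strList1.Add(line);
        while ((line = sr2.ReadLine()) != null) strList2.Add(line);

        // Compare against the list already in memory, not against the
        // exhausted reader. Adding st to strList1 also filters repeats
        // that occur within the second file.
        foreach (string st in strList2)
        {
            if (!strList1.Contains(st))
            {
                strList1.Add(st);
            }
        }

        // Export only after all filtering is finished.
        foreach (string s in strList1)
        {
            sw.WriteLine(s);
        }
    }
}
```

Repeats inside the first file are still written as-is; if those must go too, strList1 needs the same Contains check while it is being filled.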

skatamatic 371 Practically a Posting Shark · Answer 6 · 2012-01-20T01:54:28+00:00

@skatamatic: You wouldn't recommend reading the two files with the same function (for the purposes of reuse)?

He can implement this any way he wants, but ultimately this is the functionality he needs. For scalability you could implement this as hardcore as you would like, such as:

public List<string> GetDistinctStringsFromFiles(List<string> FilePaths, List<Exception> errors)
        {
            List<string> distinctStrings = new List<string>();
            foreach (string sPath in FilePaths)
            {
                try
                {
                    using (StreamReader sr = new StreamReader(sPath))
                    {
                        while (!sr.EndOfStream)
                        {
                            string temp = sr.ReadLine();
                            if (!distinctStrings.Contains(temp))
                                distinctStrings.Add(temp);
                        }
                        sr.Close();
                    }
                }
                catch (Exception ex)
                {
                    errors.Add(ex);
                }
            }
            return distinctStrings; //Write this to a file if you want...
        }

Same logic, different implementation.