Hi guys,
My task is a console application that reads two text files full of words and outputs one text file built from both. The output file must include the data from both text files but with any repeats removed. The code below is not outputting any data and I can't see for the life of me why not. I'd appreciate any help.

int i = 0;
while(sr.ReadLine() != null)
{
    strList1.Add(sr.ReadLine());
    i++;
}
int j = 0;
foreach(int count in strList1[j])
{
    j++;
    while(sr.ReadLine() != null)
    {
        sw.WriteLine(strList1[j]);
    }
}
int k = 0;
while(sr2.ReadLine() != null)
{
    k++;
    strList2.Add(sr2.ReadLine());
}
int l = 0;
foreach(int count2 in strList2[l])
{
    l++;
    while(sr2.ReadLine() != null)
    {
        if (strList2.Contains(sr2.ReadLine()))
        {
            //do nothing
        }
        else
        {
            sw.WriteLine(strList1[j]);
        }
    }
}
}
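
A note on the read loops above: every call to ReadLine() consumes a line, so testing sr.ReadLine() != null in the loop condition and then calling sr.ReadLine() again in the body discards every other line. Once a reader has reached the end of the file, any later loop over the same reader never runs, which is a likely reason nothing gets written. A minimal sketch of the usual pattern, reading each line exactly once into a variable, is below. The class and method names are hypothetical, and it takes a TextReader so that it works with a StreamReader as well.

```csharp
using System.Collections.Generic;
using System.IO;

public static class LineReading
{
    // Reads every remaining line from the reader, calling ReadLine() once per line.
    public static List<string> ReadLines(TextReader reader)
    {
        List<string> lines = new List<string>();
        string line;
        // Assign the result to a variable so the line that was tested is the line that is stored.
        while ((line = reader.ReadLine()) != null)
        {
            lines.Add(line);
        }
        return lines;
    }
}
```

With the question's variables this would be called as strList1 = LineReading.ReadLines(sr); and the same again for sr2.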

Are you allowed to use Linq?
You could use File.ReadAllLines() (twice)
...putting the data into an array of strings.

You could then merge the two arrays and call .Distinct() (if linq is allowed).
If it is, please let me know and I will make an example.
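
For illustration, a minimal sketch of that idea could look like the following. The class, method and parameter names are made up for the example, the paths are whatever the caller passes in, and the comparison is case-sensitive.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class MergeWithLinq
{
    // Concatenates both sequences and drops repeated lines.
    public static List<string> MergeDistinct(IEnumerable<string> first, IEnumerable<string> second)
    {
        return first.Concat(second).Distinct().ToList();
    }

    // Reads both files, merges them and writes the result.
    public static void MergeFiles(string path1, string path2, string outputPath)
    {
        string[] lines1 = File.ReadAllLines(path1);
        string[] lines2 = File.ReadAllLines(path2);
        File.WriteAllLines(outputPath, MergeDistinct(lines1, lines2).ToArray());
    }
}
```

MergeFiles does all the reading before any writing, so the duplicate check sees the complete data from both files.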


I suppose there is no reason why i cant but I would like to be able to do it this way if possible.

Well, to be safe:
I would first suggest a method that can load either text file into a common repository:

// Requires: using System; using System.Collections.Generic; using System.IO;
public static bool LoadByTechnique0(List<string> lst_strData, string strFileName, ref string strError)
{
   bool blnRetVal = true;

   try
   {
      // The using block closes the reader, so no explicit Close() is needed.
      using (StreamReader fileIn = new StreamReader(strFileName))
      {
         while (!fileIn.EndOfStream)
         {
            // Append each trimmed line to the caller's list.
            lst_strData.Add(fileIn.ReadLine().Trim());
         }
      }
   }
   catch (Exception exc)
   {
      blnRetVal = false;
      strError = exc.Message;
   }

   return blnRetVal;
}

...when called will add the contents of a text file to a List<string>

I suggest loading the data in one action, parsing it in another and exporting in another.

Since you won't know what to keep or dump until everything has been read, you should just call the load method twice, once per file, adding both files to the same list.

Edited 4 Years Ago by thines01: clarity

Without using Linq, this should work:

// Requires: using System.Collections.Generic; using System.IO;
// filepath, filepath2 and outputpath are the two input paths and the output path.
List<string> outputStrings = new List<string>();

using (StreamReader sr = new StreamReader(filepath))
{
    //file 1, read everything (repeats within file 1 itself are kept)
    while (!sr.EndOfStream)
        outputStrings.Add(sr.ReadLine());
}
using (StreamReader sr = new StreamReader(filepath2))
{
    while (!sr.EndOfStream)
    {
        //file 2, read a temp string in; if outputStrings doesn't contain it, then add it
        string temp = sr.ReadLine();
        if (!outputStrings.Contains(temp))
            outputStrings.Add(temp);
    }
}
using (StreamWriter sw = new StreamWriter(outputpath))
{
    //write the output file; the using blocks close each stream automatically
    foreach (string s in outputStrings)
        sw.WriteLine(s);
}

Edited 4 Years Ago by skatamatic: n/a

@skatamatic: You wouldn't recommend reading the two files with the same function (for the purposes of reuse)?

I'm thinking of a technique that does not require the data to be "processed" until after it is read.

string strError = "";
List<string> lst_strData = new List<string>();
List<string> lst_strFilesToLoad =
   new List<string> { "../../TextFile1.txt", "../../TextFile2.txt" };

foreach (string strFile in lst_strFilesToLoad)
{
   if (!LoadByTechnique0(lst_strData, strFile, ref strError))
   {
      Console.WriteLine("Could not load file: {0} : {1}", strFile, strError);
      break;
   }
}

If the files are small (less than a couple of GB), holding everything in memory is not an issue, and the framework I use can sort the list and remove the duplicates.

...back in the linq world:

List<string> lst_strNew = lst_strData.Distinct().OrderBy(s => s).ToList();
File.WriteAllLines("../../TextFileOut.txt", lst_strNew.ToArray());

Edited 4 Years Ago by thines01: clarity

The way I would REALLY prefer to do it would be (without exception handling):

public static void Technique1()
{
   // Union() already removes duplicates, so no separate Distinct() is needed.
   List<string> lst_strData1 =
      File.ReadAllLines("../../TextFile1.txt")
      .Union(File.ReadAllLines("../../TextFile2.txt")).ToList();

   File.WriteAllLines("../../TextFileOut.txt", lst_strData1.OrderBy(s => s).ToArray());
}
Comments
Good LINQ solution

Yeah, that LINQ way looks really good, but I would like to know what is wrong with mine so I can learn from my mistakes. Below is my edited code; it writes the data from both files to the output file but does not remove the duplicated data.

while (!sr.EndOfStream)
                    {
                        strList1.Add(sr.ReadLine());

                    }
                    foreach (string s in strList1)
                    {
                       sw.WriteLine(s);
                    }
                   
                    while (!sr2.EndOfStream)
                    {
                        strList2.Add(sr2.ReadLine());
                    }

                    foreach (string st in strList2) 
                    {
                        if (sr.ReadToEnd().Contains(st)) //WHERE I AM TRYING TO DELETE DUPLICATES
                        {
                            
                        }
                        else
                        {
                            sw.WriteLine(st);
                        }

In the first 10 lines, you're writing to the output file without getting all of the input data.

You can't eliminate duplicates or sort if the data is already written.

So (short answer): Don't do it that way.

Do the sorting and filtering in RAM and then export the data.
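
In the edited code, sr has already been read to the end by the first loop, so sr.ReadToEnd() returns an empty string and the Contains test never matches a non-empty word. That is why nothing is filtered out. A rough sketch of the in-memory approach without LINQ, keeping the two-list structure from the question, is below. strList1 and strList2 are the names from the question; the class and method names are hypothetical.

```csharp
using System.Collections.Generic;
using System.IO;

public static class MergeInMemory
{
    // Returns the entries of strList1 followed by the entries of strList2 not already present.
    public static List<string> Merge(List<string> strList1, List<string> strList2)
    {
        List<string> output = new List<string>();
        foreach (string s in strList1)
        {
            if (!output.Contains(s))   // also drops repeats inside the first file
                output.Add(s);
        }
        foreach (string st in strList2)
        {
            if (!output.Contains(st))  // compare against the list, not the reader
                output.Add(st);
        }
        return output;
    }

    // Writes the merged list only after all filtering is finished.
    public static void Write(string outputPath, List<string> lines)
    {
        using (StreamWriter sw = new StreamWriter(outputPath))
        {
            foreach (string line in lines)
                sw.WriteLine(line);
        }
    }
}
```

List.Contains scans the whole list on every call, which is fine for small word lists; for very large files a HashSet<string> would likely be the faster choice for the lookup.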

Edited 4 Years Ago by thines01: n/a

@skatamatic: You wouldn't recommend reading the two files with the same function (for the purposes of reuse)?

He can implement this any way he wants, but ultimately this is the functionality he needs. For scalability you could implement this as hardcore as you would like, such as:

// Requires: using System; using System.Collections.Generic; using System.IO;
public List<string> GetDistinctStringsFromFiles(List<string> FilePaths, List<Exception> errors)
{
    List<string> distinctStrings = new List<string>();
    foreach (string sPath in FilePaths)
    {
        try
        {
            using (StreamReader sr = new StreamReader(sPath))
            {
                while (!sr.EndOfStream)
                {
                    // Only keep lines that have not been seen in any file so far.
                    string temp = sr.ReadLine();
                    if (!distinctStrings.Contains(temp))
                        distinctStrings.Add(temp);
                }
            }
        }
        catch (Exception ex)
        {
            // Record the failure and carry on with the next file.
            errors.Add(ex);
        }
    }
    return distinctStrings; //Write this to a file if you want...
}

Same logic, different implementation.

Edited 4 Years Ago by skatamatic: n/a
