I wrote a program to analyze a log file for a machine that my company repairs.
The program that runs the machine spits output into a text file (.log) and my program will analyze it and return the results of different calculations to the user.

The log file ideally looks like this most of the time:

>25
233966
300156
89980

>26
232342
300157
90010

>25
235908
300156
90020

>22
242106
300154
90000

This is the ideal formatting and is USUALLY the case. Out of each of those 4-line "paragraphs", only the first and the last lines (e.g. >22 and 90000) are analyzed. The middle two lines are ignored. So the program that I wrote reads in 5-line "chunks" (the four lines plus the blank separator) until the end of the file. It reads the first line, removes the ">" and stores the value in an int array. It then reads the next 3 lines and stores the last of them in another integer array. After reading is done, the arrays are analyzed and the output is displayed to the user.
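A minimal sketch of the fixed-chunk read described above, assuming the layout is always the ">" line, two ignored lines, the value line, and one blank separator. The names `FixedChunkReader` and `Read` are made up for illustration and are not from the actual program.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class FixedChunkReader
{
    // Assumes every record is exactly: ">N", two ignored lines, the value, a blank line.
    public static void Read(TextReader reader, List<int> firstValues, List<int> lastValues)
    {
        string header;
        while ((header = reader.ReadLine()) != null)
        {
            // Line 1: strip the leading ">" and keep the number.
            firstValues.Add(int.Parse(header.Substring(1)));

            // Lines 2 and 3: read and ignore.
            reader.ReadLine();
            reader.ReadLine();

            // Line 4: the value that gets analyzed.
            lastValues.Add(int.Parse(reader.ReadLine()));

            // Line 5: the blank separator (null at end of file, which is fine here).
            reader.ReadLine();
        }
    }
}
```

Because each read is tied to a fixed position in the chunk, one extra or missing blank line shifts every read that follows, so the parse calls end up on the wrong lines.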

However, occasionally the program that interacts with the machine spits out random newlines and the formatting is different, like this:

>25
233966

300156
89980

>26
232342
300157
90010
>25
235908
300156
90020

>22
242106
300154

90000
>25
236561
300155
90020
>24

237751
300155
90010

So I am wondering if anyone could give me some input on how to write an algorithm that will read this data correctly regardless of the stray newlines. Is there a way to read a line and ignore blank lines?
I originally wrote this program in C++ but I made a C# GUI version so that it would be easier for the users to use.

Any ideas?
Thanks
-Weasel

Tell me, which values do you want to get from the example above? In each part, is it these 2 numbers:
1. the number to the right of the ">" mark, and
2. the number just above the next ">" mark?

So in this example:
>25
233966

300156
89980

you would like to get 25 and 89980. Am I right?

using System;
using System.IO;

namespace TestBed {
    class TestBed {
        static void Main() {
            String currentLine = null;
            String startLine = null;
            String lastGoodLine = null;
            Boolean lookingForStart = true;

            // Dispose the reader when done so the file handle is released.
            using (StreamReader sr = new StreamReader("Test.txt")) {
                while (sr.EndOfStream == false) {
                    if (lookingForStart) {
                        // Skip lines until one starts a record with ">".
                        currentLine = sr.ReadLine();
                        if (currentLine.StartsWith(">")) {
                            startLine = currentLine;
                            lastGoodLine = null; // forget the previous record's last line
                            lookingForStart = false;
                        }
                    } else {
                        if (sr.Peek() == '>') {
                            // The next record is about to start, so report the current one.
                            Console.WriteLine("Start Line -> {0}{1}End Line ->{2}", startLine, Environment.NewLine, lastGoodLine);
                            lookingForStart = true;
                        } else {
                            // Remember the most recent non-blank line of this record.
                            currentLine = sr.ReadLine();
                            if (String.IsNullOrEmpty(currentLine.Trim()) == false) {
                                lastGoodLine = currentLine;
                            }
                        }
                    }
                }
            }

            // Report the final record, which has no following ">" line.
            if (lookingForStart == false) {
                Console.WriteLine("Start Line -> {0}{1}End Line ->{2}", startLine, Environment.NewLine, lastGoodLine);
            }

            Console.ReadLine();

        }
    }
}

Yes, that is correct. From that example I want 25 and 89980.

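I guess this is something like what you want:

using System;
using System.Collections.Generic;
using System.IO;

class Program
{
    static void Main()
    {
        // Creating a generic list for storing the wanted values.
        List<int> list = new List<int>();

        using (StreamReader sr = new StreamReader(@"C:\1\test25.txt"))
        {
            string line;
            int value;
            int counter = 0;
            while ((line = sr.ReadLine()) != null)
            {
                // Skip blank rows (empty or whitespace only) so they are never counted.
                line = line.Trim();
                if (line.Length == 0)
                    continue;

                if (line.Contains(">"))
                {
                    // First row of a part: drop the ">" and store the number.
                    value = Convert.ToInt32(line.Remove(0, 1));
                    list.Add(value);
                }

                counter++;
                if (counter == 4)
                {
                    // Fourth non-empty row of the part: store it and start over.
                    list.Add(Convert.ToInt32(line));
                    counter = 0;
                }
            }
        }
    }
}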

What the code does is check whether the line contains the ">" char. If it does, it adds the number beside the char to the list (I used a generic list, which is much more appropriate here than an array). Then it goes forward row by row.
Every part consists of 4 NOT EMPTY rows, so there is a counter which counts the non-empty rows (an empty row is not counted). When the counter reaches 4 (the 4th non-empty row in the part), the code adds that number to the list as well and resets the counter. Then the story starts again from the beginning.

I hope it's understandable enough.
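As a variation on the two approaches in this thread, here is a rough sketch that does not rely on there being exactly 4 non-empty rows per part: it skips blank rows, remembers the last non-blank row it has seen, and stores a pair whenever the next ">" row (or the end of the file) arrives. The names `LogParser` and `ParseRecords` are made up for illustration, and the sketch assumes every non-blank row holds a valid integer.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class LogParser
{
    // Returns one pair per part: Key = number after ">", Value = last number in the part.
    public static List<KeyValuePair<int, int>> ParseRecords(TextReader reader)
    {
        var records = new List<KeyValuePair<int, int>>();
        bool inRecord = false;
        int header = 0;
        string lastLine = null;
        string line;

        while ((line = reader.ReadLine()) != null)
        {
            line = line.Trim();
            if (line.Length == 0)
                continue; // blank rows carry no information

            if (line.StartsWith(">"))
            {
                // A new part starts, so close the previous one if it had a value row.
                if (inRecord && lastLine != null)
                    records.Add(new KeyValuePair<int, int>(header, int.Parse(lastLine)));

                header = int.Parse(line.Substring(1));
                lastLine = null;
                inRecord = true;
            }
            else if (inRecord)
            {
                // Keep overwriting, so only the last non-blank row survives.
                lastLine = line;
            }
        }

        // The final part has no following ">" row to close it.
        if (inRecord && lastLine != null)
            records.Add(new KeyValuePair<int, int>(header, int.Parse(lastLine)));

        return records;
    }
}
```

It takes a `TextReader`, so it can be fed a file via `File.OpenText(path)` or a `StringReader` when trying it out on sample text.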

Thank you both, they both look like great algorithms. I will play around with my code and use your suggestions and let you know when I have gotten it to work. Thank you.
