Hi there,

I have been struggling with this for quite a while now. I am a novice in .NET, so no wonder I can't figure it out.

I have a tab-delimited file, where the sequence of lines should be:
01UN1........................
04UN1..........................
09UN1........................
01UN2....................
04UN2.............
04UN2...............
09UN2................


However, sometimes the file contains redundant 04 rows, which need to be removed so that only the last one (the true one) remains in the file. The one in red (the first 04UN2 row above) should be deleted.
I started creating a console application, and from what I read online and in books I figured I would need the file in an array. So I managed to read the file into a string array and then write it to a new text file.
Here is my code:

Imports System
Imports System.Console
Imports System.IO
Module Module1
    Sub Main()

       
        ' text document to array
        Dim myarray As String()
        myarray = File.ReadAllLines("C:\Original.txt")
        ' placeholder loop: the per-row checks would go here
        For i As Integer = 0 To myarray.GetUpperBound(0)
        Next
        'write the array to a new text file
        System.IO.File.WriteAllLines("C:\newfile.txt", myarray)

End Sub
End Module

I now need to loop through every row, take the part 01UN1 from the line and see if the next line starts with 04UN1; if so, check if the line after that starts with 09UN1. If that's correct, write the three lines to the new file, then go to the next line that starts with 01 (it could be 01UN2, for example) and do the same check. If there is more than one 04UN2 line, delete all of them except the last one, then continue onwards until the end of the file.
The unique part of every row is the string at characters 3 to 5 of the line. So if the unique part is UN1, then we should only have these rows:
01UN1
04UN1
09UN1
and then continue from the following line that starts with 01 and has a different unique identifier, for example UN5.
I am not sure how to reference each row from the array. I also need to set that unique identifier as a variable and then use If conditions. I am not even sure if that's the right track.
Also, I cannot be sure what the list of unique parts for every file would be, so I cannot put them in a separate file and compare against them. The checks should be made against the unique parts present in the file at rows starting with 01. Also, only rows starting with 04 can be redundant.

If someone can help me out here I would be really grateful.

Thanks in advance
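
On the question of how to reference a single row and pull out the identifier: a minimal sketch, assuming the record type is always the first two characters and the identifier is always the next three (as in UN1). The module name and the helper functions RecordType and UnitId are made up for illustration, and the sample rows stand in for the real file.

```vbnet
Imports System

Module RowPartsSketch
    ' Record type = first two characters ("01", "04", "09").
    Function RecordType(ByVal line As String) As String
        Return line.Substring(0, 2)
    End Function

    ' Identifier = the three characters after the record type ("UN1").
    ' Assumption: identifiers are always exactly three characters long.
    Function UnitId(ByVal line As String) As String
        Return line.Substring(2, 3)
    End Function

    Sub Main()
        ' Sample rows; in the real program this would be File.ReadAllLines(...)
        Dim myarray As String() = {"01UN1....", "04UN1....", "09UN1...."}
        For i As Integer = 0 To myarray.Length - 1
            Dim row As String = myarray(i) ' index the array to get one row
            If row.Length < 5 Then Continue For ' skip blank or short lines
            Console.WriteLine(RecordType(row) & " / " & UnitId(row))
        Next
    End Sub
End Module
```

The length check matters because Substring throws an exception when the line is shorter than the requested range, which a trailing blank line in the file would trigger.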

Sub Main()
        ' text document to array
        Dim myarray As String() = File.ReadAllLines("C:\Original.txt")
        Array.Reverse(myarray) 'this way the newest entry always comes first
        Dim tmpArray As New ArrayList 'create temp array to hold the final result

        For Each _substring As String In myarray
            If Not isInArray(_substring.Substring(0, 5), tmpArray) Then
                tmpArray.Add(_substring)
            End If
        Next
        myarray = CType(tmpArray.ToArray(GetType(String)), String()) 'copy result back to array
        tmpArray = Nothing 'not needed anymore
        Array.Sort(myarray) 'optional: sorts the lines alphabetically (the sample output below is shown without this step)
        'write the array to a new text file
        System.IO.File.WriteAllLines("C:\Original2.txt", myarray)
        Console.WriteLine("done")
        Console.Read()
    End Sub

    Private Function isInArray(ByVal _substring As String, ByVal tmpArray As ArrayList) As Boolean
        For Each item As String In tmpArray
            If item.StartsWith(_substring) Then
                Return True
            End If
        Next
        Return False
    End Function

input:
01UN1........................
04UN1..........................
09UN1........................
01UN2....................
04UN2............1
04UN2..............2
09UN2................

output:
09UN2................
04UN2..............2
01UN2....................
09UN1........................
04UN1..........................
01UN1........................

Thank you so much for the code, it works perfectly.
The only adjustment I need is that in the resulting file all records should be in this order:
01UN1..........
04UN1...........
09UN1..........
01UN2..........
04UN2...........
09UN2............
instead of :
09UN2................
04UN2..............2
01UN2....................
09UN1........................
04UN1..........................
01UN1........................

How can I reverse it back to the order of the original file?
I tried reading the Original2.txt file into an array, then used Reverse and wrote the result to a third file, but it didn't work.
Here is the code:

Sub Reverse()
' read the clean file in a second array
        Dim myarray1 As String() = File.ReadAllLines("C:\Original2.txt")
        ' reverse 
        Array.Reverse(myarray1)
        System.IO.File.WriteAllLines("C:\Original3.txt", myarray1)


    End Sub

Would I need a loop and a function again?

Thanks again. You are a star!

To get this done, just reverse the array again:

Sub Main()
        ' text document to array
        Dim myarray As String() = File.ReadAllLines("C:\Original.txt")
        Array.Reverse(myarray) 'this way the newest entry always comes first
        Dim tmpArray As New ArrayList 'create temp array to hold the final result

        For Each _substring As String In myarray
            If Not isInArray(_substring.Substring(0, 5), tmpArray) Then
                tmpArray.Add(_substring)
            End If
        Next
        myarray = CType(tmpArray.ToArray(GetType(String)), String()) 'copy result back to array
        tmpArray = Nothing 'not needed anymore
        Array.Reverse(myarray) 'reverse it back to the original order
        'write the array to a new text file
        System.IO.File.WriteAllLines("C:\Original2.txt", myarray)
        Console.WriteLine("done")
        Console.Read()
    End Sub

    Private Function isInArray(ByVal _substring As String, ByVal tmpArray As ArrayList) As Boolean
        For Each item As String In tmpArray
            If item.StartsWith(_substring) Then
                Return True
            End If
        Next
        Return False
    End Function

input:
01UN1........................
04UN1..........................
09UN1........................
01UN2....................
04UN2............1
04UN2..............2
09UN2................

output:
01UN1........................
04UN1..........................
09UN1........................
01UN2....................
04UN2..............2
09UN2................
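
As a side note, the same result can probably be reached in one forward pass without reversing twice. This is only a sketch of a different technique from the code above, under the assumption that redundant rows always sit directly next to each other, as in the sample input. The module name and the function KeepLastOfEachRun are hypothetical.

```vbnet
Imports System
Imports System.Collections.Generic

Module KeepLastSketch
    ' Reduces each run of adjacent rows sharing the same first five
    ' characters to the last row of that run. Original order is kept.
    Function KeepLastOfEachRun(ByVal lines As String()) As List(Of String)
        Dim result As New List(Of String)
        For Each line As String In lines
            If line.Length < 5 Then
                result.Add(line) ' too short to carry a key; pass through unchanged
                Continue For
            End If
            Dim key As String = line.Substring(0, 5)
            Dim lastIndex As Integer = result.Count - 1
            If lastIndex >= 0 AndAlso result(lastIndex).StartsWith(key, StringComparison.Ordinal) Then
                result(lastIndex) = line ' a later duplicate replaces the earlier one
            Else
                result.Add(line)
            End If
        Next
        Return result
    End Function

    Sub Main()
        ' Sample rows; in the real program this would be File.ReadAllLines(...)
        Dim input As String() = {"01UN1..", "04UN1..", "09UN1..", "01UN2..", "04UN2..1", "04UN2..2", "09UN2.."}
        For Each line As String In KeepLastOfEachRun(input)
            Console.WriteLine(line)
        Next
    End Sub
End Module
```

Unlike the isInArray approach, this only compares each row with the previously kept row, so it avoids the nested loop on large files, but it would not catch duplicates that are separated by other rows.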

Thank you very much :).
More than perfect. Thanks for commenting everything as well, so I know what you did.
Marking the thread as solved :)
