Hi Guys,

I have spent a lot of time looking around, reading the MSDN help and whatnot, but I still don't seem to get the right answer.
What I would like to do is parse an XML file into an array in a VB.NET console application. The XML file is very simple. Once I have the values from the XML file in an array, I need to use them as parameters to compare against while reading a text file row by row.
So far I have got the XML file read and displayed in the command prompt window. I have no idea how to get further and accomplish the rest.
I think that if I can get the first step done (binding the XML elements to an array) I will manage the rest, although I have never done that before.
Here is an example of the XML file:

Code:

<?xml version="1.0" encoding="utf-8"?>
<Strings>
<String>
		<Name>**ABC</Name>
		<CountStartPosition>5 </CountStartPosition>
		<CountEndPosition>10</CountEndPosition>
	</String>
		<String>
		<Name>AAAAAAAA</Name>
		<CountStartPosition>96</CountStartPosition>
		<CountEndPosition>103</CountEndPosition>
	</String>
	<String>
		<Name>BBBBBBBB</Name>
		<CountStartPosition>96</CountStartPosition>
		<CountEndPosition>103</CountEndPosition>
	</String>
		<String>
		<Name>CCCCCCCC</Name>
		<CountStartPosition>96</CountStartPosition>
		<CountEndPosition>103</CountEndPosition>
	</String>
	</Strings>
	<Treaty>
		<Name>WRHCLT4A</Name>
		<CountStartPosition>96</CountStartPosition>
		<CountEndPosition>103</CountEndPosition>
	</Treaty>
	<Treaty>
		<Name>WRHCLT5A</Name>
		<CountStartPosition>96</CountStartPosition>
		<CountEndPosition>103</CountEndPosition>
	</Treaty>
<Treaty>
		<Name>##T</Name>
		<CountStartPosition>1</CountStartPosition>
		<CountEndPosition>3</CountEndPosition>
	</Treaty>
</Treaties>

OK, a little explanation of what is what.
The Name element of each String is the text I need to look up (compare against) in the text file; it will be of type String.
CountStartPosition tells the program at which character in each row of the text file the search for that particular string should start, and CountEndPosition is the last character in the row where the search should stop. These should be of type Integer.
For example, let's say we have a text file that has a row like this:
VVNNMN HUIMKOIB BGTUUOLKJHG VFAAAAAAAANMMM
and let's suppose AAAAAAAA sits between the 96th and 103rd characters in every row. What we need to do is verify whether any of the strings in the XML document, with their corresponding start and end positions, is present in a row of the original file.
Every row will be checked.
I am not really sure how to pass the element values into an array, and then perhaps refer to every value in the array as a parameter in the code that reads the text file.
I would be really grateful if anyone can help me out.
What I have managed to write so far is:

Code:

Imports System.Xml

Module Module1

    Sub Main()
        Dim m_xmld As XmlDocument
        Dim m_nodelist As XmlNodeList
        Dim m_node As XmlNode

        'Create the XML Document
        m_xmld = New XmlDocument()

        'Load the Xml file
        m_xmld.Load("path\XMLFileName.xml")

       

        'Get the list of name nodes 
        m_nodelist = m_xmld.SelectNodes("/Strings/String")

        'Loop through the nodes 

        For Each m_node In m_nodelist

            'Get the Name Element Value
            Dim NameValue = m_node.ChildNodes.Item(0).InnerText

            'Get the Count Start Position Value
            Dim CountStartPositionValue = m_node.ChildNodes.Item(1).InnerText
            'Get the Count End Position Value
            Dim CountEndPositionValue = m_node.ChildNodes.Item(2).InnerText



            'Write Result to the Console
            Console.Write(" Name: " & NameValue & " CountStartPositionValue: " _
            & CountStartPositionValue & " CountEndPositionValue: " _
            & CountEndPositionValue)
            Console.Write(vbCrLf)
        Next
        Console.Read()



    End Sub

End Module

If someone could help me out here I would be really grateful.
Thanks in advance


Here's a function to parse an XML file into arrays

Imports System.IO
Imports System.Xml

Public Sub ImportXML(ByVal FileName As String, _
  ByRef Names() As String, ByRef CountStart() As Integer, ByRef CountEnd() As Integer)
  ' Reads Name, CountStartPosition and CountEndPosition of every String element into the three arrays
  Dim FStream As FileStream
  Dim XReader As Xml.XmlTextReader
  Dim XDoc As XPath.XPathDocument
  Dim XNav As XPath.XPathNavigator
  Dim XIter As XPath.XPathNodeIterator
  Dim TempStr As String
  Dim TempInt As Integer
  Dim TempCount As Integer

  TempCount = 0
  FStream = New FileStream(FileName, FileMode.Open, FileAccess.Read)
  XReader = New Xml.XmlTextReader(FStream)
  XDoc = New XPath.XPathDocument(XReader)
  XNav = XDoc.CreateNavigator()

  ' Select every String element under Strings
  XNav.MoveToRoot()
  XIter = XNav.Select("descendant::Strings/String")
  While XIter.MoveNext
    ReDim Preserve Names(TempCount)
    ReDim Preserve CountStart(TempCount)
    ReDim Preserve CountEnd(TempCount)
    ' Name
    TempStr = XIter.Current.SelectSingleNode("descendant::Name").Value
    Names(TempCount) = TempStr
    ' CountStartPosition
    TempStr = XIter.Current.SelectSingleNode("descendant::CountStartPosition").Value
    If Integer.TryParse(TempStr, TempInt) Then
      CountStart(TempCount) = TempInt
    Else
      CountStart(TempCount) = -1 ' -1 indicates invalid value
    End If
    ' CountEndPosition
    TempStr = XIter.Current.SelectSingleNode("descendant::CountEndPosition").Value
    If Integer.TryParse(TempStr, TempInt) Then
      CountEnd(TempCount) = TempInt
    Else
      CountEnd(TempCount) = -1 ' -1 indicates invalid value
    End If
    TempCount += 1
  End While

  FStream.Close()

End Sub

I wrote it in a bit of a hurry, so there's no exception handling and not many comments in the code. Here's how you call it

Dim Names(0) As String
Dim CountStart(0) As Integer
Dim CountEnd(0) As Integer

ImportXML("D:\test.xml", Names, CountStart, CountEnd)
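To check what ended up in the arrays, a short loop after the call can print them to the console. The following is only a sketch: FormatEntry is a hypothetical helper that is not part of the code above, and the arrays are filled by hand with values from the test XML so that the snippet runs on its own. In real code the three arrays would come from ImportXML.

```vbnet
Imports System

Module PrintArraysSketch
    ' Hypothetical helper: formats one entry of the three parallel arrays.
    Function FormatEntry(ByVal index As Integer, ByVal name As String, _
                         ByVal startPos As Integer, ByVal endPos As Integer) As String
        Return String.Format("{0}: Name={1}, Start={2}, End={3}", _
                             index, name, startPos, endPos)
    End Function

    Sub Main()
        ' Sample values standing in for what ImportXML fills in
        ' (copied from the test XML; real code would call ImportXML here).
        Dim Names() As String = {"**ABC", "AAAAAAAA"}
        Dim CountStart() As Integer = {5, 96}
        Dim CountEnd() As Integer = {10, 103}

        ' The arrays are parallel, so one index addresses all three.
        For i As Integer = 0 To Names.GetUpperBound(0)
            Console.WriteLine(FormatEntry(i, Names(i), CountStart(i), CountEnd(i)))
        Next
    End Sub
End Module
```

Because the three arrays always have the same length, GetUpperBound(0) of any of them can serve as the loop limit.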

Your original XML sample wasn't valid XML. I dropped all the "Treaty" parts and used the following XML for testing

<?xml version="1.0" encoding="utf-8"?>
<Strings>
<String>
		<Name>**ABC</Name>
		<CountStartPosition>5 </CountStartPosition>
		<CountEndPosition>10</CountEndPosition>
	</String>
		<String>
		<Name>AAAAAAAA</Name>
		<CountStartPosition>96</CountStartPosition>
		<CountEndPosition>103</CountEndPosition>
	</String>
	<String>
		<Name>BBBBBBBB</Name>
		<CountStartPosition>96</CountStartPosition>
		<CountEndPosition>103</CountEndPosition>
	</String>
		<String>
		<Name>CCCCCCCC</Name>
		<CountStartPosition>96</CountStartPosition>
		<CountEndPosition>103</CountEndPosition>
	</String>
</Strings>

HTH


Hi Teme64,

Thank you very much for posting the code and editing the XML.
I ran the code and it seems to work. I am not sure whether I should see the values that are now bound to the arrays in the command prompt window; when I run the code I see nothing there.

I have added some more code that reads a tab-delimited text file. For every row I have to check whether the current Names value exists at the specified position in that row. If it does, write the line to a file, Match.txt.
I am not sure how to use the CountStart and CountEnd parameters in an expression, for example.
If Line is a single line from the file, I want to tell the program to
check whether, in that line, between start position CountStart and end position CountEnd, there is a string value (a part of the line) that equals the current Names value.
I am not sure if that can be done. I have put in a line for that using an expression, but it doesn't seem correct to me :P
The full code I have so far is:

Imports System
Imports System.IO
Imports System.Xml



Module Module1
    Public Sub ImportXML(ByVal FileName As String, ByRef Names() As String, _
 ByRef CountStart() As Integer, ByRef CountEnd() As Integer)
        Dim FStream As FileStream
        Dim XReader As System.Xml.XmlReader
        Dim XNav As XPath.XPathNavigator
        Dim XDoc As XPath.XPathDocument
        Dim XIter As XPath.XPathNodeIterator
        Dim TempStr As String
        Dim TempInt As Integer
        Dim TempCount As Integer


        TempCount = 0
        FStream = New FileStream(FileName, FileMode.Open, FileAccess.Read)
        XReader = New Xml.XmlTextReader(FStream)
        XDoc = New XPath.XPathDocument(XReader)
        XNav = XDoc.CreateNavigator()


        XNav.MoveToRoot()
        XIter = XNav.Select("descendant::Treaties/Treaty")
        While XIter.MoveNext
            ReDim Preserve Names(TempCount)
            ReDim Preserve CountStart(TempCount)
            ReDim Preserve CountEnd(TempCount)
            ' Name    
            TempStr = XIter.Current.SelectSingleNode("descendant::Name").Value
            Names(TempCount) = TempStr
            ' CountStartPosition    
            TempStr = XIter.Current.SelectSingleNode("descendant::CountStartPosition").Value
            If Integer.TryParse(TempStr, TempInt) Then
                CountStart(TempCount) = TempInt
            Else

                CountStart(TempCount) = -1
                ' -1 indicates invalid value    
            End If
            ' CountEndPosition    
            TempStr = XIter.Current.SelectSingleNode("descendant::CountEndPosition").Value
            If Integer.TryParse(TempStr, TempInt) Then
                CountEnd(TempCount) = TempInt
            Else
                CountEnd(TempCount) = -1
                ' -1 indicates invalid value    
            End If
            TempCount += 1
        End While

        FStream.Close()

    End Sub



    Sub Main()
        Dim Names(0) As String
        Dim CountStart(0) As Integer
        Dim CountEnd(0) As Integer

        ImportXML("C:\Strings.xml", Names, CountStart, CountEnd)
        Dim Line As String
        Dim FileToSplit As New FileStream("C:\FileToSplit.txt", FileMode.Open)
        Dim stream As New StreamReader(FileToSplit)
        stream.BaseStream.Seek(0, SeekOrigin.Begin)
        Dim FileMatch As New FileStream("C:\FileMatch.txt", FileMode.OpenOrCreate)
        Dim filetosplitstream As New IO.StreamWriter(FileMatch)
        Console.SetIn(stream)
        Console.SetOut(filetosplitstream)
        Line = Mid(Line, CountStart(0), CountEnd(0))



        While stream.Peek() > -1

            Line = Console.ReadLine()


            If Line.ToString = Names(0) Then

                Console.WriteLine(Line)


            End If

        End While

        stream.Close()
        filetosplitstream.Close()

    End Sub



End Module

OK, the places where I use Names(0), CountStart(0) and CountEnd(0) are where I want to use the current values in the array. For every row in the text file I need to check against all the values in the array, looping through it, then go to the next row of the file and do the same, and so on.
The code without parameters works because I have used it before, but using variables makes it difficult for me.
If you could help with this it would be greatly appreciated.
Thanks so much

Here's a matching function

Public Function MatchString(ByVal OneLine As String, ByVal StringToMatch As String, _
  ByVal StartPos As Integer, ByVal EndPos As Integer) As Boolean
  ' Check if StringToMatch is a substring of OneLine at position StartPos.
  ' StartPos and EndPos are zero-based and inclusive, so StringToMatch has to be
  ' of length (EndPos - StartPos) + 1, otherwise it can't be a match.
  ' If the match is found, returns True.

  ' 1. Check StringToMatch and OneLine (input validation)
  ' If either is Nothing or an empty string, return False
  If String.IsNullOrEmpty(StringToMatch) OrElse String.IsNullOrEmpty(OneLine) Then
    Return False
  End If

  ' 2. Check StartPos >= 0 and StartPos < EndPos (these are valid values), otherwise return False
  If StartPos < 0 OrElse StartPos >= EndPos Then
    Return False
  End If

  ' 3. Check that StringToMatch is of length (EndPos - StartPos) + 1
  If StringToMatch.Length <> ((EndPos - StartPos) + 1) Then
    ' Length didn't match so StringToMatch can't be found
    Return False
  End If

  ' 4. Check that StartPos and EndPos are "inside" of OneLine.
  ' EndPos is inclusive, so it must be a valid index (less than OneLine.Length)
  If StartPos >= OneLine.Length OrElse EndPos >= OneLine.Length Then
    Return False
  End If

  ' At this point all the input validation is done
  ' 5. Check if StringToMatch is found. Case insensitive match. Change True -> False for a case sensitive match
  If String.Compare(OneLine.Substring(StartPos, (EndPos - StartPos) + 1), StringToMatch, True) = 0 Then
    Return True
  Else
    ' No match
    Return False
  End If

End Function

I tried to add as many comments as possible to make it easy to modify. One point to notice is that string indexing starts from zero, i.e. the first char is at index zero and so on. If your XML file uses 1-based indexing, subtract one before passing the Start and End indices.
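As an illustration of that index adjustment, here is a small sketch. It assumes the positions in the XML file are 1-based and inclusive, which the thread does not confirm. ToZeroBased is a hypothetical helper and the sample line is made up.

```vbnet
Imports System

Module IndexConversionSketch
    ' Hypothetical helper: converts a 1-based position (as it might be
    ' stored in the XML file) to the 0-based index used by String.Substring.
    Function ToZeroBased(ByVal oneBasedPos As Integer) As Integer
        Return oneBasedPos - 1
    End Function

    Sub Main()
        ' Made-up sample row; the first three characters are the text to find.
        Dim line As String = "##T0001 SOME DATA"

        ' 1-based, inclusive positions 1..3 become 0-based 0..2.
        Dim startPos As Integer = ToZeroBased(1)
        Dim endPos As Integer = ToZeroBased(3)

        ' Inclusive range, so the length is (endPos - startPos) + 1.
        Dim part As String = line.Substring(startPos, (endPos - startPos) + 1)
        Console.WriteLine(part)   ' prints ##T
    End Sub
End Module
```

With the function above, the equivalent call would be MatchString(Line, Names(i), CountStart(i) - 1, CountEnd(i) - 1), again only if the XML positions really are 1-based.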

Here are a few examples of how to call it

MessageBox.Show(MatchString("abcde", "BC", 1, 2).ToString) ' True
MessageBox.Show(MatchString("abcde", "BC", 5, 6).ToString) ' False (StartPos EndPos not "inside" OneLine)
MessageBox.Show(MatchString("abcde", "aBCde", 0, 4).ToString) ' True
MessageBox.Show(MatchString("abcde", "wbc", 0, 2).ToString) ' False ("abc" <> "wbc")

It's possible that I missed some point. But try this code and if there's something wrong or missing, we'll define and fix the code.

Hi Teme64,

Thanks again for the code and explanations :)
The last thing I am not sure of is the following. Right, we have two functions now: one creates an array and the other one checks the start and end position of a string and its value.
OK, how do I tell the application, in the Main method, to create a new file and write rows there in case
1. the checked row contains the first string in the array at the specified positions, or contains any of the array strings at the positions specified in the XML file?
for example say we have row
ADGHNNN AAAAAA VBGHY
and in the xml this is our second string
with description
<string>
<name>AAAAAA</name>
<countStart>10</countStart>
<countEnd>16</countEnd>
</string>
so we have a match on position and value in the checked row of the file, and the row needs to be written to MatchFile.txt.
Sorry to be such a pain, but it sounds to me like we need a loop within a loop: loop through the rows in the file, and then for each row loop through the String elements of the XML and compare.

Thanks in advance

Right we have two functions now one creates an array and the other one checks start end position of a string and value of it.

Yes, now you have to combine them. And you do need a loop, since the content of the XML file is now in the arrays.

Here's the re-written While loop from your code

Dim i As Integer ' Loop counter
Dim bMatch As Boolean ' Indicates if we found a match
While Stream.Peek() > -1
  Line = Console.ReadLine()
  bMatch = False
  ' Loop xml file
  For i = 0 To Names.GetUpperBound(0)
    ' Check for a match
    If MatchString(Line, Names(i), CountStart(i), CountEnd(i)) Then
      ' It's a match. Set the flag and exit inner loop
      bMatch = True
      Exit For
    End If
  Next i
  If bMatch Then
    ' We had a match so write the line to the output stream
    Console.WriteLine(Line)
  End If
End While

If I understood right, it should now do the job (i.e. output matching lines only).
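One possible variation, offered only as a sketch and not as something the code above requires: the same nested loop can read from a StreamReader and write to a StreamWriter directly instead of redirecting the console streams, with Using blocks closing both files even if an error occurs. MatchesAt is a compact, hypothetical stand-in for MatchString (zero-based, inclusive positions), FilterLines is also a hypothetical name, and the sample data is made up.

```vbnet
Imports System
Imports System.IO

Module FilterSketch
    ' Compact stand-in for MatchString: zero-based, inclusive positions,
    ' case-insensitive comparison.
    Function MatchesAt(ByVal line As String, ByVal name As String, _
                       ByVal startPos As Integer, ByVal endPos As Integer) As Boolean
        If String.IsNullOrEmpty(line) OrElse String.IsNullOrEmpty(name) Then Return False
        If startPos < 0 OrElse endPos < startPos OrElse endPos >= line.Length Then Return False
        If name.Length <> (endPos - startPos) + 1 Then Return False
        Return String.Compare(line.Substring(startPos, name.Length), name, True) = 0
    End Function

    ' Copies every line of inputPath that matches any entry to outputPath.
    ' Returns the number of lines written.
    Function FilterLines(ByVal inputPath As String, ByVal outputPath As String, _
                         ByVal names() As String, ByVal starts() As Integer, _
                         ByVal ends() As Integer) As Integer
        Dim written As Integer = 0
        Using reader As New StreamReader(inputPath)
            ' StreamWriter(path) overwrites an existing file.
            Using writer As New StreamWriter(outputPath)
                Dim line As String = reader.ReadLine()
                While line IsNot Nothing
                    ' Inner loop: test the row against every XML entry.
                    For i As Integer = 0 To names.GetUpperBound(0)
                        If MatchesAt(line, names(i), starts(i), ends(i)) Then
                            writer.WriteLine(line)
                            written += 1
                            Exit For
                        End If
                    Next
                    line = reader.ReadLine()
                End While
            End Using
        End Using
        Return written
    End Function

    Sub Main()
        ' Made-up sample data in temporary files so the sketch runs anywhere.
        Dim inputPath As String = Path.GetTempFileName()
        Dim outputPath As String = Path.GetTempFileName()
        File.WriteAllLines(inputPath, New String() {"xxAAAAyy", "xxBBBByy", "xxCCCCyy"})

        Dim names() As String = {"AAAA", "CCCC"}
        Dim starts() As Integer = {2, 2}
        Dim ends() As Integer = {5, 5}

        ' Two of the three sample rows match, so this prints 2.
        Console.WriteLine(FilterLines(inputPath, outputPath, names, starts, ends))
    End Sub
End Module
```

Writing through a StreamWriter opened on the path replaces any previous output file, whereas FileMode.OpenOrCreate keeps the old contents and can leave stale lines at the end if the new output is shorter.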

Hi Teme64,

I can't thank you enough.
The code works perfectly well. The only thing it doesn't do is pick up the first and the last String in the array, and I don't understand why. The positions and values specified are correct, but the code still passes them by and doesn't write them to the file.
I was also thinking: because the first and the last element in the array should always go in as the first and last line of the new file respectively, is there a way the code can do that, say always write the first element of the array as the first line and the last element as the last row in the file?
Here is the full code I have, thanks to you:

Imports System
Imports System.IO
Imports System.Xml



Module Module1
    Public Sub ImportXML(ByVal FileName As String, ByRef Names() As String, _
 ByRef CountStart() As Integer, ByRef CountEnd() As Integer)
        Dim FStream As FileStream
        Dim XReader As System.Xml.XmlReader
        Dim XNav As XPath.XPathNavigator
        Dim XDoc As XPath.XPathDocument
        Dim XIter As XPath.XPathNodeIterator
        Dim TempStr As String
        Dim TempInt As Integer
        Dim TempCount As Integer


        TempCount = 0
        FStream = New FileStream(FileName, FileMode.Open, FileAccess.Read)
        XReader = New Xml.XmlTextReader(FStream)
        XDoc = New XPath.XPathDocument(XReader)
        XNav = XDoc.CreateNavigator()


        XNav.MoveToRoot()
        XIter = XNav.Select("descendant::Treaties/Treaty")
        While XIter.MoveNext
            ReDim Preserve Names(TempCount)
            ReDim Preserve CountStart(TempCount)
            ReDim Preserve CountEnd(TempCount)
            ' Name    
            TempStr = XIter.Current.SelectSingleNode("descendant::Name").Value
            Names(TempCount) = TempStr
            ' CountStartPosition    
            TempStr = XIter.Current.SelectSingleNode("descendant::CountStartPosition").Value
            If Integer.TryParse(TempStr, TempInt) Then
                CountStart(TempCount) = TempInt
            Else

                CountStart(TempCount) = -1
                ' -1 indicates invalid value    
            End If
            ' CountEndPosition    
            TempStr = XIter.Current.SelectSingleNode("descendant::CountEndPosition").Value
            If Integer.TryParse(TempStr, TempInt) Then
                CountEnd(TempCount) = TempInt
            Else
                CountEnd(TempCount) = -1
                ' -1 indicates invalid value    
            End If
            TempCount += 1


        End While

        FStream.Close()

    End Sub

    Public Function MatchString(ByVal OneLine As String, ByVal StringToMatch As String, ByVal StartPos As Integer, ByVal EndPos As Integer) As Boolean
        If String.IsNullOrEmpty(StringToMatch) OrElse String.IsNullOrEmpty(OneLine) Then

            Return False
        End If
        If StartPos < 0 OrElse StartPos >= EndPos Then


            Return False
        End If
        If StringToMatch.Length <> ((EndPos - StartPos) + 1) Then
            Return False
        End If
        If StartPos >= OneLine.Length OrElse EndPos >= OneLine.Length Then
            Return False
        End If
        If String.Compare(OneLine.Substring(StartPos, (EndPos - StartPos) + 1), StringToMatch, True) = 0 Then


            Return True
        Else
            ' No match    
            Return False
        End If

    End Function




    Sub Main()

        Dim Names(0) As String
        Dim CountStart(0) As Integer
        Dim CountEnd(0) As Integer

        ImportXML("C:\Strings.xml", Names, CountStart, CountEnd)

        Dim FileToSplit As New FileStream("C:\FileToSplit.txt", FileMode.Open)
        Dim stream As New StreamReader(FileToSplit)
        stream.BaseStream.Seek(0, SeekOrigin.Begin)
        
        Dim FileMatch As New FileStream("C:\FileMatch.txt", FileMode.OpenOrCreate)

        Dim filetosplitstream As New StreamWriter(FileMatch)

        Console.SetIn(stream)
        Console.SetOut(filetosplitstream)


        Dim i As Integer ' Loop counter
        Dim bMatch As Boolean
        Dim Line As String

        ' Indicates if we found a match
        While stream.Peek() > -1
            Line = Console.ReadLine()
            bMatch = False
            ' Loop xml file  
            For i = 0 To Names.GetUpperBound(0)
                ' Check for a match   
                If MatchString(Line, Names(i), CountStart(i), CountEnd(i)) Then
                    ' It's a match. Set the flag and exit inner loop      
                    bMatch = True
                    Exit For
                End If
            Next i
            If bMatch Then
                ' We had a match so write the line to the output stream    
                Console.WriteLine(Line)
            End If
        End While

        stream.Close()
        filetosplitstream.Close()

        

       
    End Sub




End Module

Thanks a million

The code works perfectly well. The only thing it doesn't do is pick up the first and the last String in the array, and I don't understand why. The positions and values specified are correct, but the code still passes them by and doesn't write them to the file.

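Actually, I think I made a few "off-by-one" errors in the MatchString function. Here's a fixed function. In this version EndPos is exclusive, i.e. the matched text runs from StartPos up to but not including EndPos.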

Public Function MatchString(ByVal OneLine As String, ByVal StringToMatch As String, ByVal StartPos As Integer, ByVal EndPos As Integer) As Boolean
  If String.IsNullOrEmpty(StringToMatch) OrElse String.IsNullOrEmpty(OneLine) Then
    Return False
  End If
  If StartPos < 0 OrElse StartPos >= EndPos Then
    Return False
  End If
  If StringToMatch.Length <> (EndPos - StartPos) Then
    Return False
  End If
  If StartPos >= OneLine.Length OrElse EndPos > OneLine.Length Then
    Return False
  End If
  If String.Compare(OneLine.Substring(StartPos, (EndPos - StartPos)), StringToMatch, True) = 0 Then
    Return True
  Else
    ' No match    
    Return False
  End If

End Function

One other point: you have to add exception handling, at least for the file operations. Here's an example (taken from Sub Main)

' Add to ImportXML exception handler (file not found etc.)
' See the lines below
ImportXML("D:\test.xml", Names, CountStart, CountEnd)

Dim FileToSplit As FileStream
Try
  FileToSplit = New FileStream("D:\FileToSplit.txt", FileMode.Open)
Catch ex As Exception
  ' Couldn't open the input file. Notify user and/or exit!
  Exit Sub
End Try
Dim stream As New StreamReader(FileToSplit)
stream.BaseStream.Seek(0, SeekOrigin.Begin)

Dim FileMatch As FileStream
Try
  FileMatch = New FileStream("D:\FileMatch.txt", FileMode.OpenOrCreate)
Catch ex As Exception
  ' Couldn't open the output file. Notify user and/or exit!
  Exit Sub
End Try

Otherwise you'll have a very easily crashing application :)

If you still have the same problem, show the actual data you're using. Or at least some parts of it so I can use it for testing if needed.

N.B. I changed drive letters C -> D, you have to change them back...

Hi Teme64,
Thanks so much, it works perfectly now. The old version of the match function is OK; I needed to adjust my XML file, my fault.
Thank you very much for the great help and explanation.
I will mark the thread as solved.
Thanks again
