Hello VB.net'ers.

I've got a problem. I have a feeling it is an easy one, but for the life of me I can't figure it out.
My project at the moment is to grab content from a website and use it in some way (I haven't worked out this part yet; I'm still trying to get the data).

Currently, I have grabbed the whole website and written it to a text file. No dramas there.
Then I read each line until I find a certain string. Again, no dramas.
My problem is this: after finding that string, I want the program to grab whatever characters follow it until it reads another certain string, and then put those characters into a string for later use.

so for example:
READ WEBSITE...
CREATE TXT FILE...
WRITE TO TXT FILE...
READ EACH LINE IN TXT FILE UNTIL "abc" IS FOUND...
"abc" IS FOUND...
READ THE 3 CHARACTERS AFTER "abc"... << this is the part I'm struggling with
WRITE THESE CHARACTERS TO A STRING...
USE STRING FOR WHATEVER PURPOSE...

Any help is appreciated!! :)

Regards
Gobble45

Do you know anything about Regular expressions (Regex) or substrings?

Module Module1
   Sub Main()
      Dim strThingToFind = "abc"
      Dim strData As String = "there once was an abcklm in the basement"
      Dim intOffset = strData.IndexOf(strThingToFind)
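      If intOffset <> -1 Then
         ' Start just past the end of the match (not a hard-coded 3) and make sure 3 characters remain.
         Dim intStart = intOffset + strThingToFind.Length
         If strData.Length >= intStart + 3 Then
            Console.WriteLine(strData.Substring(intStart, 3))
         End If
      End If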
   End Sub
End Module

Edited 4 Years Ago by thines01: Added example

...better example:

Imports System.Text.RegularExpressions

Module Module1
   Function FindBySubstring(ByVal strThingToFind As String, ByVal strSource As String) As String
      Dim strRetVal As String = ""
      Dim intOffset = strSource.IndexOf(strThingToFind)
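      If intOffset <> -1 Then
         ' Start just past the end of the match (not a hard-coded 3) and make sure 3 characters remain.
         Dim intStart = intOffset + strThingToFind.Length
         If strSource.Length >= intStart + 3 Then
            strRetVal = strSource.Substring(intStart, 3)
         End If
      End If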
      Return strRetVal
   End Function
   Function FindByRegex(ByVal strThingToFind As String, ByVal strSource As String) As String
      Dim strRetVal As String = ""
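      ' Escape the search text so characters such as "." or "(" are matched literally.
      Dim rxNext3 = New Regex(Regex.Escape(strThingToFind) & "(?<next3>.{3})")
      Dim mtcNext3 = rxNext3.Match(strSource)
      If mtcNext3.Success Then
         strRetVal = mtcNext3.Groups("next3").Value
      End If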
      Return strRetVal
   End Function
   Sub Main()
      Dim strThingToFind = "abc"
      Dim strData As String = "there once was an abcklm in the basement"
      Console.WriteLine(FindByRegex(strThingToFind, strData))
      Console.WriteLine(FindBySubstring(strThingToFind, strData))
   End Sub
End Module

Okay, that's excellent. I will certainly try these as soon as I can, but in the meantime, what if I wanted to find all the characters up until a certain point,
given that it could be anywhere between 1 and 4 characters?

All you would need to do at that point would be to reverse the math.
Find the "abc" offset, then take the substring from 0 to that offset.
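
If what is wanted is instead the text after "abc" up to some second marker (so the length can vary from 1 to 4 characters), the same IndexOf/Substring idea can be applied twice. The following is only a rough sketch: the function name GetBetween and the ";" end marker are made up for illustration and are not from the posts above.

```vbnet
Imports System

Module BetweenMarkersExample
   ' Hypothetical helper: returns the text between strStart and the next strEnd,
   ' or "" if either marker is missing.
   Function GetBetween(ByVal strSource As String, ByVal strStart As String, ByVal strEnd As String) As String
      Dim intStart As Integer = strSource.IndexOf(strStart, StringComparison.Ordinal)
      If intStart = -1 Then Return ""
      intStart += strStart.Length ' skip past the start marker itself

      ' Look for the end marker only after the start marker.
      Dim intEnd As Integer = strSource.IndexOf(strEnd, intStart, StringComparison.Ordinal)
      If intEnd = -1 Then Return ""

      Return strSource.Substring(intStart, intEnd - intStart)
   End Function

   Sub Main()
      ' Sample data is invented; ";" stands in for whatever ends the value.
      Console.WriteLine(GetBetween("xyz abc1234;rest", "abc", ";")) ' prints 1234
   End Sub
End Module
```

Because the end position is searched for rather than assumed, it should not matter whether 1 or 4 characters sit between the two markers.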

I'm still struggling with this one quite a bit.
An extract of the string I'm trying to find characters in is:

<a href="http://www.cowboys.com.au/" target="_blank"><img src="/portals/nrl/images/0/8_17_1.png" width="18" height="18" alt="" /></a></span><a href="http://www.cowboys.com.au/" target="_blank">Cowboys</a></div><div>0</div><div>0</div><div>0</div><div>0</div><div>0</div><div>0</div><div>0</div><div>0</div><div>0</div><div>0-0</div><div>0-0</div><div>0</div><div>0</div><div>0</div><div>0</div><div>0</div><div>0</div><div>0</div></div>

This data is pulled from a website into a text file, and then I read from the text file. Over time, these 0s are going to change, based on how the different teams in the NRL play.
I've tried many substring combinations, but can never work out how to grab JUST the digit(s).

Have you checked whether you can get the data via a web service in XML? It would be far simpler and more robust to process it that way.

Yes, this was my first idea, but I cannot find it anywhere on the site.
Site is here
I had a thought: read through the text file until I find a certain point, delete everything that precedes it, and then use substrings to work out the content. But as my Visual Basic Express has just needed a reinstall, I can't test until I get it going again.

Use this to get html.content.

And extract data you need with this.

Dim sHtmlContent = "READ THE 3 CHARACTERS AFTER ""abc""... << this is the part im struggling at"
Dim sToFind As String = "abc"
With sHtmlContent
    If .Contains(sToFind) Then
        sHtmlContent = .Substring(.IndexOf(sToFind) + sToFind.Length)
        '  sHtmlContent = .Substring(.IndexOf(sToFind) + sToFind.Length, 9) '// 9 for length of data you need to extract.
        MsgBox(sHtmlContent)
    End If
End With

Hello,
Try this to get the 3 characters after abc:

Dim FStr As String
' Take "abc" plus the 3 characters after it (assumes "abc" is present with at least 3 characters following).
FStr = txtFind.Text.Substring(txtFind.Text.IndexOf("abc"), 6)
txt2.Text = FStr.Substring(3, 3)

txtFind is a textbox containing the text in which I want to find the 3 characters after abc, and txt2 is a textbox which will show those 3 characters.
Hope this helps.

Regards

You can tell by the responses that knowing this was from an HTML source was critical information.
Is there a page that can be accessed directly, or is it only available in disk file form? If it can be read directly from the web, this will change again.

With the current example, I created this snippet (and attached output screenshot):

Imports System
Imports System.IO
Imports System.Text.RegularExpressions

Module Module1
   Function GetArrayOfScores(ByVal strHtmlFile As String) As String()
      Dim strData As String =
         Regex.Replace(File.ReadAllText(strHtmlFile), "\s", "")

      Return _
         strData.Substring(
            strData.IndexOf("<div>")).Replace("div", "") _
               .Split("></".ToCharArray(), StringSplitOptions.RemoveEmptyEntries)
   End Function
   Sub Main()
      Dim arr_strScores = GetArrayOfScores("../../scores.htm")

      For Each s As String In arr_strScores
         Console.WriteLine(s)
      Next
   End Sub ' put a breakpoint here, if pause is needed
End Module
Attachments Output.jpg 13.37 KB
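
As an alternative to splitting on the tag characters, a regular expression can pick out only the <div> elements whose content is a number or an "n-n" pair. This is a sketch under assumptions: the name GetScores and the shortened sample string are made up for illustration, and the pattern assumes plain <div> tags with no attributes, as in the extract posted earlier.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module DivScoresExample
   ' Hypothetical helper: collects the content of every <div> that holds
   ' either digits ("12") or a digit pair ("3-1"). Other divs are ignored.
   Function GetScores(ByVal strHtml As String) As List(Of String)
      Dim lstScores As New List(Of String)
      For Each m As Match In Regex.Matches(strHtml, "<div>\s*(\d+(?:-\d+)?)\s*</div>")
         lstScores.Add(m.Groups(1).Value) ' group 1 is the value without the tags
      Next
      Return lstScores
   End Function

   Sub Main()
      ' Shortened, invented sample in the same shape as the posted extract.
      Dim strSample As String = "<a href=""#"">Cowboys</a></div><div>0</div><div>12</div><div>3-1</div>"
      For Each s As String In GetScores(strSample)
         Console.WriteLine(s)
      Next
   End Sub
End Module
```

The team name is skipped here because it sits in an <a> tag rather than a bare <div>; if the real page uses attributes on its <div> tags, the pattern would need adjusting.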