Hello, I'm sorry if this is the wrong place to post this; I couldn't find a support section.
Anyway, I'm new to VB.NET and honestly, I'm lost. I need to extract text from an HTML page. Here's the line I want to extract from:
<p><a href="/video7419001/0/CoD4">CoD4</a></p>
What I want to extract from it is this:
<p><a href="/video(This here)/0/(This here)">CoD4</a></p>
All I want to do is put it into two listboxes. Could I do this without using Regex? I've had a look at Regex and it's mind-blowing with these "wild cards"; I honestly don't know where to start lol.
Thanks for the advice :)

Edited by eatyourgreens: typo


If the HTML always has the same format, this should also be possible with some string manipulation. Have a look at the String class on MSDN; the Split and Substring methods are the first that come to mind. Good luck!
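As a rough sketch of the Split/Substring idea (not a tested drop-in for your page): this assumes every line contains href="/video<digits>/<something>/<name>" exactly like the sample in the question. The module and function names here (SplitExample, ParseHref) are made up for illustration.

```vbnet
Imports System

Module SplitExample
    ' Returns {videoNumber, name} for an href such as "/video7419001/0/CoD4",
    ' or Nothing when the text does not have that shape (assumed format).
    Function ParseHref(ByVal href As String) As String()
        ' "/video7419001/0/CoD4".Split("/"c) gives {"", "video7419001", "0", "CoD4"}
        Dim parts As String() = href.Split("/"c)
        If parts.Length <> 4 OrElse Not parts(1).StartsWith("video") Then
            Return Nothing
        End If

        ' Drop the leading "video" so only the digits remain.
        Dim number As String = parts(1).Substring("video".Length)
        Return New String() {number, parts(3)}
    End Function

    Sub Main()
        Dim line As String = "<p><a href=""/video7419001/0/CoD4"">CoD4</a></p>"

        ' Cut out the text between href=" and the next double quote.
        Dim marker As String = "href="""
        Dim markerIndex As Integer = line.IndexOf(marker)
        If markerIndex < 0 Then Return

        Dim startIndex As Integer = markerIndex + marker.Length
        Dim endIndex As Integer = line.IndexOf(""""c, startIndex)
        If endIndex < 0 Then Return

        Dim href As String = line.Substring(startIndex, endIndex - startIndex)

        Dim result As String() = ParseHref(href)
        If result IsNot Nothing Then
            Console.WriteLine(result(0) & " " & result(1)) ' prints: 7419001 CoD4
        End If
    End Sub
End Module
```

In a form you would add result(0) to one listbox and result(1) to the other instead of writing to the console. If the page format ever changes, ParseHref just returns Nothing rather than throwing.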


This might not be what you're looking for, but I would suggest regexes.

1) They're fast.
2) They're very powerful. They're the powerhouse of text manipulation, and make even things like bioinformatics easy.

Just take your time learning regexes. They're not too hard, just don't get scared by them.

That being said, yes, there are ways around it. You can implement your own string searching algorithm (this is quite a bit of work, mind you), perhaps something like Knuth-Morris-Pratt (KMP).
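To give an idea of what "your own string searching algorithm" would involve, here is a minimal KMP sketch in VB.NET. The names (KmpExample, BuildTable, KmpIndexOf) are made up for this example. It only finds where a fixed piece of text such as "/video" starts; you would still need Substring afterwards to cut out the digits and the name.

```vbnet
Imports System

Module KmpExample
    ' Builds the KMP failure table: table(i) is the length of the longest
    ' proper prefix of pattern(0..i) that is also a suffix of it.
    Function BuildTable(ByVal pattern As String) As Integer()
        Dim table(pattern.Length - 1) As Integer
        Dim k As Integer = 0
        For i As Integer = 1 To pattern.Length - 1
            While k > 0 AndAlso pattern(i) <> pattern(k)
                k = table(k - 1)
            End While
            If pattern(i) = pattern(k) Then k += 1
            table(i) = k
        Next
        Return table
    End Function

    ' Returns the index of the first occurrence of pattern in text, or -1.
    Function KmpIndexOf(ByVal text As String, ByVal pattern As String) As Integer
        If pattern.Length = 0 Then Return 0

        Dim table As Integer() = BuildTable(pattern)
        Dim k As Integer = 0 ' number of pattern characters matched so far
        For i As Integer = 0 To text.Length - 1
            ' On a mismatch, fall back in the pattern without re-reading text.
            While k > 0 AndAlso text(i) <> pattern(k)
                k = table(k - 1)
            End While
            If text(i) = pattern(k) Then k += 1
            If k = pattern.Length Then Return i - k + 1
        Next
        Return -1
    End Function

    Sub Main()
        Dim line As String = "<p><a href=""/video7419001/0/CoD4"">CoD4</a></p>"
        ' Should give the same index as line.IndexOf("/video").
        Console.WriteLine(KmpIndexOf(line, "/video"))
    End Sub
End Module
```

In practice the built-in String.IndexOf already does this job, so writing KMP by hand is mostly a learning exercise.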


Thanks for the replies, they helped a lot. I finally got all the stuff I want into a list box. This is what I've got: /video7419001/0/CoD4. Now to the next bit: is it possible to extract the digits from /video(these digits here)/0/CoD4 into another listbox? They all range from 1000 to 99999999.
Thanks :)


You can use another regex to read the first number. It's "[0-9]+" in Perl, so probably something similar in VB.NET.
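For what it's worth, the same pattern works unchanged with the .NET Regex class. A small sketch, assuming the input is one of the /video.../0/... strings from the listbox (FirstNumberExample and FirstNumber are made-up names):

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module FirstNumberExample
    ' Returns the first run of digits in the text, or an empty string if there is none.
    Function FirstNumber(ByVal text As String) As String
        Dim m As Match = Regex.Match(text, "[0-9]+")
        If m.Success Then
            Return m.Value
        End If
        Return String.Empty
    End Function

    Sub Main()
        ' Regex.Match returns the first match, which here is the video number.
        Console.WriteLine(FirstNumber("/video7419001/0/CoD4")) ' prints: 7419001
    End Sub
End Module
```

Since the numbers are said to range from 1000 to 99999999, a stricter pattern such as "[0-9]{4,8}" would also be an option, but the plain "[0-9]+" is enough for strings shaped like the sample.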


You can do the following:
Add Imports System.Text.RegularExpressions at the top of the file that contains the function below.

Create a class named VideoInfo in a file called "VideoInfo.vb".


Public Class VideoInfo
    Public Property videoNumber As String
    Public Property name As String

    Public Sub New()

    End Sub

    Public Sub New(ByVal videoNumber As String, ByVal name As String)
        Me.videoNumber = videoNumber
        Me.name = name
    End Sub

End Class

We will use the above class inside the following function.

    Private Function extractData(ByVal myData As String) As List(Of VideoInfo)

        'create new list for video info
        Dim videoInfoList As New List(Of VideoInfo)

        'pattern that we want to find
        'use named groups
        Dim pattern As String = "/video(?<vidNumber>[0-9]+)/[0-9]?/(?<vidName>[A-Za-z0-9]+)"

        'look for a match
        Dim myMatch As Match = Regex.Match(myData, pattern)

        'keep looking for matches until no more are found
        While myMatch.Success

            'get groups
            Dim myGroups As GroupCollection = myMatch.Groups

            'store extracted info in an instance of VideoInfo
            Dim myVideoInfo As New VideoInfo
            myVideoInfo.videoNumber = myMatch.Groups("vidNumber").Value
            myVideoInfo.name = myMatch.Groups("vidName").Value

            'add video info to the list
            videoInfoList.Add(myVideoInfo)

            'get next match
            myMatch = myMatch.NextMatch()
        End While

        Return videoInfoList
    End Function

To use it:

        Dim myData As String = String.Empty
        Dim myVideoInfo As List(Of VideoInfo)
        Dim output As String = String.Empty

        myData += "<p><a href=""/video7419004/0/CoD4"">CoD4</a></p>"
        myData += "<p><a href=""/video7419005/0/CoD5"">CoD5</a></p>"
        myData += "<p><a href=""/video7419006/0/CoD6"">CoD6</a></p>"

        myVideoInfo = extractData(myData)

        For Each video In myVideoInfo
            output += "video: " & video.videoNumber & " " & video.name
            output += System.Environment.NewLine
        Next

        'display for testing purposes (e.g. in the console)
        Console.WriteLine(output)


Regex: Named Capturing Groups in .NET

Regular Expressions in C# – Part 2 – Matches and NextMatch

Edited by cgeier


Why was my post down-voted? It is a tested, working solution. If you disagree with my solution, please provide an explanation.


It wasn't me who downvoted but as a guess, I'd say it was an overly complex solution to a simple problem. In my opinion that didn't justify a downvote so I upvoted to cancel it.

Edited by Reverend Jim


My guess is that it's because you're giving free code away. Kind of like the saying "give a man a fish, feed him for a day; teach a man to fish, feed him for a lifetime."
