0

Guys, I need help extracting data from this web page: http://hidemyass.com/proxy-list/
It's mainly the IP address and port I'm after, but I have no idea where to start. I know to start out with this:
Dim elements As HtmlElementCollection = Me.botBrowser.Document.All
but I don't know how I would traverse the source code to find the IP address and port.
Also, if I just wanted the first one on the page each time the page refreshed, how would I do that?

4 Contributors, 7 Replies, 12 Views, 5 Years Discussion Span, Last Post by fiaworkz

0

Hi,

I'd use one of the many free screen scraper tools out there and save myself the bother. You can get them to export to CSV or XML and then read the file in.
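
If you go the export route, reading the file back in is short. A minimal sketch, assuming the tool writes one "ip,port" pair per line with no header row and no quoted fields; the module name, the function names and the file layout are assumptions for illustration, not something any particular scraper tool guarantees:

```vbnet
Imports System.Collections.Generic
Imports System.IO

Module ProxyCsvSketch
    ' Hypothetical helper: turns lines like "1.2.3.4,8080" into "1.2.3.4:8080".
    ' Assumes one "ip,port" pair per line, no header row and no quoted fields.
    Public Function ParseCsvLines(ByVal lines As IEnumerable(Of String)) As List(Of String)
        Dim result As New List(Of String)
        For Each line As String In lines
            Dim parts() As String = line.Split(","c)
            ' Skip blank or malformed lines instead of throwing.
            If parts.Length < 2 Then Continue For
            Dim ip As String = parts(0).Trim()
            Dim port As String = parts(1).Trim()
            If ip.Length > 0 AndAlso port.Length > 0 Then
                result.Add(ip & ":" & port)
            End If
        Next
        Return result
    End Function

    ' Reads the exported file from disk; the path is wherever your tool saved it.
    Public Function LoadCsv(ByVal path As String) As List(Of String)
        Return ParseCsvLines(File.ReadAllLines(path))
    End Function
End Module
```

The returned strings use the same "ip:port" shape as a ListBox entry, so they can be added to a ListBox directly.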

1

See if this helps.
1 Button, 1 ListBox

Public Class Form1

    Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click
        getHTML("http://hidemyass.com/proxy-list/")
    End Sub

    Private myWebResponse As Net.HttpWebResponse
    Private myStream As IO.Stream
    Private myReader As IO.StreamReader

    Private Sub getHTML(ByVal siteURL As String)
        Me.Cursor = Cursors.WaitCursor
        Try
            myWebResponse = CType(Net.HttpWebRequest.Create(siteURL).GetResponse, Net.HttpWebResponse)
            myStream = myWebResponse.GetResponseStream()
            myReader = New IO.StreamReader(myStream)
            extractHTML(myReader.ReadToEnd, ListBox1)
            myReader.Close()
            myStream.Close()
            myWebResponse.Close()
        Catch ex As Exception
            MsgBox("There was a connection problem.", MsgBoxStyle.Critical)
        End Try
        Me.Cursor = Cursors.Default
    End Sub

    Private iSi, iEi As Integer, arTemp(), sTemp, sItemToAddToListBox As String

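    Private Sub extractHTML(ByVal htmlContent As String, ByVal selListbox As ListBox)
        selListbox.Items.Clear() '// remove the entries from the previous run.
        With htmlContent
            iSi = .IndexOf("<td>IP address</td>") '// start of the proxy table header.
            If iSi = -1 Then
                MsgBox("Could not find the proxy table; the page layout may have changed.", MsgBoxStyle.Exclamation)
                Exit Sub
            End If
            iEi = .IndexOf("</table>", iSi) '// end of the proxy table.
            If iEi = -1 Then iEi = .Length
            arTemp = .Substring(iSi, iEi - iSi).Split("/"c)
        End With
        sTemp = "<td><span>" '// markup that precedes each IP address.
        For i As Integer = 0 To arTemp.Length - 3 '// stop early so arTemp(i + 2) stays in range.
            With arTemp(i)
                If .ToLower.Contains(sTemp) Then
                    sItemToAddToListBox = .Substring(.IndexOf(sTemp) + sTemp.Length).Replace("<", "")
                    '// the port is two pieces further on; note that .IndexOf here still refers to arTemp(i).
                    sItemToAddToListBox &= ":" & arTemp(i + 2).Substring(.IndexOf("<td>") + 5).Replace("<", "")
                    selListbox.Items.Add(sItemToAddToListBox)
                End If
            End With
        Next
        MsgBox("done")
    End Sub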
End Class

Edited by codeorder: n/a

Votes + Comments
Amazingly Helpful
0

Ty so much. I'm having it refresh with a timer every 60 seconds. Is there a way to completely erase everything in the list box and refresh it with the new stuff it picked up?
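
On the refresh question: the extractHTML routine posted above already calls selListbox.Items.Clear() before it adds anything, so calling getHTML again from the timer should replace the old entries rather than append to them. A minimal sketch of the clear-and-refill step on its own, assuming a WinForms project; the module name and RefreshList are hypothetical names used only for illustration:

```vbnet
Imports System.Collections.Generic
Imports System.Windows.Forms

Module ListRefreshSketch
    ' Hypothetical helper: wipes the ListBox and fills it with the new entries.
    Public Sub RefreshList(ByVal box As ListBox, ByVal entries As IEnumerable(Of String))
        box.BeginUpdate()            ' suspend repainting while the items change
        Try
            box.Items.Clear()        ' remove everything from the previous refresh
            For Each entry As String In entries
                box.Items.Add(entry)
            Next
        Finally
            box.EndUpdate()          ' resume repainting even if something failed
        End Try
    End Sub
End Module
```

With a System.Windows.Forms.Timer whose Interval is set to 60000 (milliseconds), the Tick handler would only need to call getHTML("http://hidemyass.com/proxy-list/") again.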

0

codeorder, can you help me get the IP address and port into variables? I'm having trouble traversing the item list.

0

Since "arTemp" is already declared in my previous code, use this.

    Private Sub ListBox1_SelectedIndexChanged(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles ListBox1.SelectedIndexChanged
        With ListBox1
            If Not .SelectedIndex = -1 Then
                arTemp = .Items(.SelectedIndex).ToString.Split(":"c) '// .Split the item into a 2-element array.
                MsgBox(arTemp(0)) '// IP.
                MsgBox(arTemp(1)) '// Port.
            End If
        End With
    End Sub
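
If you want the two halves in typed variables rather than message boxes, the split can be wrapped in a small validating function. A minimal sketch, assuming every item has the "ip:port" shape the ListBox entries use and that the addresses are IPv4; the module name and TryParseProxy are hypothetical names, not part of the code above:

```vbnet
Imports System.Net

Module ProxyParseSketch
    ' Hypothetical helper: splits "ip:port" and validates both halves.
    ' Returns False instead of throwing when the text is not in that shape.
    ' IPv6 addresses contain colons themselves, so they are rejected here.
    Public Function TryParseProxy(ByVal item As String, ByRef ip As IPAddress, ByRef port As Integer) As Boolean
        ip = Nothing
        port = 0
        If String.IsNullOrEmpty(item) Then Return False
        Dim parts() As String = item.Split(":"c)
        If parts.Length <> 2 Then Return False
        If Not IPAddress.TryParse(parts(0).Trim(), ip) Then Return False
        If Not Integer.TryParse(parts(1).Trim(), port) Then Return False
        ' Only accept port numbers in the valid TCP range.
        Return port >= 1 AndAlso port <= 65535
    End Function
End Module
```

For just the first entry on the page, check that ListBox1.Items.Count > 0 and then pass ListBox1.Items(0).ToString() to the function.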
-1

It seems this no longer works, because HMA recently changed its page source.
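
One way to make the extraction less dependent on the exact markup is to strip the tags and search the remaining text for an IPv4 address followed by a port. A minimal sketch, assuming each address and its port still appear in the page source in that order, separated only by tags or whitespace; if the site obfuscates the values (with hidden elements or script, for example), this will not work either. The module name and ExtractProxies are hypothetical names:

```vbnet
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module ProxyRegexSketch
    ' Hypothetical helper: returns "ip:port" strings found in the given HTML.
    Public Function ExtractProxies(ByVal html As String) As List(Of String)
        Dim result As New List(Of String)
        ' Replace every tag with a space so only the text between tags remains.
        Dim text As String = Regex.Replace(html, "<[^>]+>", " ")
        ' Four dot-separated groups of 1-3 digits, whitespace, then a 1-5 digit port.
        Dim pattern As String = "\b(\d{1,3}(?:\.\d{1,3}){3})\s+(\d{1,5})\b"
        For Each m As Match In Regex.Matches(text, pattern)
            result.Add(m.Groups(1).Value & ":" & m.Groups(2).Value)
        Next
        Return result
    End Function
End Module
```

The pattern does not check that each group is 255 or less, so the results may still need validating before use.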


Edited by fiaworkz: typo

Votes + Comments
Next time, start a new thread and link me to it. Reason: a solution was already provided.