
Okay, so a little while ago I decided I wanted to write a program that would read in data from a website (the HTML data) and then use it from there (long story short, I want to read in a list of URLs, then access each of those and gather data from them).

So I used some code I copied from the web and tweaked it:

// Requires: using System.IO; using System.Net; using System.Text;
public void gatherWebsiteDate (string input)
{
      string urlAddress = input;
      HttpWebRequest request = (HttpWebRequest)WebRequest.Create(urlAddress);

      // Dispose the response when done so the connection is released
      using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
      {
        if (response.StatusCode == HttpStatusCode.OK)
        {
          Stream receiveStream = response.GetResponseStream();
          StreamReader readStream = null;

          // Use the default encoding when the server does not declare a character set
          if (string.IsNullOrEmpty(response.CharacterSet))
            readStream = new StreamReader(receiveStream);
          else
            readStream = new StreamReader(receiveStream, Encoding.GetEncoding(response.CharacterSet));

          string data = readStream.ReadToEnd();
          readStream.Close();

          richTextBox1.Text = data;
        }
      }
}


Then I ran my code, and while it did read the HTML data of the page, it refused to read the data I want.

Here's a page I would read the data from

Once on this page you see there is a list of members; by clicking on a name you can go to that member's profile page and see data about the player (in this case I wanted to gather a clan's data and what tanks each member has used, so I would use the collection of URLs to view their profile pages and collect this data).

The data I am looking for just won't show up.

The HTML code looks like this

<tbody id="member_table_container">
                      <tr class="js-template js-hidden">
                            <td class="number t-number"></td>
                            <td class="name t-name b-user"></td>
                            <td class="role js-role"></td>
                            <td class="member_since"></td>
                      </tr>
                      <tr class="odd clan-role-recruit">
                            <td class="number t-number">1</td>
                            <td class="name t-name b-user js-rendered-template"><a href="/uc/accounts/1001157039-afbrad/">afbrad</a></td>
                            <td class="role js-role js-rendered-template">Recruit</td>
                            <td class="member_since js-rendered-template">16.08.2011</td>
                      </tr>

but all I get is

<tbody id="member_table_container">
                      <tr class="js-template js-hidden">
                            <td class="number t-number"></td>
                            <td class="name t-name b-user"></td>
                            <td class="role js-role"></td>
                            <td class="member_since"></td>
                      </tr>

For some reason it doesn't copy over any of the users' data; it just doesn't see it. I have no clue why it's doing this, but this is the data I need the program to read. And no, it's not hidden: if I log out I can still view all the clan data fine, so it's not limited by login credentials.

Hopefully this makes sense, I kind of am having trouble finding my words today, but hopefully someone can help me out here

It appears that the markup you are looking for is generated by the JavaScript in the page, so it isn't part of the HTML source the server sends. When I use View Source in Firefox, it only shows the bottom HTML you have, nothing about each of the players.

See, when I view it in Firefox I see all the code (well, I view it with Firebug). If this is indeed the case, is there a way I can still acquire the data?

Edit: I was also able to view all the data in my IE browser (when I hit F12).

Anyone? I really can't seem to figure this one out and would really like to get this program to work.

If it's any help, I usually tend to use the WebBrowser control for these things, and use the "DocumentCompleted" event handler to read the "Document" property of the WebBrowser.
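A minimal sketch of that approach, assuming a Windows Forms project. The class name ScrapeForm and the helper FormatMember are made up for illustration; only the element id "member_table_container" comes from the HTML posted above. Keep in mind that DocumentCompleted may fire before the page's scripts have finished filling in the table, so the member rows can still be missing on the first call.

```csharp
using System;
using System.Diagnostics;
using System.Windows.Forms;

// Hypothetical form that hosts a WebBrowser and lists the member links it finds.
public class ScrapeForm : Form
{
    private readonly WebBrowser browser = new WebBrowser();

    public ScrapeForm(string url)
    {
        browser.Dock = DockStyle.Fill;
        browser.ScriptErrorsSuppressed = true;
        browser.DocumentCompleted += Browser_DocumentCompleted;
        Controls.Add(browser);
        browser.Navigate(url);
    }

    // Formats one member link for display.
    public static string FormatMember(string name, string href)
    {
        return name + " -> " + href;
    }

    private void Browser_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
    {
        // The event also fires for each frame; wait for the top-level page.
        if (e.Url != browser.Url) return;

        HtmlElement table = browser.Document.GetElementById("member_table_container");
        if (table == null) return;

        // Each rendered row contains one <a> pointing at the member's profile.
        foreach (HtmlElement link in table.GetElementsByTagName("a"))
        {
            Debug.WriteLine(FormatMember(link.InnerText, link.GetAttribute("href")));
        }
    }
}
```

To try it, call Application.Run(new ScrapeForm(url)) from a Main method marked [STAThread]; the output goes to the debugger's Output window.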

Well good news and bad news ...

Good news: this does work, or at least I believe it did. The bad news is the reason I say "I believe" ...

I got this lovely message "Sorry, this browser does not support this site."

Anyone got any idea? Any way to trick the system? Anything I am overlooking? I could really use some help. I thought I was making some progress, but I guess not.

Okay, so the user agent could be a possibility ... but how do I implement that in WebBrowser? Remember, using HttpWebRequest doesn't allow me to read the generated code (it only reads the source ... unless I am overlooking something).
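One way to send a custom user agent from the WebBrowser control is the Navigate overload that accepts additional headers. A minimal sketch, assuming a WebBrowser instance is passed in; the class and method names are made up for illustration, and any user-agent string you pass is a placeholder, not a value known to work for this site. The header applies only to that one navigation request, and a site that inspects the browser from JavaScript may still reject the control.

```csharp
using System;
using System.Windows.Forms;

// Hypothetical helper for navigating a WebBrowser with a custom User-Agent.
public static class UserAgentNavigation
{
    // Builds the raw header line that WebBrowser.Navigate expects.
    public static string BuildUserAgentHeader(string userAgent)
    {
        return "User-Agent: " + userAgent + "\r\n";
    }

    // Navigates to the URL, sending the custom User-Agent on the initial request.
    public static void NavigateAs(WebBrowser browser, string url, string userAgent)
    {
        // Arguments: URL, target frame (none), POST data (none), extra headers.
        browser.Navigate(url, null, null, BuildUserAgentHeader(userAgent));
    }
}
```

For example, UserAgentNavigation.NavigateAs(webBrowser1, url, userAgent) would replace a plain webBrowser1.Navigate(url) call.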

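You can read the HTML source easily using the following method (note that WebClient, like HttpWebRequest, only returns what the server sends, not markup generated later by JavaScript):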

using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Drawing;
using System.Linq;
using System.Text;
using System.Windows.Forms;
//Add using System.Net;
using System.Net;

namespace WindowsFormsApplication1
{
    public partial class Form1 : Form
    {
        public Form1()
        {
            InitializeComponent();
        }

        private void Form1_Load(object sender, EventArgs e)
        {
            //Creates the WebClient
            WebClient client = new WebClient();
            //Downloads the HTML (put the page URL between the quotes)
            String htmlCode = client.DownloadString("");
            //Makes the TextBox Multiline and Makes the Vertical Scroll Bar Active
            textBox1.ScrollBars = ScrollBars.Vertical;
            textBox1.Multiline = true;
            //Makes the TextBox Dock to the Form
            textBox1.Dock = DockStyle.Fill;
            //Displays the Text in a TextBox
            textBox1.Text = htmlCode;
        }
    }
}

Hope this helps and happy coding!
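If the server turns away requests that lack a browser-like user agent, WebClient can send one as a request header. A minimal sketch; the class and method names are made up for illustration, and the user-agent string is whatever you choose to pass in. This still downloads only the page source, so rows filled in by JavaScript will not appear.

```csharp
using System;
using System.Net;

// Hypothetical helper that downloads page source with a custom User-Agent.
public static class SourceDownloader
{
    // Creates a WebClient that sends the given User-Agent header.
    public static WebClient CreateClient(string userAgent)
    {
        WebClient client = new WebClient();
        client.Headers.Add("User-Agent", userAgent);
        return client;
    }

    // Downloads the raw HTML source of the page at the given URL.
    public static string DownloadSource(string url, string userAgent)
    {
        using (WebClient client = CreateClient(userAgent))
        {
            return client.DownloadString(url);
        }
    }
}
```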

I think there are plenty of solutions here. I hope the thread owner closes this thread soon or tells us about any remaining problem.

He helped me through this problem a lot (and since it will be a while before I can get back to it, I decided to mark it solved and give this guy some kudos for it).

I will close it for now, actually, as I do believe I have enough to work with.

Pseudorandom21, I do think your UserAgent suggestion might be the magical piece of code I needed (lol, bad choice of words I know; I am tired, so my humor is dry).

I need to contact someone about writing the actual string for the User Agent part.

Thanks for all the help too.
