getting source of webpage

Question

sfrider0 6 Junior Poster

16 Years Ago

I've tried a couple of ways of getting the source of a page, but none of them returns the whole thing. I'm trying to get the whole source, the same as what you see when you view it in a web browser. Mine returns only parts of it and leaves some sections out. Any ideas?

web-browser

All 6 Replies

Stinomus 11 Junior Poster

16 Years Ago

Well, what have you done thus far?

Stinomus 11 Junior Poster

16 Years Ago

Try using:

System.Net.WebClient client = new System.Net.WebClient();
client.DownloadFile(source, target);

where source is the webpage address and target is the local file.

Obviously that will download the page to a file rather than return it as a string, but it certainly does so in its entirety. I'm not sure how DownloadString behaves on large webpages.
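
For comparison, here is a minimal sketch of both calls side by side. The class and method names (PageDownloader, GetSource, SaveSource) are made up for illustration, and setting the encoding to UTF-8 is an assumption about the page being fetched, not something stated in this thread.

```csharp
using System;
using System.Net;
using System.Text;

static class PageDownloader
{
    // Returns the response body as a string.
    public static string GetSource(string url)
    {
        using (WebClient client = new WebClient())
        {
            // Assumption: the page is served as UTF-8.
            client.Encoding = Encoding.UTF8;
            return client.DownloadString(url);
        }
    }

    // Saves the response body to a local file instead of returning it.
    public static void SaveSource(string url, string target)
    {
        using (WebClient client = new WebClient())
        {
            client.DownloadFile(url, target);
        }
    }
}
```

Both methods fetch the same bytes from the server; the only difference is where the result ends up.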

sfrider0 6 Junior Poster · Answer 1 · 2009-05-27T11:32:52+00:00

I got the same result with both of these methods.

string getPageSource(string URL)
{
    // 'using' disposes the WebClient even if DownloadString throws.
    using (System.Net.WebClient webClient = new System.Net.WebClient())
    {
        return webClient.DownloadString(URL);
    }
}

and

using System;
using System.IO;
using System.Net;
using System.Text;

class WebFetch
{
	static void Main(string[] args)
	{
		StringBuilder sb  = new StringBuilder();

		byte[]        buf = new byte[8192];

		HttpWebRequest  request  = (HttpWebRequest)
			WebRequest.Create("http://www.mayosoftware.com");

		HttpWebResponse response = (HttpWebResponse)
			request.GetResponse();

		Stream resStream = response.GetResponseStream();

		string tempString = null;
		int    count      = 0;

		do
		{
			count = resStream.Read(buf, 0, buf.Length);

			if (count != 0)
			{
				// Note: Encoding.ASCII replaces any non-ASCII byte with '?'.
				tempString = Encoding.ASCII.GetString(buf, 0, count);

				sb.Append(tempString);
			}
		}
		while (count > 0); // any more data to read?

		Console.WriteLine(sb.ToString());
	}
}

sfrider0 6 Junior Poster · Answer 2 · 2009-05-27T12:50:24+00:00

Would the local file be a txt file? Sorry, I'm pretty new to downloading files.

Stinomus 11 Junior Poster · Answer 3 · 2009-05-28T05:03:38+00:00

Well, it depends on what type of file you have selected to download. Assuming it is a webpage, then yes, it will just be a text file.

sfrider0 6 Junior Poster · Answer 4 · 2009-05-28T07:39:40+00:00

Thanks. I'm just trying to pull all the links from a website. It doesn't seem to get the links that are in JavaScript. The links show up in the original source that the browser gives me, but that whole section is left out when I try to fetch the page myself. I'm trying to do this in an ASP website. I seem to be able to get the links in regular C# if I use this:

foreach (HtmlElement link in webBrowser1.Document.Links)
{
    // GetAttribute already returns a string, so no ToString() is needed.
    string linkItem = link.GetAttribute("HREF");
}

HtmlElement is in System.Windows.Forms, which isn't part of ASP.NET. Is this even going to be possible?
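
One approach that does not depend on System.Windows.Forms is to scan the downloaded source text for href attributes. The following is a rough sketch only, assuming a regular expression is good enough for the pages involved. It is not a full HTML parser, and it cannot see links that JavaScript builds at run time. LinkExtractor and ExtractLinks are names invented for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class LinkExtractor
{
    // Matches href="..." or href='...' (any letter case) and captures the value.
    private static readonly Regex HrefPattern = new Regex(
        "href\\s*=\\s*[\"']([^\"']+)[\"']",
        RegexOptions.IgnoreCase);

    // Returns every quoted href value found in the given source text, in order.
    public static List<string> ExtractLinks(string html)
    {
        List<string> links = new List<string>();
        foreach (Match m in HrefPattern.Matches(html))
        {
            links.Add(m.Groups[1].Value);
        }
        return links;
    }
}
```

The string returned by getPageSource above could be passed straight to ExtractLinks. Because the pattern runs over the raw text, it only finds links that are actually present in what the server sent back.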
