Proper Way To Refresh WebBrowser (and clear cache)

Question

JOSheaIV 119 C# Addict

11 Years Ago

Hello DaniWebbers,

So I have been hard at work on a new program that reads in a webpage every X seconds. Each time, it detects whether the webpage has changed and, if so, updates a form and lets me know about the change.

Well, I have run into a snag. I recently got a webpage reader class that works perfectly for me ... or so I thought. I have since learned that WebBrowser keeps a cache of recently visited sites, and if it is asked for the same site again it will load it from the cache (if the entry hasn't expired).

One of the webpages I have been testing this code on is causing a problem. The webpage updates with new data, but my WebBrowser keeps reading the old data from its cache.

Here's my code

using System;
using System.Windows.Forms;

namespace ScoreTableDetector_2v2
{
//===================================================================================================================
    class readInWebpage_v3 : IDisposable
    {
        WebBrowser wb;
        bool timerTriggered;
//-------------------------------------------------------------------------------------------------------------------
        public readInWebpage_v3 ()
        {
            wb = new WebBrowser();
            wb.DocumentCompleted += new WebBrowserDocumentCompletedEventHandler(wb_DocumentCompleted);

            timerTriggered = false;
        }
//-------------------------------------------------------------------------------------------------------------------
        public string downloadedData
        {
            get;
            private set;
        }
//-------------------------------------------------------------------------------------------------------------------
        public void readIn (Uri webLink, int secs)
        {
            DateTime timeNow = DateTime.Now;

            wb.Navigate(webLink);
            //wb.Refresh(WebBrowserRefreshOption.Normal); //DOESN'T fix problem

            TimeSpan elapsedTime;
            while (wb.ReadyState != WebBrowserReadyState.Complete)
            {
                elapsedTime = DateTime.Now - timeNow;
                if (elapsedTime.TotalSeconds > secs) //Timeout check; TotalSeconds is the whole elapsed time (Seconds is only the 0-59 component). Application.DoEvents() below seems to be needed for wb_DocumentCompleted to fire
                {
                    timerTriggered = true;
                    break;
                }
                Application.DoEvents();
            }

            if (timerTriggered == false)
            {
                //downloadedData = wb.Document.Body.InnerHtml; //Added this line because the final HTML takes a while to show up
                    //!!!! This seems redundant, so hold off on it

                //Guard against null in case DocumentCompleted has not set downloadedData yet
                if (downloadedData != null && downloadedData.Contains("Navigation to the webpage was canceled")) //The URL led to an invalid webpage
                {
                    downloadedData = "Invalid Webpage";
                }
            }
            else //timed out
            {
                wb.Stop();
                downloadedData = "Timed Out";
            }

            //wb.Dispose();
        }
//-------------------------------------------------------------------------------------------------------------------
        void wb_DocumentCompleted (object sender, WebBrowserDocumentCompletedEventArgs e) //when the webpage has finished loading (read it)
        {
            WebBrowser webBrows = (WebBrowser) sender;
            downloadedData = webBrows.Document.Body.InnerHtml;
        }
//-------------------------------------------------------------------------------------------------------------------
        public void Dispose () //used for disposing items
        {
            if (wb != null)
            {
                wb.Dispose();
                wb = null;
            }
            if (downloadedData != null)
            {
                downloadedData = "";
            }
        }
//-------------------------------------------------------------------------------------------------------------------
    }
//===================================================================================================================
}

Now I read online about the cache WebBrowser has (to be honest, at first I just guessed it did that; what do you know, a lucky guess), and that calling Refresh() is supposed to force the control to re-read the webpage in its current state.

Well, I tried this, and no matter what I do I can't get it to work. At one point I added a cold-start if() statement: the first time the class was called, wb.Navigate() would run, and every time after that wb.Refresh() would run instead. I tried different refresh options too, not just the commented-out one above. But no matter what I tried, Refresh() on its own never raised the DocumentCompleted event (which I rely on).

So what I am trying to figure out is: how can I force my WebBrowser to always fetch fresh data from the webpage and stop relying on the cache?

Oh yeah, this is how I call the class:

readInData = new readInWebpage_v3();
readInData.readIn(webpageURI, refreshTimer);

//does a bunch of stuff with the data

readInData.Dispose();

That's all tucked into a backgroundWorker_ProgressChanged handler (I was hoping that disposing and recreating the WebBrowser would do the trick, but it doesn't). I also know that this works fine on some sites, but it doesn't on the site I care about, and that's the one I need it to work for.

Thanks in advance for any help

All 4 Replies

Momerath 1,327 Nearly a Senior Poster

11 Years Ago

You might want to consider using WebClient or WebRequest/WebResponse instead of WebBrowser. Not sure if they will meet your needs, depends on what you are looking at :)
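
For reference, a minimal sketch of what the WebClient route could look like with caching switched off. The class and method names (NoCacheClient, Create, Fetch) are made up for illustration; WebClient.CachePolicy and RequestCacheLevel.NoCacheNoStore are the standard .NET pieces. Keep in mind that WebClient only returns the HTML the server sends and does not run any scripts on the page.

```csharp
using System;
using System.Net;
using System.Net.Cache;

// Hypothetical helper, not part of the original code in this thread.
static class NoCacheClient
{
    // Builds a WebClient that neither reads from nor writes to the local cache.
    public static WebClient Create()
    {
        var client = new WebClient();
        client.CachePolicy = new RequestCachePolicy(RequestCacheLevel.NoCacheNoStore);
        return client;
    }

    // Downloads the raw HTML of the page as a string (no script execution).
    public static string Fetch(Uri webLink)
    {
        using (WebClient client = Create())
        {
            return client.DownloadString(webLink);
        }
    }
}
```

Calling NoCacheClient.Fetch(webpageURI) would then stand in for the readIn() call, assuming the data you need is already in the initial HTML.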

JOSheaIV 119 C# Addict · Answer 1 · 2012-10-31T23:38:33+00:00

Okay, so I knew I had seen WebClient before: I used it in the first code I built to retrieve data from a webpage. I stopped using it because WebClient downloads the data right away without giving the page time to load (I think the page is running jQuery or something like that ... either way, data is loaded after the initial HTML). And for the love of god I can't find a way to make WebClient wait like I did with the WebBrowser above.

I looked into WebRequest and WebResponse a bit (not much), but from what I have found they have the same issue as WebClient.

I need a way to let the page finish loading before I retrieve its data (this program isn't meant for just one webpage; it should work on any). I tested WebClient with this page:

http://worldoftanks.com/uc/clans/1000000954-SAC/

If you use WebClient on its own, you'll get some data, but if you look at the page there is a clan roster that seems to be generated after load. WebBrowser does read this. So while WebClient should fix the cache issue, it doesn't let the page finish loading ... at least from what I have found.

So where I am right now is either finding a way to make WebClient wait (like the code above does for WebBrowser) or finding a way to clear the cache for WebBrowser.

Mike Askew 131 Veteran Poster Featured Poster · Answer 2 · 2012-11-01T09:39:18+00:00

This will be of help with clearing a WebBrowser's cache.

JOSheaIV 119 C# Addict · Answer 3 · 2012-11-02T01:38:13+00:00

Hey Mikey, I actually saw that the other day when searching the web and have it noted. However, from what I read it clears the entire IE cache, and I'm not sure that's what I want to happen.

However, I did find 2 solutions so far.

The first was to use the following line:

wb.Navigate(webLink + "?refreshToken=" + Guid.NewGuid().ToString());

From what I read, it appends a random value to the URL so that the cache never sees a matching entry. I tried it multiple times last night and it works ... however, I don't entirely understand how it works (I'd welcome some help there).

The other solution, which I might implement in the future, is to use a .NET library for cURL. It exists, and it might serve me better down the road (especially since I also need to find a way to log into webpages programmatically).

I guess I can mark this solved for now, but if anyone sees something off or wrong, please let me know.
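
On why the refreshToken trick works: the browser cache looks entries up by the full URL, query string included. A fresh GUID makes every request URL unique, so no cached entry ever matches and the page is fetched again, while most servers simply ignore a query parameter they don't recognise. One thing to watch is that appending "?refreshToken=..." directly produces a malformed URL if the link already has a query string. A rough sketch of a helper that handles both cases follows; CacheBuster and WithRefreshToken are hypothetical names, not anything from the original code.

```csharp
using System;

// Hypothetical helper for illustration only.
static class CacheBuster
{
    // Returns a copy of webLink with a unique refreshToken query parameter added,
    // keeping any query string the link already has.
    public static Uri WithRefreshToken(Uri webLink)
    {
        var builder = new UriBuilder(webLink);
        string token = "refreshToken=" + Guid.NewGuid().ToString("N");

        // The Query getter includes the leading '?', so strip it before re-assigning.
        string existing = builder.Query.TrimStart('?');
        builder.Query = existing.Length == 0 ? token : existing + "&" + token;

        return builder.Uri;
    }
}
```

With this in place, the navigate call would become wb.Navigate(CacheBuster.WithRefreshToken(webLink));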
