I've been trying to write a program that fetches Wikipedia pages and lists all the links found in the page source. I'm using urllib.urlopen() to do this, and unfortunately I've run into a little problem. Instead of getting the actual page, such as the main page or any article, I get something else, which can be summarized by:

"<p>Our servers are currently experiencing a technical problem. This is probably temporary and should be fixed soon. Please <a href="http://en.wikipedia.org/wiki/Leg_spin" onclick="RefreshPage(); return false">try again</a> in a few minutes.</p>
<p>You may be able to get further information in the <a href="irc://chat.freenode.net/wikipedia">#wikipedia</a> channel on the <a href="http://www.freenode.net">Freenode IRC network</a>.</p>"

Unfortunately, this keeps happening regardless of the time or article. Is there any way I can fix this?
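
For reference, the link-listing part on its own can be sketched roughly like this. It is only a minimal sketch using the standard-library html.parser module, written for Python 3; the names LinkCollector and list_links are made up for illustration and are not from my actual program.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag it sees (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; tag names arrive lowercased.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def list_links(html_text):
    """Return all link targets found in an HTML string, in document order."""
    collector = LinkCollector()
    collector.feed(html_text)
    collector.close()
    return collector.links
```

Fed the error page quoted above, list_links() would return only the "try again", IRC, and Freenode links, which is how I noticed I wasn't getting the real article.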

Edited by Tommy_101: grammar mistakes


Alright guys, thanks to anyone who even looked at my post to see if they could help. I switched to urllib2 (urllib.request in Python 3.0, which is essentially the same thing), and it raised an error with the message "HTTP Error 403: Forbidden", or something along those lines. I did some research and realized that some websites don't want you to access their content without a browser. That leads to my next problem: having to simulate a browser for Wikipedia.
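
From what I've read, "simulating a browser" mostly comes down to sending a User-Agent header other than the default one that urllib sends (something like "Python-urllib/3.x"). Here is a minimal sketch of that idea for Python 3. The USER_AGENT string and the function names are placeholders I made up, and whether a particular site accepts a given User-Agent is an assumption on my part, not something this code guarantees.

```python
import urllib.request

# Hypothetical identifier for this script; replace it with something that
# describes your own program. Not an official or required value.
USER_AGENT = "LinkListerExample/0.1 (learning project)"


def build_request(url, user_agent=USER_AGENT):
    """Build a Request that carries an explicit User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch_page(url):
    """Fetch a URL and return its body as text (performs a network call)."""
    request = build_request(url)
    with urllib.request.urlopen(request, timeout=10) as response:
        # Fall back to UTF-8 if the server does not declare a charset.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

Calling fetch_page("http://en.wikipedia.org/wiki/Leg_spin") should then return the article source instead of raising the 403 error, assuming the server is happy with the header.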

Thanks guys
