Sep 17th, 2009

Python help, accessing wikipedia pages

Hi,

I've been trying to write a program that fetches Wikipedia pages and lists all the links found in the page source. I'm using urllib.urlopen() to do this, but I've run into a problem. Instead of getting the actual page, such as the main page or an article, I get an error page, which can be summarized by:

"<p>Our servers are currently experiencing a technical problem. This is probably temporary and should be fixed soon. Please <a href="http://en.wikipedia.org/wiki/Leg_spin" onclick="RefreshPage(); return false">try again</a> in a few minutes.</p>
<p>You may be able to get further information in the <a href="irc://chat.freenode.net/wikipedia">#wikipedia</a> channel on the <a href="http://www.freenode.net">Freenode IRC network</a>.</p>"

Unfortunately, this keeps happening regardless of the time of day or the article. Is there any way I can fix this?
Last edited by Tommy_101; Sep 17th, 2009 at 8:08 pm. Reason: grammar mistakes
Posted by Tommy_101
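
The link-listing half of the program described above can be sketched with the standard library alone. This is a minimal sketch in Python 3, not the poster's actual code: the names `LinkCollector` and `list_links` are made up for illustration, and the sample string is a shortened version of the error page quoted in the post.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag it is fed."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value is None for bare attributes
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value is not None:
                self.links.append(value)


def list_links(page_source):
    """Return all link targets found in an HTML string, in document order."""
    collector = LinkCollector()
    collector.feed(page_source)
    return collector.links


# Shortened version of the error page from the post, used as sample input
sample = (
    '<p>Please <a href="http://en.wikipedia.org/wiki/Leg_spin">try again</a> '
    'or ask on the <a href="http://www.freenode.net">Freenode IRC network</a>.</p>'
)
print(list_links(sample))
```

Because the parser works on any HTML string, it can be tried on saved page source before the download problem is sorted out.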
Sep 19th, 2009

Re: Python help, accessing wikipedia pages

Alright guys, thanks to anyone who even looked at my post to see if they could help. I used urllib2 (well, urllib.request in Python 3.0, same thing) and it raised a runtime error with the message "HTTP Error 403: Forbidden", or something along those lines. I did some research and realized that some websites refuse requests that don't appear to come from a browser. That leads to my next problem: having to simulate a browser for Wikipedia.

Thanks guys
Posted by Tommy_101
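
For the "simulate a browser" problem, one common approach is to send an explicit User-Agent header with the request instead of the default one Python sends. The sketch below assumes Python 3 and `urllib.request`, and it assumes the 403 comes from the default User-Agent being rejected, which the thread does not confirm. The `USER_AGENT` value and the helper names `build_request` and `fetch_page` are placeholders, not anything from the original post; a descriptive string with contact details is generally a safer choice than pretending to be a real browser.

```python
import urllib.request

# Placeholder value (an assumption): replace with a string describing your own script
USER_AGENT = "LinkListerExample/0.1 (contact: you@example.com)"


def build_request(url, user_agent=USER_AGENT):
    """Create a Request that carries an explicit User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch_page(url, user_agent=USER_AGENT, timeout=10):
    """Download a page and return its source as text."""
    request = build_request(url, user_agent)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # Fall back to UTF-8 if the server does not declare a charset
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset)
```

Calling `fetch_page("http://en.wikipedia.org/wiki/Leg_spin")` should then return the article source rather than raising the 403, provided the header really was the cause. If the server still refuses, `urllib.error.HTTPError` is raised and its `code` attribute holds the status.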

This thread is solved
