Basic example with a for loop (Python 2):

# URL library
from urllib2 import urlopen

ur = urlopen("http://www.daniweb.com/forums/thread161312.html")  # open url
contents = ur.readlines()  # read lines from url
fo = open("test.txt", "w")  # open test.txt
for line in contents:
    print "writing %s to a file" % (line,)
    fo.write(line)  # write each line from the url to the text file
fo.close()  # close text file
Thanks for the help. That solved my problem.
How to remove all the html tags?
urlopen() does not seem to work for me, as in I cannot import it. I am using Python 3.4.3 though.
Here are the different ways,
and also what I would call the preferred way these days, with Requests.

# Python 2
from urllib2 import urlopen

page_source = urlopen("http://python.org").read()
print page_source

# Python 3
from urllib.request import urlopen

page_source = urlopen('http://python.org').read().decode('utf_8')
print(page_source)

In Python 3, read() returns bytes, so to get
str output and not
bytes we need to decode the result as UTF-8.
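
For anyone on Python 3 who wants the download-to-file example from the first post, here is a minimal sketch of how it might look. The helper names fetch_text and save_text are made up for illustration, UTF-8 is assumed as the page encoding, and a local file:// URL stands in for a real page so the snippet runs without a network connection.

```python
import tempfile
from pathlib import Path
from urllib.request import urlopen


def fetch_text(url, encoding="utf-8"):
    """Return the body at `url` as str (hypothetical helper name)."""
    with urlopen(url) as response:
        raw = response.read()      # bytes in Python 3
    return raw.decode(encoding)    # bytes -> str; UTF-8 is an assumption


def save_text(url, path, encoding="utf-8"):
    """Fetch `url` and write its text to `path` (hypothetical helper name)."""
    text = fetch_text(url, encoding)
    with open(path, "w", encoding=encoding) as fo:
        fo.write(text)
    return text


# Offline demo: a temporary local file served through a file:// URL
# plays the role of the web page.
workdir = Path(tempfile.mkdtemp())
source = workdir / "page.html"
source.write_bytes("<title>Caf\u00e9</title>\n".encode("utf-8"))

saved = save_text(source.as_uri(), workdir / "test.txt")
print(type(saved).__name__, repr(saved))
```

Swapping source.as_uri() for an http:// address should work the same way, as long as the page really is UTF-8 encoded. The with blocks close the response and the file even if something goes wrong in between.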
Here with Requests, which works for both Python 2 and 3:

import requests

page_source = requests.get('http://python.org')
print(page_source.text)

import requests
from bs4 import BeautifulSoup

page_source = requests.get('http://python.org')
# Name the parser explicitly so BeautifulSoup does not have to guess one
soup = BeautifulSoup(page_source.text, 'html.parser')
print(soup.find('title').text) #--> Welcome to Python.org
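
On the earlier question about removing all the HTML tags: with BeautifulSoup, soup.get_text() returns the text without the markup. If you would rather not install anything, below is a rough sketch using only the standard library's html.parser instead. The class name TagStripper, the function strip_tags, and the sample HTML are all made up for illustration, and skipping script and style content is my own choice, not something from the posts above.

```python
from html.parser import HTMLParser


class TagStripper(HTMLParser):
    """Hypothetical helper: keeps text nodes and drops the tags."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0  # nesting depth inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        # Only keep text that is not inside a script or style element
        if not self._skip:
            self.parts.append(data)


def strip_tags(html):
    """Return the visible text of `html` with whitespace collapsed."""
    parser = TagStripper()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.parts).split())


sample = (
    "<html><body><h1>Title</h1><p>Hello <b>world</b></p>"
    "<script>var x = 1;</script></body></html>"
)
print(strip_tags(sample))
```

Text pieces are joined with a space, so words from neighbouring elements do not run together. The downside is that punctuation following an inline tag can pick up an extra space in front of it.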