I have written a script that scrapes a web site, and the scraping itself works fine. What fails is writing the results to a text file with the file object's write method.
I am trying to run this:
import BeautifulSoup, urllib2, re, time
import codecs
path='C:/Users/Me/Documents/Python'
outfile=open(r'C:/Users/Steinar/Documents/Python/Vegvesen/vegresultat.txt', 'a')
start_url = "http://www.vegvesen.no/Om+Statens+vegvesen/Aktuelt/Offentlig+journal?dokumenttyper=&dato=10.02.2003&journalenhet=&utforSok=S%C3%B8k&submitButton=S%C3%B8k"
datos = (
    '01.11.2008',
    '02.11.2008',
)
for dato in datos:
    search_url = "http://www.vegvesen.no/Om+Statens+vegvesen/Aktuelt/Offentlig+journal?dokumenttyper=&dato=%s&journalenhet=&utforSok=S%%C3%%B8k&submitButton=S%%C3%%B8k" % dato
    page = urllib2.urlopen(search_url)
    html = page.read()
    soup = BeautifulSoup.BeautifulSoup(html)
    divs = soup.findAll("div", {"class": "treff"})
    for div in divs:
        outfile.write (dato + '|' + div.p.contents[0])
    pass
outfile.close()
I get this error message:
outfile.write (dato + '|' + div.p.contents[0])
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2013' in position 25: ordinal not in range(128)
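A minimal sketch of the likely cause, under stated assumptions: the scraped text is a unicode string containing u'\u2013' (an en dash), and a file opened with plain open() in Python 2 implicitly encodes unicode as ASCII when it is written. The sketch does not touch the real site. The sample line and the temporary output path are hypothetical stand-ins, and io.open with an explicit encoding is used so the snippet behaves the same on Python 2 and Python 3.

```python
import io
import os
import tempfile

# Hypothetical stand-in for dato + '|' + div.p.contents[0];
# u'\u2013' is the character named in the error message.
line = u'01.11.2008|Some scraped title \u2013 with an en dash'

# Hypothetical output location, so the sketch does not depend on a real folder.
out_path = os.path.join(tempfile.mkdtemp(), 'vegresultat.txt')

# An ASCII-encoded file reproduces the failure: the en dash has no ASCII form.
try:
    with io.open(out_path, 'a', encoding='ascii') as outfile:
        outfile.write(line)
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

# Opening the file with an explicit UTF-8 encoding lets the same write succeed.
with io.open(out_path, 'a', encoding='utf-8') as outfile:
    outfile.write(line + u'\n')

# Read the file back to confirm the en dash survived the round trip.
with io.open(out_path, encoding='utf-8') as infile:
    written = infile.read()

print(ascii_failed)
```

In the script above, the equivalent change would probably be to open outfile with codecs.open(..., 'a', encoding='utf-8'), since codecs is already imported, or to call .encode('utf-8') on the string before writing it. Either way the file then holds UTF-8 bytes, so whatever reads it later has to decode it as UTF-8.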