I noticed something odd when I use this function. If the page I open is in Latin-1 encoding, the bytes the function returns have junk characters inserted in various places.

However, if I use urlretrieve to fetch the same page to disk, there is no junk in the resulting file.

Ideas?

EDIT: I decoded the bytes returned by urllib.request.urlopen as latin-1 and saved the result to a file for comparison; this is how I know the junk data is there.
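For anyone trying to reproduce this, here is a minimal sketch of a byte-level comparison between the two fetch methods. It compares raw bytes before any decoding or text-mode file writing happens, so those steps cannot add differences of their own. The helper names `first_difference` and `compare_fetch_methods` and the URL are placeholders I made up, not anything from the original code.

```python
import urllib.request


def first_difference(a: bytes, b: bytes) -> int:
    """Return the index of the first differing byte, or -1 if a == b."""
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    # One input is a prefix of the other: they differ where the shorter ends.
    if len(a) != len(b):
        return min(len(a), len(b))
    return -1


def compare_fetch_methods(url: str) -> int:
    """Fetch url with urlopen and with urlretrieve, then compare raw bytes."""
    # urlopen: read the response body into memory as bytes.
    with urllib.request.urlopen(url) as response:
        in_memory = response.read()

    # urlretrieve: the body is written to a temporary file on disk.
    path, _headers = urllib.request.urlretrieve(url)
    with open(path, "rb") as f:  # binary mode, so nothing gets translated
        on_disk = f.read()
    urllib.request.urlcleanup()  # remove temp files left by urlretrieve

    return first_difference(in_memory, on_disk)


if __name__ == "__main__":
    # Placeholder URL (assumption): substitute the latin-1 page in question.
    index = compare_fetch_methods("http://example.com/")
    if index == -1:
        print("Both methods returned identical bytes.")
    else:
        print("First differing byte at offset", index)
```

If both methods return identical bytes, the junk is probably being introduced later, for example when the decoded string is written back out in text mode, which re-encodes it with the platform default encoding unless `encoding="latin-1"` is passed to `open`.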
