I noticed something odd when I use this function. If the page I open is encoded in latin-1, the bytes it returns have junk characters inserted in various places.

However, if I use urlretrieve to fetch the same page to disk, there is no junk in the resulting file.

Ideas?

EDIT: I decoded the bytes returned by urllib.request.urlopen as latin-1 and saved the result to a file for comparison; this is how I know the junk data is there.
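
For anyone who wants to reproduce this, here is a minimal sketch of one way to do the comparison on the raw bytes, before any decoding, so the encoding step cannot hide or introduce differences. The helper names `first_difference` and `compare_fetches` are made up for illustration, and the URL is whatever page shows the problem.

```python
import urllib.request


def first_difference(a: bytes, b: bytes) -> int:
    """Return the index of the first differing byte, or -1 if a == b."""
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    # Either the inputs are identical, or one is a prefix of the other.
    return -1 if len(a) == len(b) else min(len(a), len(b))


def compare_fetches(url: str) -> int:
    """Fetch url with urlopen and with urlretrieve, then compare raw bytes.

    Hypothetical helper: returns -1 when both fetches yield the same bytes,
    otherwise the offset of the first mismatch.
    """
    with urllib.request.urlopen(url) as response:
        opened = response.read()

    # urlretrieve returns (local_path, headers); read the file back as bytes.
    path, _headers = urllib.request.urlretrieve(url)
    with open(path, "rb") as f:
        retrieved = f.read()

    return first_difference(opened, retrieved)
```

If `compare_fetches` returns -1, the two fetches are byte-for-byte identical and the junk is probably coming from a later step (decoding or writing the file) rather than from urlopen itself. Note that the two calls are separate requests, so a server that varies its response could also produce a mismatch.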