I noticed something odd when I use this function. If the page I open is in latin-1 encoding, the bytes returned by this function have junk characters inserted in various places.

However, if I use urlretrieve to fetch the page to disk, there is no junk in the resulting file.

Ideas?

EDIT: I decoded the bytes returned by urllib.request.urlopen with the latin-1 encoding and saved the result to a file for comparison; this is how I know the junk data is there.
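
For anyone who wants to reproduce this, here is a minimal sketch of the comparison, not the exact code I ran. The helper names fetch_both and first_difference are made up for illustration, and the URL and output path are whatever you pass in. It compares the raw bytes from urlopen against the file written by urlretrieve, so no decode or re-encode step can affect the result.

```python
import urllib.request


def fetch_both(url, path):
    """Fetch url twice: once into memory with urlopen, once to disk with urlretrieve.

    Hypothetical helper for illustration; returns (in_memory_bytes, on_disk_bytes).
    """
    with urllib.request.urlopen(url) as response:
        in_memory = response.read()

    urllib.request.urlretrieve(url, path)
    # Read the saved file in binary mode so no newline or encoding
    # translation is applied.
    with open(path, "rb") as f:
        on_disk = f.read()

    return in_memory, on_disk


def first_difference(a, b):
    """Return the index of the first differing byte, or None if a == b."""
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    if len(a) != len(b):
        # One sequence is a prefix of the other.
        return min(len(a), len(b))
    return None


# Example usage (placeholder URL and path, substitute your own):
# in_memory, on_disk = fetch_both("http://example.com/page.html", "page.html")
# print(first_difference(in_memory, on_disk))
```

If first_difference returns None, the two fetches are byte-for-byte identical, and the junk would have to come from a later step such as how the decoded text is written out.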
