My exact problem is that when I try to open a URL stored in a text file, I get the error
"URLError: <urlopen error no host given>". This is strange because if I type the URLs in myself, they work fine (opener.open("site.com")). The lines of code causing the error look something like this:

for site in test.txt:
    opener = urllib2.build_opener()
    opener.addheaders = [('User-agent', 'Mozilla/5.0')]
    home = opener.open(site)

test.txt is a file, so you have to open it for reading: open('test.txt').

snippsat, yes, I did that. But like I said, the error came from those lines, and as you can see, the error had nothing to do with not being able to pull information from a variable or file.

This code works for me.

test.txt
*http://www.google.com/videohp
*http://www.python.org/

---

import urllib2

for site in open('test.txt'):
    opener = urllib2.build_opener()
    opener.addheaders = [('User-agent', 'Mozilla/5.0')]
    home = opener.open(site)
    print home.geturl()

Check the URLs in test.txt. If I delete one / from a URL,
I get URLError: <urlopen error no host given>
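To see why a single missing / matters, it can help to look at how a URL parser splits the two spellings. This is only a rough sketch: it uses the standard library's urlparse as an illustration, and urllib2's internal parsing may differ in details. The helper name host_of is made up for this example.

```python
try:
    from urllib.parse import urlparse  # Python 3
except ImportError:
    from urlparse import urlparse      # Python 2, as used in this thread


def host_of(url):
    """Return the host part of url, or '' if the parser finds none (hypothetical helper)."""
    return urlparse(url).netloc


good = host_of('http://www.python.org/')  # two slashes after http:
bad = host_of('http:/www.python.org/')    # one slash deleted

print(repr(good))
print(repr(bad))
```

With only one slash, everything after http: is treated as a path, so no host is left to connect to.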

That / after videohp makes it an invalid URL. I tested your code several different ways. http://www.google.com/videohp/ doesn't open in my browser. http://www.google.com/videohp opens with both urllib2 and a browser.

for site in lst_:
    try:
        home = opener.open(site)
    except urllib2.HTTPError as exHttp:
        print "Http error: " + site
    except urllib2.URLError as exUrl:
        print "Invalid url: " + site
    else:
        print home.geturl()

There is no problem in my code, chriswelborn.
The point was to get @inuasha to check his test.txt for a missing /.
When I tested, I changed http:// to http:/ to get the same error he has.

It's not about the concrete example, but the way you have to approach this problem, and the way to solve it.

inuasha please post your test.txt file to clarify any doubts.

Lucaci Andrew, I'd rather not... it's a very large file. Anyway, the URLs are valid. They all have the correct number of /'s, they are preceded by http://www., and they all end with a / as well. I have tried without the ending / and the problem still occurs.

Do some exception handling to figure out what's going on.

import urllib2

for site in open('test.txt'):
    opener = urllib2.build_opener()
    opener.addheaders = [('User-agent', 'Mozilla/5.0')]
    try:
        home = opener.open(site)
        print home.geturl()
    except Exception as error:
        print error

So here it will print every URL that is OK; if one is bad, it will print <urlopen error no host given>
instead of that address.
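Since the error message alone does not say which line caused it, another option is to check each line before opening anything. The following is a minimal sketch, not a definitive fix: find_bad_lines is a hypothetical helper, urlparse is only a rough stand-in for urllib2's own checks, and the sample list is illustrative (the 'www.python.org/' entry without http:// is an assumption added to show the missing-scheme case).

```python
try:
    from urllib.parse import urlparse  # Python 3
except ImportError:
    from urlparse import urlparse      # Python 2, as used in this thread


def find_bad_lines(lines):
    """Return (line_number, stripped_text, reason) for each line that is not a full URL."""
    bad = []
    for number, line in enumerate(lines, 1):
        url = line.strip()  # drop the trailing newline and stray spaces
        parts = urlparse(url)
        if not url:
            bad.append((number, url, 'blank line'))
        elif not parts.scheme:
            bad.append((number, url, 'no scheme such as http://'))
        elif not parts.netloc:
            bad.append((number, url, 'no host'))
    return bad


# Illustrative data; the third entry is a made-up example of a missing scheme.
sample = [
    'http://www.google.com/videohp\n',
    'http:/www.python.org/\n',
    'www.python.org/\n',
    '\n',
]
problems = find_bad_lines(sample)
for number, url, reason in problems:
    print('line %d: %r -> %s' % (number, url, reason))
```

Passing open('test.txt') in place of sample would report the line number and exact contents of every line that lacks a scheme or host, which is easier to act on with a very large file.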

snippsat, like I said, it gives <urlopen error no host given> for every URL, whether it has a slash afterwards or not. I have tried both ways.
I told you what the error was, and I checked to make sure that is the error. I don't need help with my syntax; I need help with the logic.

OK, the only thing I can think of is that there is a problem with your test.txt.
As you say, there is no problem if you type in a single URL yourself.

You can test with repr() to see everything that test.txt is outputting.
Here the google and sol URLs work, and I get <urlopen error no host given> on the Python address because of the missing /. This is what I get when I run this test.txt through my code above.

for site in open('test.txt'):
    print repr(site)

"""Output--> repr
'http://www.google.com/videohp\n'
'http:/www.python.org/\n'
'http://www.sol.no/\n'
'\n'
"""

"""Output--> from code over
http://www.google.com/videohp
<urlopen error no host given>
http://www.sol.no/
unknown url type:
"""