URLError when reading from txt file

Question

inuasha 0 Newbie Poster

11 Years Ago

Alright, so my exact problem is that when I attempt to visit a URL stored in a text file, I get the error
"URLError: <urlopen error no host given>". This is strange, because if I type in the URLs myself they work fine (opener.open("site.com")). The lines of code causing the error look something like this:

for site in test.txt:
    opener = urllib2.build_opener()
    opener.addheaders = [('User-agent', 'Mozilla/5.0')]
    home = opener.open(site)


snippsat 661 Master Poster

11 Years Ago

test.txt is a file, so you have to open it for reading: open('test.txt')

snippsat 661 Master Poster

11 Years Ago

This code works for me.

test.txt
*http://www.google.com/videohp
*http://www.python.org/

---

import urllib2

for site in open('test.txt'):
    opener = urllib2.build_opener()
    opener.addheaders = [('User-agent', 'Mozilla/5.0')]
    home = opener.open(site)
    print home.geturl()

Check the URLs in test.txt. If I delete one /,
I get URLError: <urlopen error no host given>
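
One way to see why a missing slash leads to this particular error is to look at how the URL is parsed. The following is a minimal sketch, assuming only the standard library's `urlparse`; the import fallback is there so it runs on both Python 2 and Python 3, and `has_host` is a hypothetical helper name, not part of urllib2. With one slash deleted, the URL parses with an empty host part.

```python
try:
    from urllib.parse import urlparse  # Python 3
except ImportError:
    from urlparse import urlparse      # Python 2, as used in this thread


def has_host(url):
    """Return True if the URL parses with a non-empty host part."""
    return bool(urlparse(url.strip()).netloc)


good = "http://www.python.org/"
bad = "http:/www.python.org/"   # one slash deleted

# Show the host part that the parser extracts from each URL.
print(repr(urlparse(good).netloc))
print(repr(urlparse(bad).netloc))
```

A check like `has_host(site)` could be run on each line before calling `opener.open(site)`.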


chriswelborn 63 ...

11 Years Ago

The / after videohp makes the URL invalid. I tested your code several different ways: http://www.google.com/videohp/ doesn't open in my browser, while http://www.google.com/videohp opens with both urllib2 and a browser.

# lst_ is the list of URLs to test; opener is built with urllib2.build_opener()
for site in lst_:
    try:
        home = opener.open(site)
    except urllib2.HTTPError as exHttp:
        print "Http error: " + site
    except urllib2.URLError as exUrl:
        print "Invalid url: " + site
    else:
        print home.geturl()


inuasha 0 Newbie Poster · Answer 1 · 2013-02-03T07:32:53+00:00

snippsat, yes, I did that, but like I said, the error came from those lines, and as you can see the error had nothing to do with not being able to pull information from a variable or file.

snippsat 661 Master Poster · Answer 2 · 2013-02-03T16:21:27+00:00

There is no problem in my code, chriswelborn.
The point was to get @inuasha to check his test.txt for a missing /.
When I tested, I changed http:// to http:/ to get the same error as he has.

Lucaci Andrew 140 Za s|n · Answer 3 · 2013-02-03T17:00:48+00:00

It's not about the concrete example, but the way you have to approach this problem, and the way to solve it.

inuasha please post your test.txt file to clarify any doubts.

inuasha 0 Newbie Poster · Answer 4 · 2013-02-04T04:15:55+00:00

Lucaci Andrew, I wouldn't... it's a very large file. Anyway, the URLs are valid. They all have the correct number of /'s, they are preceded by http://www. and they all end with a / as well. I have tried without the ending / and the problem still occurs.

snippsat 661 Master Poster · Answer 5 · 2013-02-04T10:09:45+00:00

Do some exception handling to figure out what's going on.

import urllib2

for site in open('test.txt'):
    opener = urllib2.build_opener()
    opener.addheaders = [('User-agent', 'Mozilla/5.0')]
    try:
        home = opener.open(site)
        print home.geturl()
    except Exception as error:
        print error

So here it will print all the OK URLs; if one is bad, it will print <urlopen error no host given>
instead of that address.
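
For a very large file, it may help to find the suspect lines without opening any connections. The sketch below is one possible approach, not the code used in this thread: `find_suspect_lines` is a hypothetical helper that parses each line and reports its line number and repr() when the scheme or host looks wrong. The sample list stands in for the contents of test.txt.

```python
try:
    from urllib.parse import urlparse  # Python 3
except ImportError:
    from urlparse import urlparse      # Python 2


def find_suspect_lines(lines):
    """Return (line_number, repr_of_line, reason) for lines that look wrong."""
    suspects = []
    for number, line in enumerate(lines, 1):
        parts = urlparse(line.strip())
        if parts.scheme not in ("http", "https"):
            suspects.append((number, repr(line), "missing or unknown scheme"))
        elif not parts.netloc:
            suspects.append((number, repr(line), "no host"))
    return suspects


# Hypothetical sample data standing in for the lines of test.txt.
sample = [
    "http://www.google.com/videohp\n",
    "http:/www.python.org/\n",
    "www.sol.no/\n",
]
for entry in find_suspect_lines(sample):
    print(entry)
```

With a real file, the call would be `find_suspect_lines(open('test.txt'))`.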

inuasha 0 Newbie Poster · Answer 6 · 2013-02-05T02:49:08+00:00

snippsat, like I said... it will give that <urlopen error no host given> for every URL, whether it has a slash afterwards or not. I have tried both ways.
I told you what the error was, and I checked to make sure that that is the error. I don't need help with my syntax, I need help with the logic.

snippsat 661 Master Poster · Answer 7 · 2013-02-05T04:48:44+00:00

OK, the only thing I can think of is that there is a problem with your test.txt.
As you say, there is no problem if you type in a single URL yourself.

You can test with repr() to see everything that test.txt is outputting.
So here the google and sol URLs work, and I get <urlopen error no host given> on the Python address because of the missing /. This is what I get when I run the test through my code above.

for site in open('test.txt'):
    print repr(site)

"""Output--> repr
'http://www.google.com/videohp\n'
'http:/www.python.org/\n'
'http://www.sol.no/\n'
'\n'
"""

"""Output--> from the code above
http://www.google.com/videohp
<urlopen error no host given>
http://www.sol.no/
unknown url type:
"""
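
The repr() output above shows that every line keeps its trailing '\n', and that the final blank line is the one reported as "unknown url type:". A minimal sketch of stripping each line and skipping blank ones before opening it; `clean_urls` is a hypothetical helper name and the sample list mirrors the output above:

```python
def clean_urls(lines):
    """Yield each non-blank line with surrounding whitespace removed."""
    for line in lines:
        url = line.strip()
        if url:  # skip lines that are empty after stripping
            yield url


# Lines as they would come out of the file, newline characters included.
lines = ['http://www.google.com/videohp\n', 'http://www.sol.no/\n', '\n']
print(list(clean_urls(lines)))
```

In the loop from this thread, that would read `for site in clean_urls(open('test.txt')):`.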
