Image crawler

Question

Stefano Mtangoo 455 Senior Poster

15 Years Ago

Hi guys,
I have been away from Python for a long time now.
I need to brush up with a small project that downloads images from a given URL.
I give it a URL and it crawls through all the pages in that location and its subfolders and downloads the images. Now three challenges:
1. Crawl through all pages under the given URL (folder and subfolders)
2. Download found images (urllib2?)
3. The sites need authentication; how do I handle that?

Please help point me in the right direction. I missed you ;)

image python

snippsat 661 Master Poster

15 Years Ago

First, you can start by downloading a picture from the web.

2. Download found Images (urllib2?)

Yes, you can use urllib2.

I have been thinking of making an image crawler for a while now, maybe with a GUI front end (wxPython).
I just haven't gotten started yet.

Here is some code you can look at. It downloads a random picture I found on the net and saves it to disk.

from urllib2 import urlopen  # Python 2 module

def download_from_web(url, outfile):
    '''
    url: address of the source you want to download
    outfile: name of the local file, for example <something.jpg>
    '''
    try:
        webFile = urlopen(url)
        try:
            # Binary mode so the image bytes are written unchanged
            with open(outfile, 'wb') as localFile:
                localFile.write(webFile.read())
        finally:
            # Close the connection even if the write fails
            webFile.close()
    except IOError as e:
        print("Download error: %s" % e)

def main():
    # Just a random picture
    download_from_web('http://www.opticianonline.net/blogs/big-optometry-blog/Optyl.jpg', 'myfile.jpg')

if __name__ == "__main__":
    main()
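
Point 3 of the question (sites that need authentication) is not covered by the snippet above. Here is a minimal sketch, assuming the site uses HTTP Basic authentication and that you are on Python 3, where urllib2 became urllib.request. The names make_auth_opener and download_with_auth, and the example URL and credentials in the comments, are made up for illustration. Sites that log you in through an HTML form and cookies need a different approach.

```python
import urllib.request


def make_auth_opener(top_url, user, password):
    """Return an opener that answers HTTP Basic auth challenges under top_url."""
    mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    # realm=None means: use these credentials for any realm at this URL
    mgr.add_password(None, top_url, user, password)
    handler = urllib.request.HTTPBasicAuthHandler(mgr)
    return urllib.request.build_opener(handler)


def download_with_auth(opener, url, outfile):
    """Fetch url through the opener and write the raw bytes to outfile."""
    with opener.open(url) as response, open(outfile, "wb") as local_file:
        local_file.write(response.read())


# Hypothetical usage (not run here, it needs a real server):
# opener = make_auth_opener("http://example.com/gallery/", "user", "secret")
# download_with_auth(opener, "http://example.com/gallery/pic.jpg", "pic.jpg")
```

The opener only sends the credentials after the server answers with a 401 challenge, so the same opener can be reused for every page and image under top_url.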

Stefano Mtangoo 455 Senior Poster · Answer 1 · 2010-05-20T22:12:19+00:00

Thanks for the snippets.
Do you have any idea how to get the files under a path on the web server, detect the images in them, and download them? I'm thinking about it but haven't worked out the "how-to" yet.
Thanks

Tech B 48 Posting Whiz in Training · Answer 2 · 2010-05-21T00:13:37+00:00

You could open the page via urllib2, read the source, and look for image extensions.
Regex or plain old slicing could do it.
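
A rough sketch of the regex idea, written for Python 3. The function name find_image_urls and the list of extensions are just assumptions for the example; it only catches quoted values that end in one of those extensions, so it will miss unquoted attributes and URLs with query strings.

```python
import re

# Any quoted value that ends in a common image extension
IMAGE_RE = re.compile(r'''["']([^"']+\.(?:jpe?g|png|gif))["']''', re.IGNORECASE)


def find_image_urls(html):
    """Return the quoted strings in html that look like image file names."""
    return IMAGE_RE.findall(html)


sample = '<a href="page.html">next</a><img src="pics/cat.jpg"><img src="dog.PNG">'
print(find_image_urls(sample))
```

The page source itself would come from urlopen(url).read(), decoded to text first on Python 3.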

You could also parse for image tags. The HTML parsing library that comes with the standard library could help.
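
Something along these lines could work for the tag-parsing route, using html.parser from the Python 3 standard library (the Python 2 module is called HTMLParser). ImageCollector and crawl_images are names invented for this sketch, and fetch is an assumed helper that you supply: any function that takes a URL and returns the page HTML as a string. It follows links only while they stay under the starting URL, which roughly covers the "folder and subfolders" part of the question.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImageCollector(HTMLParser):
    """Collect absolute URLs of <img> sources and <a> links on one page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.images = []
        self.links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "img" and attrs.get("src"):
            # urljoin turns relative paths into full URLs
            self.images.append(urljoin(self.base_url, attrs["src"]))
        elif tag == "a" and attrs.get("href"):
            self.links.append(urljoin(self.base_url, attrs["href"]))


def crawl_images(start_url, fetch):
    """Visit pages under start_url and return the image URLs found."""
    to_visit, seen, images = [start_url], set(), []
    while to_visit:
        url = to_visit.pop()
        if url in seen:
            continue
        seen.add(url)
        parser = ImageCollector(url)
        parser.feed(fetch(url))
        for img in parser.images:
            if img not in images:
                images.append(img)
        # Stay inside the starting folder and its subfolders
        to_visit.extend(link for link in parser.links if link.startswith(start_url))
    return images
```

Each URL in the returned list can then be passed to a download function. Note that this sketch treats every link it follows as an HTML page, so links that point straight at files would need to be filtered out first.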
