I have the following Perl script, which works nicely. It takes a list of URLs from a file, looks up each one, and writes out the meta description for that page. The problem is that it sometimes hangs when it reaches a site that does not respond. Is there a way to have it detect a timeout, an error response, or a success response, and then move on to the next line in the file? Here is the current script, which works but hangs when it hits a bad site:

#!/usr/bin/perl
#print "Content-type: text/html\n\n";
use strict;
use warnings;
use LWP::Simple;
use HTML::HeadParser;

open(OUTFILE, '>', 'outfile.txt') or die "Cannot open outfile.txt: $!";
open(MYFILE, '<', 'url3.txt') or die "Cannot open url3.txt: $!";
foreach my $line (<MYFILE>) {
    chomp($line);
    my $URL = get($line);                 # hangs here when a site never responds
    my $Head = HTML::HeadParser->new;
    $Head->parse($URL);                   # parse the page's <head> section
    print OUTFILE $Head->header('X-Meta-Description') . ".";
}
close(MYFILE);
close(OUTFILE);
exit;


Set the timeout for LWP::Simple:

use LWP::Simple qw($ua get);
$ua->timeout(30); # sets the timeout to 30 seconds; the default is 180
my $URL = get($line) or die "Timed out!"; # you don't have to die, really it's ok
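
For context, here is a sketch of how the timeout fits into your original loop; instead of dying, it just skips to the next URL whenever get() returns undef (the url3.txt and outfile.txt file names are kept from your script):

#!/usr/bin/perl
use strict;
use warnings;
use LWP::Simple qw($ua get);
use HTML::HeadParser;

$ua->timeout(30);   # give up on unresponsive sites after 30 seconds

open(OUTFILE, '>', 'outfile.txt') or die "Cannot open outfile.txt: $!";
open(MYFILE, '<', 'url3.txt') or die "Cannot open url3.txt: $!";
foreach my $line (<MYFILE>) {
    chomp($line);
    my $URL = get($line);
    next unless defined $URL;   # timed out or fetch failed: move on to the next URL
    my $Head = HTML::HeadParser->new;
    $Head->parse($URL);
    print OUTFILE $Head->header('X-Meta-Description') . ".";
}
close(MYFILE);
close(OUTFILE);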

mitchems, thank you for your reply.

Will this work such that the script proceeds to the next URL if the timeout limit is exceeded and it "dies"?

Yes. Take out the "die" part and just let it time out. Because the timeout is set on the shared $ua handle, get() returns undef instead of hanging, so the loop moves on to the next URL. To skip failed fetches cleanly, test the return value with something like "next unless defined $URL;".
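
If you also want to treat HTTP error responses (404, 500, and so on) as failures, one option is to use LWP::UserAgent directly instead of LWP::Simple's get(), since it hands you a response object you can check; a minimal sketch:

use strict;
use warnings;
use LWP::UserAgent;

my $ua = LWP::UserAgent->new(timeout => 30);   # same 30-second limit as before

# inside the foreach loop, replacing the LWP::Simple get() call:
my $response = $ua->get($line);
next unless $response->is_success;             # timed out or got an error status: skip this URL
my $URL = $response->decoded_content;          # the page body, as get() would have returned it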
