
I have the following Perl script which works nicely. It takes a list of URLs from a file, looks up each one, and writes out the meta keywords for that domain. The problem is that it sometimes hangs when it reaches a site that does not respond. Is there a way to make it continue to the next line in the file once a request times out, gets an error response, or, for that matter, a success response? Here is the current script, which works but hangs when it tries to grab meta keywords from a bad site:

#!/usr/bin/perl
use strict;
use warnings;
#print "Content-type: text/html\n\n";
use LWP::Simple;
use HTML::HeadParser;

open(OUTFILE, '>', 'outfile.txt') or die "Cannot open outfile.txt: $!";
open(MYFILE,  '<', 'url3.txt')    or die "Cannot open url3.txt: $!";

foreach my $line (<MYFILE>) {
    chomp($line);
    my $URL  = get($line);    # fetch the page body; this is where it hangs on a bad site
    my $Head = HTML::HeadParser->new;
    $Head->parse($URL);       # parse the <head> section
    # meta tags show up as X-Meta-* pseudo-headers ('X-Meta-Keywords' for the keywords tag)
    print OUTFILE $Head->header('X-Meta-Description') . ".";
}
close(MYFILE);
close(OUTFILE);
exit;


Set the timeout for LWP::Simple:

use LWP::Simple qw($ua get);
$ua->timeout(30);  # this sets the timeout to 30 secs; the default is 180
my $URL = get($line) or die "Timed out!";  # you don't have to die, really, it's OK

Note the parentheses around get's argument: without them, "get $line || die ..." parses as get($line || die ...), so the die never fires.

mitchems, thank you for your reply.

Will this work so that the script proceeds to the next URL if the timeout exceeds the limit and it "dies"?


Yes. Take out the "die" part and just let it time out; get() will return undef and you can skip that URL. It should move on to the next one because you're using the same user-agent handle. Something like the sketch below:
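
Here's a minimal sketch of the whole loop with the timeout applied and the die removed, so a bad site is simply skipped (same filenames as the original script; note that get() returns undef on any failure, not just a timeout):

#!/usr/bin/perl
use strict;
use warnings;
use LWP::Simple qw($ua get);
use HTML::HeadParser;

$ua->timeout(30);    # give up on unresponsive sites after 30 seconds

open(OUTFILE, '>', 'outfile.txt') or die "Cannot open outfile.txt: $!";
open(MYFILE,  '<', 'url3.txt')    or die "Cannot open url3.txt: $!";

while (my $line = <MYFILE>) {
    chomp($line);
    my $URL = get($line);
    next unless defined $URL;    # timed out or got an error response: skip this URL
    my $Head = HTML::HeadParser->new;
    $Head->parse($URL);
    print OUTFILE $Head->header('X-Meta-Description') . ".";
}
close(MYFILE);
close(OUTFILE);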
