
Help with Perl

I am new to Perl and to programming in general, and I am facing some real problems here. I need a Perl script that can open a text file, read a series of URLs, get the content of each page, and save it to another file.

Many thanks for any guidance.

tonyprotop
Newbie Poster
3 posts since May 2012
Reputation Points: 0
Solved Threads: 0
Skill Endorsements: 0

I face some real problems here. I need a Perl script that can open a text file, read a series of URLs, get the content of the pages and save it to another file.

Divide your task into steps, such as:

  1. open a text file
  2. read a series of URLs
  3. get the content of the pages for each URL
  4. save contents of pages to another file

Which of these steps gives you problems?

Could you attach an example of the text file you need to read? (Click on 'Files' in the Post menu here to upload a text file.)
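
For steps 1 and 2, here is a minimal sketch, assuming the file holds URLs separated by newlines or other whitespace. The sub names extract_urls and read_urls are made up for illustration, and the "looks like a URL" check is only a rough filter on the http:// or https:// prefix.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return every whitespace-separated token
# that starts with http:// or https://, in the order it appears.
sub extract_urls {
    my (@lines) = @_;
    my @urls;
    for my $line (@lines) {
        # split ' ' skips leading whitespace and drops the trailing newline
        push @urls, grep { m{^https?://}i } split ' ', $line;
    }
    return @urls;
}

# Hypothetical helper: read the URLs from a named file.
sub read_urls {
    my ($path) = @_;
    open my $in, '<:encoding(UTF-8)', $path
      or die "can't open $path: $!";
    my @urls = extract_urls(<$in>);
    close $in or die "can't close $path: $!";
    return @urls;
}

# Print the URLs found in each file named on the command line, if any.
for my $file (@ARGV) {
    print "$_\n" for read_urls($file);
}
```

Run with a file name as the argument, it prints one URL per line, which is an easy way to check steps 1 and 2 before moving on to fetching the pages.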

d5e5
Practically a Posting Shark
831 posts since Sep 2009
Reputation Points: 162
Solved Threads: 163
Skill Endorsements: 1

All of them, I guess. I have tried several times, but I cannot combine the pieces of code I write. Any help would be more than valuable.
I tried to upload a file here, but txt files are not allowed.

The content of the file is this:
http://www.bbc.co.uk/news/uk-13506898 http://news.bbc.co.uk/2/hi/7782422.stm

tonyprotop

Hi tonyprotop,

The script below can do what you want, but I have some questions:
Is your output file a text file or an HTML file, so that the contents of all the URLs are saved and can also be viewed as HTML? Or are you trying to parse the contents of all the URLs?

My script opens a file called urls.txt [you can give yours any name you like] and reads all the URLs in that text file. It then uses the function get() from the module LWP::Simple [in case you don't have it, you might have to install it from CPAN] to fetch each URL and return its contents, which are then written to another file called output.txt [you can change that to .html if you like].
Eventually, the contents of all the URLs will be written to this output file, and if you name it as HTML [i.e. output.html] you will be able to see all the pages in a single HTML file.

#!/usr/bin/perl
use warnings;
use strict;
use LWP::Simple;

my $input_file  = 'urls.txt';      # all my URLs are in this text file, one per line
my $output_file = 'output.txt';    # the fetched pages are written here

open my $fh, '<:encoding(UTF-8)', $input_file
  or die "can't open $input_file: $!";    # file to read from
open my $fh2, '>:encoding(UTF-8)', $output_file
  or die "can't open $output_file: $!";    # file to write to

while ( my $url = <$fh> ) {
    chomp $url;
    next unless length $url;    # skip blank lines

    # get() returns undef on failure; $! says nothing useful about it
    my $page = get($url);
    die "can't get the webpage $url\n" unless defined $page;
    print {$fh2} $page;
}
close $fh  or die "can't close $input_file: $!";
close $fh2 or die "can't close $output_file: $!";
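
The script above stops at the first URL that cannot be fetched. If you would rather skip bad URLs and carry on, something like the following might work. It is only a sketch: save_pages is a made-up name, the fetching function is passed in as a code reference so the loop can be tried without a network connection, and the two command-line file names are placeholders. The one thing it relies on from LWP::Simple is that get() returns undef on failure, as its documentation says.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: fetch each URL with the supplied code reference,
# write successful pages to $out, and return the URLs that failed.
sub save_pages {
    my ($fetch, $out, @urls) = @_;
    my @failed;
    for my $url (@urls) {
        my $page = $fetch->($url);
        if (defined $page) {
            print {$out} $page;
        }
        else {
            warn "can't get $url, skipping\n";
            push @failed, $url;
        }
    }
    return @failed;
}

# Usage: perl script.pl urls.txt output.txt   (file names are placeholders)
if (@ARGV == 2) {
    require LWP::Simple;    # loaded only when the script really fetches
    my ($in_name, $out_name) = @ARGV;

    open my $in, '<:encoding(UTF-8)', $in_name
      or die "can't open $in_name: $!";
    chomp(my @urls = grep { /\S/ } <$in>);    # keep non-blank lines only
    close $in or die "can't close $in_name: $!";

    open my $out, '>:encoding(UTF-8)', $out_name
      or die "can't open $out_name: $!";
    my @failed = save_pages(\&LWP::Simple::get, $out, @urls);
    close $out or die "can't close $out_name: $!";

    printf "%d of %d URLs failed\n", scalar @failed, scalar @urls;
}
```

Returning the failed URLs instead of dying lets you decide afterwards whether to retry them or just report them.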

Hope this helps

2teez
Junior Poster
161 posts since Apr 2012
Reputation Points: 40
Solved Threads: 32
Skill Endorsements: 0

It's perfect! Thank you! Thank you! You are the best!

tonyprotop
Question Answered as of 11 Months Ago by d5e5 and 2teez

© 2013 DaniWeb® LLC