Extract data from a saved file
Thread Solved
Hey,
I have used the LWP::Simple module and saved the source of a website in a file. I am trying to extract all the data between the <head> tags and pass it to a variable to process.
So far I can't seem to extract the data properly. Any suggestions?
use LWP::Simple;

my $data = getstore("http://www.google.com/", "website.txt");
unless (is_success($data)) {
    die "Could not retrieve website: $data";
}

open(PAGE, "website.txt") or die "$!";
my @info = <PAGE>;
close(PAGE);

my @meta;
my $i = 0;
my $stuff;
foreach $stuff (@info) {
    $meta[$i] = ($stuff =~ m/<head(.*?)<\/head>/);
    $i++;
}
$i = 0;
foreach $_ (@meta) {
    #print $meta[$i];
    print $_;
}
Thanks
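A note on the loop above: it matches one line at a time, so a <head> section that spans several lines can never match, and assigning the match to a scalar stores only a true/false result, not the captured text. A minimal sketch of reading the whole file into one string first is shown here. The temporary file and its contents are made up so the example runs without network access; they stand in for website.txt.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical stand-in for website.txt (assumption, not from the thread).
my ($fh, $filename) = tempfile(UNLINK => 1);
print {$fh} "<html>\n<head>\n<title>Example</title>\n</head>\n<body></body>\n</html>\n";
close($fh) or die "Could not close $filename: $!";

# Undefining the input record separator makes <$page> return the
# entire file as one string instead of one line per read.
open(my $page, '<', $filename) or die "Could not open $filename: $!";
my $s = do { local $/; <$page> };
close($page);

print length($s), " characters read\n";
```

With everything in $s, a single regular expression can see the opening and closing tags at once.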
Last edited by kenji; May 13th, 2009 at 4:07 pm.
And she said "Let there be light" and on the seventh day Windows booted.
And the crowds screamed in terror and cowered in fear for Microsoft had approached.
From the testament of 10011101
UPDATE:
I managed to get the data into one string and now I am trying to match with a regular expression.
I am having trouble with the regular expression.
my $d = ($s =~ m/<head>(.*)<\/head>/);
$s is the scalar holding the whole string; I want to extract the contents of the head tag from $s and assign them to $d.
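For reference, here is a minimal sketch of one way to do this, assuming $s already holds the whole page (the sample string below is made up). Written as `my $d = (...)`, the match runs in scalar context and $d only receives a true/false result; putting $d itself in parentheses asks for the captured text instead. The /is modifiers and the `\b[^>]*` part are additions for illustration, not taken from the post above.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the saved page contents.
my $s = "<html><HEAD>\n<title>Example</title>\n"
      . "<meta name=\"a\" content=\"1\">\n</head><body></body></html>";

# my ($d) = ... forces list context, so $d receives capture group 1.
# /i ignores case (<HEAD> vs <head>); /s lets . match newlines.
# .*? is non-greedy, so it stops at the first closing </head>.
my ($d) = $s =~ m/<head\b[^>]*>(.*?)<\/head>/is;

print defined($d) ? $d : "no <head> found", "\n";
```

If the pattern does not match, $d stays undefined, which is why the print checks it first.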
Thanks, that worked great.
What exactly does /is do? Also, one more question: if I try to extract the meta tags and place each one individually in an array, will it work, assuming that there may be one or more meta tags inside?
Something like this:
my (@m) = $s =~ m/<meta (.*?)>/is;
Last edited by kenji; May 13th, 2009 at 8:30 pm.
This is what I came up with, but it seems to be repeating the first match rather than checking for the next meta tag.
my (@m) = $d =~ m/(<meta (.*?)>){1,5}/is;
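A small sketch of why this looks like a repeated first match, assuming the meta tags sit on separate lines (the sample $d below is made up). Without /g, a match in list context returns one element per capture group, not one per tag. The pattern has two groups, the whole tag and its attributes, so @m holds the first tag twice in two forms. The {1,5} quantifier only repeats when tags are directly adjacent, and even then a repeated group keeps just its last iteration.

```perl
use strict;
use warnings;

# Hypothetical head contents with two meta tags separated by a newline.
my $d = join("\n",
    '<meta name="a" content="1">',
    '<meta name="b" content="2">',
);

my @m = $d =~ m/(<meta (.*?)>){1,5}/is;

print scalar(@m), " elements\n";   # 2 elements
print "[$_]\n" for @m;
# [<meta name="a" content="1">]   outer group: the whole first tag
# [name="a" content="1"]          inner group: its attributes
```

The second tag never appears, because the match stops after the first successful attempt.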
my (@m) = $s =~ m/<meta (.*?)>/gis;
You can look up the regexp modifiers in any regexp tutorial.
i - case-insensitive matching
s - treat the string as a single line, so . also matches newlines
g - global match; in list context it returns every match in the string instead of only the first
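To see the three modifiers working together, here is a short sketch using a made-up $s. The sample markup is an assumption; the pattern is the one given above.

```perl
use strict;
use warnings;

# Hypothetical page fragment: one tag in upper case, one split across lines.
my $s = "<head>\n"
      . "<META name=\"description\" content=\"demo\">\n"
      . "<meta name=\"keywords\"\n content=\"perl,regex\">\n"
      . "</head>";

# /g returns every capture in list context, /i matches <META as well as <meta,
# /s lets .*? run across the line break inside the second tag.
my @m = $s =~ m/<meta (.*?)>/gis;

print scalar(@m), " meta tags found\n";   # 2 meta tags found
print "$_\n" for @m;
```

Each element of @m holds the attribute text of one meta tag, in the order the tags appear in the string.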
Last edited by KevinADC; May 14th, 2009 at 6:02 am.