I hear Perl is the way to go for string parsing, so here is the test!

I have a file like this:

...
<li><a href="DSC_9866.JPG"> DSC_9866.JPG</a></li>
<li><a href="DSC_9867.JPG"> DSC_9867.JPG</a></li>
...

and I want to get a list of the file names. That is, the result I want is a list of strings

DSC_9866.JPG, DSC_9867.JPG, etc

How would I go about this?

Thanks,

Dave

Shouldn't this be in the Perl forum?

Yep, I messed up. I have already flagged the post as "bad" and am just awaiting a mod to move it.

Sorry!

For really simple HTML you can use regular expressions. For more complex data it would be better to search CPAN for a good HTML parser module and learn how to use it (which I haven't got around to doing yet). Meanwhile, the following script should do what you want.

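#!/usr/bin/perl
#ParseList.pl
use strict;
use warnings;
# Open the input file, stopping with a message if that fails
open my $fh, '<', '/home/david/Programming/Perl/data/list.txt'
    or die "Cannot open list.txt: $!";
my @list;
while (<$fh>){
    # Only keep $1 when this line actually matched
    push @list, $1 if m/href="(\w+\.\w+)"/;
}
close $fh;
print "Here is the list:\n";
print join(", ", @list), "\n";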
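If you would rather try the pattern without touching a file first, here is a minimal sketch that runs on a string in memory. The helper name extract_hrefs and the $sample text are made up for illustration (the sample just copies the two lines from the question), and the pattern is loosened to [^"]+ on the assumption that some file names may contain characters other than \w, such as hyphens. It relies on m//g returning every capture when used in list context. It is still only suitable for very simple HTML like this.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the original script):
# returns every href value found in a chunk of HTML.
sub extract_hrefs {
    my ($html) = @_;
    # In list context, m//g returns the captured group from every match.
    return $html =~ m/href="([^"]+)"/g;
}

# Sample input copied from the question; an assumption about the real file.
my $sample = <<'END_HTML';
<li><a href="DSC_9866.JPG"> DSC_9866.JPG</a></li>
<li><a href="DSC_9867.JPG"> DSC_9867.JPG</a></li>
END_HTML

my @list = extract_hrefs($sample);
print join(", ", @list), "\n";   # prints: DSC_9866.JPG, DSC_9867.JPG
```

Because the match runs over the whole string, this also picks up several links that share one line, which the line-by-line loop would miss.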

Haha, Perl always looks so scary... thanks though, I'll give it a shot.

Dave
