
I have a text file as input.
It is a list of names that looks like this:

A12345
B153875
C34893
...
...

and I have a database file which looks like this:

A12345 detail information
nvonafwenfovosdncsjdnfoewhuwerhwieufhiudhfisdfnsd
sdofnowerugfeuhgfurhgiuwerhfjdshfiasdhifheruwufhi
irgfiweurgf

A246 detail information
isdofnowerugfeuhgfurhgiuwerhfjdshfiadhifheruwufhi
wgerjgneiguihuhdnvkjdnvkjbdegiauberiubgieubgridfb
ooogrngoawerngiauengugbuivrug

B153875 detail information
wgerjgneiguihuvkwwjddnvkegtiaugberijubgieubgridfb
eragnowergnoweungfiousdhiuhsdnjkfnsk

C34893 detail information
fnweuraiwerbgivjbdbvurgfuwherugtheurhguhweriguhdg
sdgnasoughiueghaiwuh

...
...
...

My goal now is to find all the names listed (A12345_XXX, B153875_XXX, C34893_XXX, ...etc) in the database and create an output file like this (containing the names and the contents):

A12345_XXX
nvonafwenfovosdncsjdnfoewhuwerhwieufhiudhfisdfnsd
sdofnowerugfeuhgfurhgiuwerhfjdshfiasdhifheruwufhi
irgfiweurgf

B153875_XXX
wgerjgneiguihuvkwwjddnvkegtiaugberijubgieubgridfb
eragnowergnoweungfiousdhiuhsdnjkfnsk

C34893_XXX
fnweuraiwerbgivjbdbvurgfuwherugtheurhguhweriguhdg
sdgnasoughiueghaiwuh
...
...

How should I approach this?
(Fortunately, both the namelist and the database are in alphabetical order.)

My code so far only covers the filehandle part, something like this:

($v1, $v2, $v3) = @ARGV;
# $v1 is the namelist file
# $v2 is the database filename
# $v3 is the desired output filename

open (FILEHANDLE, $v1) || die;
open (DATABASE, $v2) || die;
open (RESULTS, ">$v3") || die;

......
......
......

close (FILEHANDLE);
close (DATABASE);
close (RESULTS);
exit;

Request help!

Edited by jacquelinek: n/a

Last Post by d5e5

This may work for you. Because the files in your post were not wrapped in code tags, it's hard to tell whether the records are separated by spaces, carriage returns, or line feeds, and whether they are fixed or variable length.
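If the line endings are in doubt, one option is to normalize them before doing any matching. This is only a minimal sketch: `normalize_newlines` is a hypothetical helper name, and the sample string is made up for illustration, assuming the file might use Windows (CRLF) or lone CR line endings.

```perl
use strict;
use warnings;

# Hypothetical helper: convert CRLF and lone CR line endings to LF,
# so later matching only has to deal with "\n".
sub normalize_newlines {
    my ($text) = @_;
    $text =~ s/\r\n?/\n/g;
    return $text;
}

# Made-up sample with Windows line endings.
my $sample = "A12345 detail information\r\nabc\r\n\r\nA246 detail information\r\nxyz\r\n";
my $clean  = normalize_newlines($sample);
print $clean;
```

After this step, a blank line between records is always exactly "\n\n", whatever the original file used.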

I attached the input files used to test the following script, plus the resulting output file.

#!/usr/bin/perl
#RegExSlurp.pl for jacquelinek
use strict;
use warnings;

my ($v1, $v2, $v3) = @ARGV;
#$v1 is the namelist file
#$v2 is the database filename
#$v3 is the desired output filename
die "Usage: $0 namelist database output\n" unless defined $v3;

open(my $names_fh, '<', $v1) or die "Cannot open $v1: $!";
open(my $db_fh, '<', $v2) or die "Cannot open $v2: $!";
open(my $results_fh, '>', $v3) or die "Cannot open $v3: $!";
my @namelist = <$names_fh>; #Read entire namelist file into an array

#Read entire database into $db ("file-slurp mode"); local restores $/ afterwards
my $db = do { local $/; <$db_fh> };

foreach my $name (@namelist) {
    $name =~ s/\s+\z//; #Remove the trailing newline (and any carriage return)
    #\Q...\E quotes any regex metacharacters in the name.
    #The details are captured as lowercase letters and whitespace only,
    #so the match stops at the next record's uppercase name.
    if ($db =~ /^(\Q$name\E) detail information\s*([a-z\r\n\s]+)/m) {
        print $results_fh "${1}_XXX\n$2";
    }
    else {
        print $results_fh "$name not found in database\n";
    }
}

close($names_fh);
close($db_fh);
close($results_fh) or die "Cannot close $v3: $!";
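
An alternative to matching one big string is to read the database in Perl's paragraph mode (`$/ = ""`) and look names up in a hash. This is only a sketch under assumptions: records are separated by blank lines, the files use LF line endings, and every header line has the form "NAME detail information". The helper name `read_records` and the in-memory sample are hypothetical, made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: read a database in paragraph mode and return
# a hash reference mapping each record name to its detail text.
sub read_records {
    my ($fh) = @_;
    local $/ = "";    # paragraph mode: a record ends at one or more blank lines
    my %details;
    while (my $record = <$fh>) {
        # First line is the header, the rest is the detail text.
        my ($header, $body) = split /\n/, $record, 2;
        my ($name) = $header =~ /^(\S+) detail information\s*$/;
        next unless defined $name;    # skip anything that is not a record
        $body = '' unless defined $body;
        $body =~ s/\s+\z//;           # drop trailing newlines
        $details{$name} = $body;
    }
    return \%details;
}

# Demo on an in-memory filehandle so the sketch runs without external files.
my $sample = "A12345 detail information\nabc\ndef\n\nA246 detail information\nxyz\n";
open my $db, '<', \$sample or die "Cannot open in-memory sample: $!";
my $details = read_records($db);
close $db;

for my $name (qw(A12345 B153875)) {
    if (exists $details->{$name}) {
        print "${name}_XXX\n$details->{$name}\n\n";
    }
    else {
        print "$name not found in database\n";
    }
}
```

With a hash, each name is an exact lookup, so the detail text is not restricted to lowercase letters and a name such as A123 cannot accidentally match a longer name. The trade-off is that all records are held in memory, just as with slurping.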

Attachments

Database file:
A12345 detail information
nvonafwenfovosdncsjdnfoewhuwerhwieufhiudhfisdfnsd
sdofnowerugfeuhgfurhgiuwerhfjdshfiasdhifheruwufhi
irgfiweurgf

A246 detail information
isdofnowerugfeuhgfurhgiuwerhfjdshfiadhifheruwufhi
wgerjgneiguihuhdnvkjdnvkjbdegiauberiubgieubgridfb
ooogrngoawerngiauengugbuivrug

B153875 detail information
wgerjgneiguihuvkwwjddnvkegtiaugberijubgieubgridfb
eragnowergnoweungfiousdhiuhsdnjkfnsk

C34893 detail information
fnweuraiwerbgivjbdbvurgfuwherugtheurhguhweriguhdg
sdgnasoughiueghaiwuh

Namelist file:
A12345
B153875
C34893

Output file:
A12345_XXX
nvonafwenfovosdncsjdnfoewhuwerhwieufhiudhfisdfnsd
sdofnowerugfeuhgfurhgiuwerhfjdshfiasdhifheruwufhi
irgfiweurgf

B153875_XXX
wgerjgneiguihuvkwwjddnvkegtiaugberijubgieubgridfb
eragnowergnoweungfiousdhiuhsdnjkfnsk

C34893_XXX
fnweuraiwerbgivjbdbvurgfuwherugtheurhguhweriguhdg
sdgnasoughiueghaiwuh