Hi,

I have a string like this:

$string="Lake City 84132, USA. FAU - Carrell, D T AU - Carrell DT FAU - Emery, B R AU - Emery BR";

I want to extract AU and only its contents. The output should be like this:

AU - Carrell DT
AU - Emery BR

I tried like this:

if($string=~/\bAU\b(.*)/)
{

}

But the rest of the string is captured along with it, so I would have to write a separate regular expression to clean that up.

How can I get the desired output?

My output should be (AU - Carrell DT AU - Emery BR)

Any suggestions ???

Regards
Vandhita

Hi,
Try this..

my $string = 'Lake City 84132, USA. FAU - Carrell, D T AU - Carrell DT FAU - Emery, B R AU - Emery BR';
foreach my $sub_str(split(/, /, $string)){
       print $1,"\n" if($v =~ /( AU - (\w+ \w+)?)/);
}

katharnakh.

Hi,

Thanks for the reply!!

But for some cases the AU information is not picked up!!

Example in this case.

$string="Human Genetics Research Group, IDIBAPS, University of Barcelona, Barcelona, Spain. FAU - Torregrosa, Nuria AU - Torregrosa N FAU - Dominguez-Fandos, David AU - Dominguez-Fandos D FAU - Camejo, Maria Isabel AU - Camejo MI FAU - Shirley, Cynthia R AU - Shirley CR FAU - Meistrich, Marvin L AU - Meistrich ML FAU - Ballesca, Jose Luis AU - Ballesca JL";

The output i got was this:

AU - Torregrosa N
 AU -
 AU - Camejo MI
 AU - Shirley CR
 AU - Meistrich ML
 AU - Ballesca JL

The AU - Dominguez-Fandos D is not displayed.

How do I handle cases like this?

The correct output should be like this:

AU - Torregrosa N
 AU - Dominguez-Fandos D
 AU - Camejo MI
 AU - Shirley CR
 AU - Meistrich ML
 AU - Ballesca JL

How can i get that??

Regards
Vandhita

Hi,
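Sorry, there was an error in the first code I posted: $v in the if statement should be $sub_str.
You need to understand the pattern in the string; a little experimenting would have got you what you want. The first regex uses \w+ for the surname, and \w does not match the hyphen in Dominguez-Fandos. Here is the code: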

foreach my $sub_str (split(/, /, $string)) {
    # \S+ matches any run of non-whitespace, so hyphenated surnames are picked up too
    print $1, "\n" if ($sub_str =~ /( AU - (\S+ \w+)?)/);
}

You might want to run perldoc perlrequick on your command line and look for the abbreviations for common character classes (\w, \s, \S and so on).
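
As an alternative to splitting on ", ", here is a minimal sketch that pulls every AU entry out of the whole string with one global match. It assumes each AU entry is a single surname without spaces followed by initials in uppercase ASCII letters, which holds for the sample data in this thread but may not hold for every record. The variable name @authors is just for illustration.

```perl
use strict;
use warnings;

# Shortened sample input taken from the question in this thread.
my $string = "Human Genetics Research Group, IDIBAPS, University of Barcelona, Barcelona, Spain. FAU - Torregrosa, Nuria AU - Torregrosa N FAU - Dominguez-Fandos, David AU - Dominguez-Fandos D FAU - Camejo, Maria Isabel AU - Camejo MI";

# \bAU skips "FAU": there is no word boundary between the F and the A.
# \S+ takes the surname (hyphens included); [A-Z]+ takes the initials.
# ASSUMPTION: surnames contain no spaces and initials are uppercase ASCII.
# In list context, /g returns every captured group.
my @authors = $string =~ /\b(AU - \S+ [A-Z]+)\b/g;

print "$_\n" for @authors;
```

With the string above this prints the three AU entries, one per line, without the leading space that the split-based version captures.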

katharnakh.
