Hello everyone. I've run into a problem with regular expressions: the extraction "variables" ($1, $2, $3 etc.) are read-only and scoped to the current block. If you need to do two regex extraction operations in the same block, is there a way to reset $1, $2, $3 etc. so they can be reused? Any help appreciated.

Steven.

What do you mean by block? An ordinary subroutine body, or the code that runs inside a regex replacement expression?

If you do one regex and then another within a subroutine, you don't need to reset $1, $2, $3 etc.; the next successful match will overwrite the old values. If you want to preserve them, make a named copy of each one: my($match1) = $1; etc...
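
A rough sketch of that idea, in case it helps. The sub names extract_pair() and stale_capture() and the sample strings are made up for illustration, they are not from the original code. It shows two extractions in the same block using named copies, plus one trap worth knowing about: a match that fails does not clear $1, so the value from the last successful match is still there.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# extract_pair() is a made-up helper, not from the thread.
# A match in list context returns its captures directly,
# so nothing here depends on $1/$2 staying intact.
sub extract_pair {
    my ($text) = @_;
    my ($key, $value) = $text =~ m|^(\w+)=(\w+)$|;
    return ($key, $value);   # both undef if the pattern did not match
}

# stale_capture() shows the trap: a failed match leaves $1 untouched.
sub stale_capture {
    my $seen = '';
    "abc" =~ m|(b)|;            # succeeds, $1 is "b"
    if ("abc" =~ m|(z)|) {      # fails, so $1 is NOT reset
        $seen = "new: $1";
    } else {
        $seen = "old: $1";      # still "b" from the earlier match
    }
    return $seen;
}

my ($key, $value) = extract_pair("name=steven");
my ($initial)     = $value =~ m|^(\w)|;   # second extraction, same block
print "$key $value $initial\n";           # name steven s
print stale_capture(), "\n";              # old: b
```

So it is usually safer to test whether the match succeeded (or take the captures from the list-context return value) than to read $1 unconditionally.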

If you mean the scope you get when the replacement side of a substitution is evaluated as code (i.e. s|(whatever)|do_funct($1)|eg): $1 isn't really needed after do_funct() is called, because its value has already been passed in as an argument (and with /g, $1 gets replaced on every match anyway). But even if you did need it afterwards, a regex run inside do_funct() won't change the caller's $1. That's because $1, $2, $3 etc. are dynamically scoped ("localized") to the enclosing block. Roughly speaking, perl remembers the current capture values when a block or subroutine is entered and restores them when it exits, so the calling code can keep using its own values. It's a strange system! I've only ever had one use for making localized variables myself.

#!/usr/bin/perl
$in = "Hello, World...\n";
$in =~ s|(Hello)|get($1)|e;
print "after sub: ".$1."\n";
sub get{
  my($t) = $_[0];
  $t =~ m|(H)|;
  print "in sub: ".$1."\n";
}
 
[B]OUTPUT:[/B]
[I]in sub: H[/I]
[I]after sub: Hello[/I]

or, to better show the localization:

#!/usr/bin/perl
$in = "Hello, World...\n";
$in =~ s|(Hello).*(World)|get($1)|e;
print "after sub: (\$1) ".$1."\n";
print "after sub: (\$2) ".$2."\n";
sub get{
  my($t) = $_[0];
  print "in sub: (\$2) ".$2."\n";
  $t =~ m|(H)|;
  print "in sub: (\$1) ".$1."\n";
  print "in sub: (\$2) ".$2."\n";
}
[B]OUTPUT:[/B]
[I]in sub: ($2) World[/I]
[I]in sub: ($1) H[/I]
[I]in sub: ($2) [/I]
[I]after sub: ($1) Hello[/I]
[I]after sub: ($2) World[/I]

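Notice that $2 is printed twice in the get subroutine. The first print is deliberately placed before the regex runs, so it still shows the caller's value ("World"). The second comes after the successful match inside get, which has only one capture group, so $2 is now empty. Back in the caller, both $1 and $2 have their original values again.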
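
The same restoring happens for any block, not just subroutine calls. Here's a small sketch with a bare block and an eval block; captures_around_block() is just a name I made up for the demo:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# captures_around_block() is a made-up name for illustration.
# It records what $1 holds before, inside, and after nested blocks.
sub captures_around_block {
    my @seen;
    "Hello, World" =~ m|(Hello)|;
    push @seen, $1;                 # "Hello"
    {
        "Hello, World" =~ m|(World)|;
        push @seen, $1;             # "World" inside the bare block
    }
    push @seen, $1;                 # "Hello" again once the block ends
    eval {
        "Hello, World" =~ m|(W)|;
        push @seen, $1;             # "W" inside the eval block
    };
    push @seen, $1;                 # still "Hello" after the eval
    return @seen;
}

print join(", ", captures_around_block()), "\n";
```

So if you need a capture from an inner block later on, copy it into a variable declared outside that block before the block ends.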

hope that helps somewhat o_O

It sounds like maybe you are using a loop to process more than one line of data in the block. Post the block of code in question.

Actually (I should have said this at the start), the regexes were inside an eval so I'm not surprised that strange things were happening. At the time it appeared that the matches extracted from one regex were not being overwritten by those from the next match. However I could be mistaken about this. Anyway, I've found a better way of doing what I was trying to and there's not an eval in sight. It works as well :p . Thanks for the advice.

Steven.
