Aug 1st, 2009

script cancels with message but no illegal operation

The following code exits immediately and produces the message below when the search pattern "\b\w+{3}\b" is entered:

  1. #!/usr/bin/perl -w
  2. #matchtest1.pl
  3. use strict;
  4.  
  5. my($pattern);
  6. my($true);
  7. $_ = '1: A silly sentence (495,a) *BUT* one which will be useful. (3)';
  8.  
  9. do {
  10. print "Enter a regular expression: ";
  11. chomp($pattern = <STDIN>);
  12.  
  13. if (/($pattern)/g){ # Search pattern must be enclosed in ()'s
  14. print "$pattern found in $_\n";
  15. print "\$& = $&\n";
  16. print "\$1 is '$1'\n" if defined $1;
  17. print "\$2 is '$2'\n" if defined $2;
  18. print "\$3 is '$3'\n" if defined $3;
  19. print "\$4 is '$4'\n" if defined $4;
  20. print "\$5 is '$5'\n" if defined $5;
  21. }
  22. else{
  23. print "$pattern NOT found\n";
  24. }
  25. print "Enter 1 to continue, 0 to exit -> "; chomp($true=<STDIN>);
  26. }while($true)

(Yes, there is a nested quantifier in the search pattern "\b\w+{3}\b", but why does this cause the code to abort?)

Nested quantifiers in regex; marked by <-- HERE in m/(\b\w+{ <-- HERE 3}\b)/ at matchtest1.pl line 13, <STDIN> line 1.


Bonus Question!!!
Although the global modifier is used (/($pattern)/g), patterns with multiple matches only return the first match in $1. Why?
Last edited by perlfan; Aug 1st, 2009 at 5:58 pm.
Aug 1st, 2009

Re: script cancels with message but no illegal operation

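It aborts because Perl does not allow nested quantifiers. In \w+{3} the {3} is applied directly to the + quantifier, so the pattern fails to compile, and a regex compilation error is fatal unless it is trapped.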
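One way to keep the loop alive after a bad pattern is to compile the input inside an eval block before matching. What follows is only a minimal sketch, not part of the original script: the helper name compile_pattern is hypothetical, and \b\w{3}\b is just one guess at what the pattern was meant to say (a group can be quantified legally as (?:\w+){3}).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: try to compile a user-supplied pattern.
# Returns (compiled regex, undef) on success or (undef, error text) on failure.
sub compile_pattern {
    my ($pattern) = @_;
    my $re = eval { qr/($pattern)/ };   # a compile error dies; eval traps it
    return ($re, undef) if defined $re;
    return (undef, $@);
}

my $text = '1: A silly sentence (495,a) *BUT* one which will be useful. (3)';

# The first pattern is the one from the question; the second is an assumed fix.
for my $pattern ('\b\w+{3}\b', '\b\w{3}\b') {
    my ($re, $err) = compile_pattern($pattern);
    if (!defined $re) {
        print "Invalid pattern '$pattern': $err";
        next;
    }
    if ($text =~ $re) {
        print "'$pattern' matched '$1'\n";
    }
    else {
        print "'$pattern' NOT found\n";
    }
}
```

In the original script the same idea would mean wrapping the match on line 13 (or a qr// compile just before it) in eval and printing $@ instead of letting the script die.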

Your expectation of what the "g" modifier does is also wrong. Here is an excerpt from perlretut:

Global matching

The final two modifiers //g and //c concern multiple matches. The modifier //g stands for global matching and allows the matching operator to match within a string as many times as possible. In scalar context, successive invocations against a string will have //g jump from match to match, keeping track of position in the string as it goes along. You can get or set the position with the pos() function.

The use of //g is shown in the following example. Suppose we have a string that consists of words separated by spaces. If we know how many words there are in advance, we could extract the words using groupings:

    $x = "cat dog house"; # 3 words
    $x =~ /^\s*(\w+)\s+(\w+)\s+(\w+)\s*$/; # matches,
    # $1 = 'cat'
    # $2 = 'dog'
    # $3 = 'house'

But what if we had an indeterminate number of words? This is the sort of task //g was made for. To extract all words, form the simple regexp (\w+) and loop over all matches with /(\w+)/g:

    while ($x =~ /(\w+)/g) {
        print "Word is $1, ends at position ", pos $x, "\n";
    }

prints

    Word is cat, ends at position 3
    Word is dog, ends at position 7
    Word is house, ends at position 13

A failed match or changing the target string resets the position. If you don't want the position reset after failure to match, add the //c modifier, as in /regexp/gc. The current position in the string is associated with the string, not the regexp. This means that different strings have different positions and their respective positions can be set or read independently.
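As an aside that is not part of the perlretut excerpt, the reset behaviour can be checked with a short sketch like the one below. The variable names are made up for illustration; each match is assigned to a scalar so that it clearly runs in scalar context.

```perl
use strict;
use warnings;

my $x = "cat dog house";

my $ok = $x =~ /\w+/g;          # matches "cat"
my $after_match = pos($x);      # position just past "cat"

$ok = $x =~ /\d+/g;             # no digits: the match fails and resets the position
my $after_fail = pos($x);       # undef after the reset

$ok = $x =~ /\w+/g;             # starts over and matches "cat" again
$ok = $x =~ /\d+/gc;            # fails, but //c keeps the position
my $after_fail_c = pos($x);

$ok = $x =~ /(\w+)/g;           # continues from the kept position
my $next_word = $1;

print "after match:       $after_match\n";
print "after failed //g:  ", (defined $after_fail ? $after_fail : 'undef'), "\n";
print "after failed //gc: $after_fail_c\n";
print "next word:         $next_word\n";
```

Because the position lives on the string, running the same matches against a second string would not disturb pos($x).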

In list context, //g returns a list of matched groupings, or if there are no groupings, a list of matches to the whole regexp. So if we wanted just the words, we could use

    @words = ($x =~ /(\w+)/g); # matches,
    # $words[0] = 'cat'
    # $words[1] = 'dog'
    # $words[2] = 'house'
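Applied to the script in the question, this would explain the bonus question: the if on line 13 evaluates the match once in scalar context, so only one match is seen per pass through the loop. A rough sketch of both ways to get every match, assuming \d+ as the pattern typed in (that sample input and the @all and @seen arrays are illustrative, not from the original script):

```perl
use strict;
use warnings;

my $text    = '1: A silly sentence (495,a) *BUT* one which will be useful. (3)';
my $pattern = '\d+';    # assumed sample input; any valid pattern would do

# List context: returns every match of the single capture group, in order.
my @all = ($text =~ /($pattern)/g);
print "all matches: @all\n";

# Scalar context in a loop: one match per iteration, with pos() advancing.
my @seen;
while ($text =~ /($pattern)/g) {
    push @seen, "$1 (ends at position " . pos($text) . ")";
}
print "$_\n" for @seen;
```

The list form is handy when only the matched text is needed; the while form also gives access to pos() and the other match variables for each match.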
KevinADC
Aug 2nd, 2009

Re: script cancels with message but no illegal operation

Thanks so much for your time Kevin.
I looked up the CPAN docs and figured out where my thinking was taking a left turn. I'm back on track now; you really saved me hours of banging my head against the CRT. As usual, most problems with code are simple AFTER you see the answer!

Thank You!
perlfan

This thread is solved
