Jan 14th, 2006

Regexp

Hi,

I'm currently writing a routine to extract one-, two- and three-word phrases from a string, but for the two- and three-word phrases I'm not getting all of them. For example,

the string "blah blah blah" will show one occurrence of "blah blah" when really there are two.

My code is

while ($str =~ m/(\w+) (\w+)/g)
{

}

Any help would be appreciated.

Thanks
Bruce
thorne44
Jan 15th, 2006

Re: Regexp

That's not your code; it doesn't output anything at all.

While doing a global search, Perl continues just after the end of the previous match.
Rashakil Fol
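
To illustrate the point about a global search resuming after the previous match, here is a minimal sketch built around the pattern from the question. The sub name pairs_without_overlap is hypothetical and only there for the demonstration.

```perl
use strict;
use warnings;

# Collect the two-word phrases found by the pattern from the question.
# (pairs_without_overlap is a made-up name for this illustration.)
sub pairs_without_overlap {
    my ($str) = @_;
    my @pairs;
    # Each successful match consumes both words, so with /g the next
    # search starts after the second word and cannot reuse it.
    while ($str =~ m/(\w+) (\w+)/g) {
        push @pairs, "$1 $2";
    }
    return @pairs;
}

print join(', ', pairs_without_overlap('blah blah blah')), "\n";  # blah blah
print join(', ', pairs_without_overlap('This is a test')), "\n";  # This is, a test
```

In the first string the second "blah" has already been consumed by the first match, which is why only one "blah blah" is reported.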
Jan 15th, 2006

Re: Regexp

Quote originally posted by Rashakil Fol ...
That's not your code; it doesn't output anything at all.

While doing a global search, Perl continues just after the end of the previous match.

The regexp line is the only line that matters. The code inside just puts $1 and $2 into an array.

The full code is

while ($str =~ m/(\w+) (\w+)/g)
{
    $keywords{'I_'.$1.'_'.$2}{'Cnt'} += 2;
    $keywords{'I_'.$1.'_'.$2}{'Word'} = "$1 $2";
}

My problem is that Perl continues after the end of the last match, so with a string of "This is a test" I'll get
"This is" and "a test" but I won't get "is a", even though it's a valid phrase.

Thanks
Bruce
thorne44
Jan 16th, 2006

Re: Regexp

Then don't do it that way. Match only single words, and take the array of single words and work with that. If you're worried about in-between characters, and only want a single space between, do another match for contiguous strings of non-word characters, and then you're set to write some code that ties things together.
Rashakil Fol
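
The single-word approach suggested above could look roughly like the sketch below. The sub name phrases and its parameters are assumptions made for this example, and the sketch ignores what sits between the words (it does not check for a single space).

```perl
use strict;
use warnings;

# Return every run of $n consecutive words in $str, overlaps included.
# (phrases is a hypothetical helper, not code from the thread.)
sub phrases {
    my ($str, $n) = @_;
    # In list context, a /g match returns all captured single words.
    my @words = $str =~ /(\w+)/g;
    my @phrases;
    # Slide a window of $n words along the array; the range is empty
    # when there are fewer than $n words.
    for my $i (0 .. $#words - $n + 1) {
        push @phrases, join ' ', @words[$i .. $i + $n - 1];
    }
    return @phrases;
}

print join(', ', phrases('This is a test', 2)), "\n";  # This is, is a, a test
print join(', ', phrases('This is a test', 3)), "\n";  # This is a, is a test
```

Because the words are joined with a space regardless of the original separator, text such as "foo, bar" would also yield "foo bar". Checking the in-between characters, as the post mentions, would need an extra step.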
Jan 16th, 2006

Re: Regexp

Quote originally posted by Rashakil Fol ...
Then don't do it that way. Match only single words, and take the array of single words and work with that. If you're worried about in-between characters, and only want a single space between, do another match for contiguous strings of non-word characters, and then you're set to write some code that ties things together.

That's what I was going to do, but I was hoping that with a regexp I could do it more neatly and faster than by using a split array.

I'd expect something this simple to be easy to do with a regexp. I was hoping there was a flag or something I didn't know about that would let the next search start before the end of the last match.

Oh well back to the hard way :rolleyes:

Thanks
Bruce
thorne44
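
No flag does this, but one regexp-only possibility that was not raised in the thread is a lookahead. Text matched inside (?= ... ) is captured but not consumed, so the next search can begin at the following word. This is a sketch under that assumption; the sub name overlapping_pairs is hypothetical.

```perl
use strict;
use warnings;

# Collect overlapping two-word phrases separated by a single space.
# (overlapping_pairs is a made-up name for this illustration.)
sub overlapping_pairs {
    my ($str) = @_;
    my @pairs;
    # Only the first word is consumed; the space and the second word sit
    # inside a lookahead, so $2 is set but the match position stays at
    # the end of the first word.
    while ($str =~ m/\b(\w+)(?= (\w+))/g) {
        push @pairs, "$1 $2";
    }
    return @pairs;
}

print join(', ', overlapping_pairs('This is a test')), "\n";  # This is, is a, a test
print join(', ', overlapping_pairs('blah blah blah')), "\n";  # blah blah, blah blah
```

Three-word phrases would follow the same idea with a second captured word inside the lookahead, for example (?= (\w+) (\w+)).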
