Hello,

I am using the re (regular expression) module. The operation I need to perform is a search, and the syntax goes like this:

re.search(pattern, string, flags) #flags being optional

where 'pattern' is the string I am looking for in 'string'. My problem is that I want to look for the pattern and its subpatterns in 'string'.
For example, if the pattern I am searching for is

"abcd_asdf" and the string in which I am searching is " aaslkdfn_abcd_asdf_asgd"

I also want to search for "abcd", "asdf" and "_" individually. The way I understand it, I can only accomplish this if I explicitly mention each pattern. But if the pattern is specified like

m = "abcd_asdf" then I am not able to do the above-mentioned task.

I hope I am not confusing anyone. Just to clarify a bit more, here is the list of patterns and strings. They are in CSV format and I am using the csv module as well.

Any help will be appreciated.

Thank you

Actual PATTERNS
---------------
ACT EVDO
Sand Trap EVDO
Dump EVDO
Sunset EVDO
South Valley EVDO

Actual STRINGS
--------------
NM4_ACT
NM4_SAND
NM4_DUMP
NM4_SUNSET
NM4_SOUTH_VALLEY

You can capture substrings by putting parentheses () around parts of your pattern; for example:

>>> import re
>>> tests = [ 'NM4_ACT', 'NM4_SAND', 'NM3_BOO', 'NM5_FOOBAR', 'NM4_DUMP', 'NM4_SUNSET', 'NM4_SOUTH_VALLEY' ]
>>> for each in tests:
...     ret = re.search('^NM4_(.*)$', each)
...     if ret:
...         ret.groups()
...     
('ACT',)
('SAND',)
('DUMP',)
('SUNSET',)
('SOUTH_VALLEY',)
>>>

As you can see, by calling the groups() or group() method on the match object you can extract the substrings captured by the parentheses in your pattern once a match is found. HTH

In the example you gave, do you want 'NM4_SAND' to be matched by the 'Sand Trap EVDO' pattern?

If so you need to end up with a command like:

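import re

subject = "NM4_SAND"  # one of the strings from your list
if re.search(r"SAND|TRAP|EVDO", subject):
    print("match")     # Successful match
else:
    print("no match")  # Match attempt failed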

So you should be able to iterate through your pattern list, making each pattern uppercase and replacing the spaces with vertical bars (|, which means 'OR' in a regular expression).
Then you could take each converted pattern and iterate through your subject list to find matches.
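
Here is a rough sketch of that loop, using the patterns and strings from your post. The helper names to_alternation and find_matches are just made up for illustration, and I am assuming a plain substring match on each word is what you want (note that a short word like ACT would also match inside a longer name such as FACTORY).

```python
import re

patterns = ["ACT EVDO", "Sand Trap EVDO", "Dump EVDO",
            "Sunset EVDO", "South Valley EVDO"]
subjects = ["NM4_ACT", "NM4_SAND", "NM4_DUMP",
            "NM4_SUNSET", "NM4_SOUTH_VALLEY"]

def to_alternation(pattern):
    """Turn 'Sand Trap EVDO' into the regex 'SAND|TRAP|EVDO' (hypothetical helper)."""
    words = pattern.upper().split()
    # re.escape guards against words that contain regex metacharacters.
    return "|".join(re.escape(word) for word in words)

def find_matches(patterns, subjects):
    """Map each original pattern to the subjects containing any of its words."""
    matches = {}
    for pattern in patterns:
        regex = re.compile(to_alternation(pattern))
        matches[pattern] = [s for s in subjects if regex.search(s)]
    return matches

results = find_matches(patterns, subjects)
for pattern, hits in results.items():
    print(pattern, "->", hits)
```

With the lists above, each pattern ends up paired with exactly one string, for example 'Sand Trap EVDO' with 'NM4_SAND'. If the lists come from the csv module you would fill patterns and subjects from the reader rows instead of hard-coding them.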

"abcd_asdf" and the string in which I am searching is " aaslkdfn_abcd_asdf_asgd"

I also want to search for "abcd" and "asdf" and "_" individually.

Here's another example:

>>> test1 = 'aaslkdfn_abcd_asdf_asgd'
>>> ret = re.search('.*_(abcd)(_)(asdf)_.*', test1)
>>> ret.group(0)
'aaslkdfn_abcd_asdf_asgd'
>>> ret.group(1)
'abcd'
>>> ret.group(2)
'_'
>>> ret.group(3)
'asdf'
>>> ret.groups()
('abcd', '_', 'asdf')
>>>
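
If the pattern only exists as a single string such as m = "abcd_asdf", one possible approach is to build the grouped pattern from it instead of typing the parentheses by hand. This is only a sketch: grouped_pattern is a name I made up, and it assumes the pieces are always separated by "_".

```python
import re

def grouped_pattern(pattern, sep="_"):
    """Wrap each piece of `pattern`, and each separator, in its own group."""
    # The capturing group in the split regex keeps the separator in the
    # result: 'abcd_asdf' -> ['abcd', '_', 'asdf']
    parts = re.split("(" + re.escape(sep) + ")", pattern)
    # Skip empty pieces (e.g. from a leading or trailing separator).
    return "".join("(" + re.escape(part) + ")" for part in parts if part)

m = "abcd_asdf"
regex = grouped_pattern(m)
ret = re.search(regex, "aaslkdfn_abcd_asdf_asgd")

print(regex)
print(ret.group(0))
print(ret.groups())
```

Here ret.group(0) is the whole 'abcd_asdf' match and ret.groups() gives the pieces ('abcd', '_', 'asdf'), the same as in the hand-written example above.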