
Getting the start / end of string in regex through match objects

Hi everybody,
I want to get the start and end of all the patterns matched by a regex. I know I can get them with the start() and end() methods of match objects, but re.search() returns only the match object for the first match in the string. I want all the match objects in that string.

Here is the string :

tmplstr = """
${name}

${list: parentlst}
an element ${elem: parentlst}
${/list: parentlst}

${list: childlst}
an element ${elem: childlst}
${/list: childlst}
"""

Here is the regex script:

# Compile List Patterns
# Start of List
lstpattern_st = r"(\$\{list: ([a-z]*[0-9 ]*)\})"
lstpat_st = re.compile(lstpattern_st)

# End of List
lstpattern_end = r"(\$\{/list: ([a-z]*[0-9 ]*)\})"
lstpat_e = re.compile(lstpattern_end, re.I)


matchgrp_st = lstpat_st.search(tmplstr)
strt = matchgrp_st.start()
print strt
matchgrp_e = lstpat_e.search(tmplstr)
end = matchgrp_e.end()
print end
print self.tmplstr[strt:end]


Note: There are no spaces after the colon in ${list:...}. I added them only to avoid smilies.

I want all the start and end indices in the string, but re.search() returns only the first match. re.match() won't work either, because it only matches at the beginning of the string.

Can anyone help me get the start and end indices of all the matches, or suggest another solution?
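
The behaviour described in the question can be reproduced with a short sketch. It assumes Python 3 syntax, and the sample string is a shortened, hypothetical stand-in for the template above, written without the anti-smiley spaces:

```python
import re

# Shortened stand-in for the template in the question (hypothetical sample).
text = "${name} ${list:a} x ${/list:a} ${list:b} y ${/list:b}"
start_tag = re.compile(r"\$\{list:\s*([a-z]*[0-9 ]*)\}")

# match() only tries at position 0, where the string begins with "${name}".
at_beginning = start_tag.match(text)

# search() scans forward but stops at the first hit.
first_hit = start_tag.search(text)

print(at_beginning)        # None: no list tag at the very start
print(first_hit.span())    # span of the first list tag only
```

Neither call reports the second ${list:...} tag, which is the limitation the question is about.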

ankit_rastogi82
Newbie Poster
7 posts since Aug 2005
Reputation Points: 14
Solved Threads: 0
 

Hmm, I got some goofy smiley faces in your code, so I reworked it a little. Is this still correct?

"""
I want all the start and end indices of the string but re.search() returns the first
regex met in the string. re.match() also wont work because it search in the begining.

Can anyone help me in getting the start and end indices of all. OR can provide any
other solution instead of this
"""

tmplstr = """
${name}

${list:parentlst}
an element ${elem:parentlst}
${/list:parentlst}

${list:childlst}
an element ${elem:childlst}
${/list:childlst}
"""

import re

# Compile List Patterns
# Start of List
lstpattern_st = r"(\$\{list:([a-z]*[0-9 ]*)\})"  # had unbalanced ()
lstpat_st = re.compile(lstpattern_st)

# End of List
lstpattern_end = r"(\$\{/list:([a-z]*[0-9 ]*)\})"
lstpat_e = re.compile(lstpattern_end, re.I)


matchgrp_st = lstpat_st.search(tmplstr)
strt = matchgrp_st.start()
print 'start =', strt
matchgrp_e = lstpat_e.search(tmplstr)
end = matchgrp_e.end()
print 'end =', end
print 'tmplstr[%d:%d] =' % (strt, end)
print tmplstr[strt:end]  # removed self.
vegaseat
DaniWeb's Hypocrite
Moderator
5,989 posts since Oct 2004
Reputation Points: 1,345
Solved Threads: 1,417
 

Hi vega,
To get all the match objects in the string, I found finditer(), and it gave me the result I wanted. Thanks for replying. Here is the script snippet:

# \s* allows the optional space after the colon
startpattern = re.compile(r"(\$\{list:\s*([a-z]*[0-9 ]*)\})")

# Get all the match objects of ${list: listname} in the template
# (self.tmplstr is the template string held by the enclosing class)
startgrps = startpattern.finditer(self.tmplstr)

# Store the start/end indices of all start-list placeholders: ${list: listname}
startindexofStart = []
endindexofStart = []
for grp in startgrps:
    startindexofStart.append(grp.start())
    endindexofStart.append(grp.end())
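
For readers who want something they can run on its own, here is a minimal sketch of the same finditer() idea, extended to pair each opening tag with its closing tag. It assumes Python 3, the colon form of the tags without the anti-smiley spaces, and that each list name appears only once in the template. The helper name list_blocks is hypothetical and not part of the original script.

```python
import re

tmplstr = """
${name}

${list:parentlst}
an element ${elem:parentlst}
${/list:parentlst}

${list:childlst}
an element ${elem:childlst}
${/list:childlst}
"""

# \s* tolerates an optional space after the colon.
start_pat = re.compile(r"\$\{list:\s*([a-z]*[0-9 ]*)\}")
end_pat = re.compile(r"\$\{/list:\s*([a-z]*[0-9 ]*)\}")


def list_blocks(template):
    """Return {list name: text between its opening and closing tag}."""
    # Map each closing tag's name to the index where that tag begins.
    closers = {m.group(1): m.start() for m in end_pat.finditer(template)}
    blocks = {}
    for m in start_pat.finditer(template):
        name = m.group(1)
        if name in closers:
            # Slice from the end of the opening tag to the start of the closer.
            blocks[name] = template[m.end():closers[name]]
    return blocks


blocks = list_blocks(tmplstr)
for name, body in blocks.items():
    print(name, repr(body))
```

Keying the closing tags by name keeps the pairing simple, but it would need rework if the same list name could occur more than once.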
ankit_rastogi82
 

This question has already been solved
