Hi, I have a file whose lines have no fixed format. From each line I would like to extract the value of ID=xxxx. The ID can appear at the beginning, in the middle, or at the end of the line, sometimes inside brackets, and it can be repeated several times in the same line (always with the same value).

xxxxx(xxxxx);xxx=xxxx,ID=1234-xxxxxxx
(ID=4321),xxxxxxx/xxxxxxx-xxxxxxxxxxx
(ID=3802))(xxxxxx=(xxxxxx=xxx)(xxxxx=xxxxxxx)(ID=3802)))

I am not sure how to get the value for the ID.

Please advise!


Like this?

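# Print each line, then the four characters after every 'ID=' in it.
with open('ids.txt') as f_in:
    for line in f_in:
        print(line.rstrip('\n'))
        pos = 0
        for run in range(line.count('ID=')):
            # Move to just past the next 'ID='.
            pos = line.index('ID=', pos) + 3
            # Assumes the ID is exactly four characters long.
            print(line[pos:pos + 4])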
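
The slice above assumes every ID is exactly four characters long. If the length can vary, a rough sketch (assuming the ID is a run of digits; `extract_ids` is just a name made up for this example) would be to walk forward from each 'ID=' until the digits stop:

```python
def extract_ids(line):
    """Return the digit run that follows each 'ID=' in line (hypothetical helper)."""
    ids = []
    pos = line.find('ID=')
    while pos != -1:
        start = pos + 3  # skip past 'ID='
        end = start
        # Assumption: the ID consists of digits only.
        while end < len(line) and line[end].isdigit():
            end += 1
        ids.append(line[start:end])
        pos = line.find('ID=', end)
    return ids


print(extract_ids('(ID=3802))(xxxxxx=(xxxxxx=xxx)(xxxxx=xxxxxxx)(ID=3802)))'))
# --> ['3802', '3802']
```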

Cheers and Happy coding

This works if the ID is always a number with no - inside it.
If not, change the regex.

import re

text = '''\
xxxxx(xxxxx);xxx=xxxx,ID=1234-xxxxxxx
(ID=4321),xxxxxxx/xxxxxxx-xxxxxxxxxxx
(ID=3802))(xxxxxx=(xxxxxx=xxx)(xxxxx=xxxxxxx)(ID=3802)))
'''

out_match = re.findall(r'ID=\d+', text)
print(out_match)
# --> ['ID=1234', 'ID=4321', 'ID=3802', 'ID=3802']
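
If you only want the value itself rather than 'ID=1234', a capture group should do it, and since the ID can repeat within a line you can keep just the first hit per line. A small sketch, assuming numeric IDs (`ID_PATTERN` and `first_id` are hypothetical names, not something from the posts above):

```python
import re

# Group 1 is the value after 'ID='. Assumption: the ID is digits only.
ID_PATTERN = re.compile(r'ID=(\d+)')


def first_id(line):
    """Return the first ID value found in line, or None if there is none."""
    match = ID_PATTERN.search(line)
    return match.group(1) if match else None


text = '''\
xxxxx(xxxxx);xxx=xxxx,ID=1234-xxxxxxx
(ID=4321),xxxxxxx/xxxxxxx-xxxxxxxxxxx
(ID=3802))(xxxxxx=(xxxxxx=xxx)(xxxxx=xxxxxxx)(ID=3802)))
'''

ids = [first_id(line) for line in text.splitlines()]
print(ids)
# --> ['1234', '4321', '3802']
```

If the IDs can also contain letters, something like `ID=(\w+)` might be a starting point, but that depends on which characters can end an ID in your data.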