Hi, I have a file whose lines do not follow a consistent format. From that file I would like to extract the value of ID=xxxx, which can appear at the beginning, in the middle, or at the end of a line, sometimes inside brackets, and may be repeated several times in the same line (the value will be the same).

xxxxx(xxxxx);xxx=xxxx,ID=1234-xxxxxxx
(ID=4321),xxxxxxx/xxxxxxx-xxxxxxxxxxx
(ID=3802))(xxxxxx=(xxxxxx=xxx)(xxxxx=xxxxxxx)(ID=3802)))

I am not sure how to get the value for the ID.

Please advise!

Like this?

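# "with" closes the file even if an error occurs while reading.
with open('ids.txt') as f_in:
    for line in f_in:
        print(line.rstrip('\n'))
        pos = 0
        # Visit every occurrence of 'ID=' in the line.
        for run in range(line.count('ID=')):
            pos = line.index('ID=', pos) + 3
            # Assumes the ID is exactly four characters long.
            print(line[pos:pos + 4])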
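
The slice above takes exactly four characters after 'ID=', so it only works for four-character IDs. Here is a rough sketch that instead reads digits until the first non-digit character. It assumes the IDs are purely numeric, as in the sample lines, and the helper name ids_in_line is just made up for illustration.

```python
def ids_in_line(line, marker='ID='):
    """Return the run of digits that follows each occurrence of marker."""
    found = []
    pos = line.find(marker)
    while pos != -1:
        start = pos + len(marker)
        end = start
        # Advance until the first character that is not a digit.
        while end < len(line) and line[end].isdigit():
            end += 1
        found.append(line[start:end])
        # Look for the next marker after the ID just read.
        pos = line.find(marker, end)
    return found

print(ids_in_line('(ID=3802))(xxxxxx=(xxxxxx=xxx)(xxxxx=xxxxxxx)(ID=3802)))'))
# prints ['3802', '3802']
```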

Cheers and Happy coding

This works if the ID is always a number with no '-' inside it.
If not, change the regex.

import re

text = '''\
xxxxx(xxxxx);xxx=xxxx,ID=1234-xxxxxxx
(ID=4321),xxxxxxx/xxxxxxx-xxxxxxxxxxx
(ID=3802))(xxxxxx=(xxxxxx=xxx)(xxxxx=xxxxxxx)(ID=3802)))
'''

out_match = re.findall(r'ID=\d+', text)
print(out_match)
#-->['ID=1234', 'ID=4321', 'ID=3802', 'ID=3802']
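
Since the question asks for the value only, and says a repeated ID on one line always has the same value, the following may be closer to what is wanted. It is a sketch under two assumptions: the IDs are digits only, and one result per input line is desired. The names ID_PATTERN and unique_ids_per_line are made up for this example. The parentheses in the pattern form a capture group, so re.findall returns just the digits without the 'ID=' prefix.

```python
import re

# The capture group (\d+) makes findall return only the digits.
ID_PATTERN = re.compile(r'ID=(\d+)')

def unique_ids_per_line(text):
    """Return, for each line, the distinct ID values in order of appearance."""
    result = []
    for line in text.splitlines():
        seen = []
        for value in ID_PATTERN.findall(line):
            if value not in seen:
                seen.append(value)
        result.append(seen)
    return result

sample = '''\
xxxxx(xxxxx);xxx=xxxx,ID=1234-xxxxxxx
(ID=4321),xxxxxxx/xxxxxxx-xxxxxxxxxxx
(ID=3802))(xxxxxx=(xxxxxx=xxx)(xxxxx=xxxxxxx)(ID=3802)))
'''

print(unique_ids_per_line(sample))
# prints [['1234'], ['4321'], ['3802']]
```

If the IDs can contain letters as well, a wider character class such as [A-Za-z0-9]+ in place of \d+ would be one way to adapt the pattern.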