I want to extract the text between 'alpha' and 'end', and then the text between 'bravo' and 'end'. My file has quite a few of these unique keywords, so I keep them in a list and use a counter to go through them. See the code below:
string = 'alpha 111 bravo 222 alpha somethingA end, 333 bravo somethingB end 444 alpha 555 bravo'
words = ['alpha', 'bravo'] #there will be more words here
counter = 0
stringOut = ''
#going through the list of words
while counter < len(words):
    firstWord = words[counter]
    lastWord = 'end'
    data = string[string.find(firstWord)+len(firstWord):string.find(lastWord)].strip()
    #this will give the text between the first occurrence of "alpha" and "end"
    #since I want just the smallest string between "alpha" and "end", I use another while loop
    #to see if firstWord occurs again
    while firstWord in data:
        ignore,ignore2,data = data.partition(str(firstWord))
    counter = counter + 1
    stringOut += str(data) + str('\n')
print('output string is \n' + str(stringOut))
#this code gives the correct output for the text between the first word ("alpha") and "end".
#but when the list moves to the next string "bravo", it takes the text between the first "bravo"
#and the "end" that was associated with the information required for "alpha" ("somethingA")
Can anyone help me with this please? Any suggestions are welcome.
This code would fail if words like 'bend' or 'send' appear inside the data, because it splits on every occurrence of 'end', but it could give you an idea.
t = "alpha 111 bravo 222 alpha somethingA end, 333 bravo somethingB end 444 alpha 555 bravo"
keywords = 'alpha', 'bravo'
if 'end' in t:
    for part in t.split('end')[:-1]:
        last, key = max((part.rfind(key)+len(key), key) for key in keywords)
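If words like 'bend' or 'send' are a concern, one possible alternative is a regular expression with word boundaries. The following is only a minimal sketch, not the approach used above: the helper name extract_between and its end_word parameter are made up for illustration, and it assumes every keyword starts and ends with a letter, digit, or underscore so that \b behaves as expected. It matches a keyword, then the shortest run of text that contains no other keyword, then 'end' as a whole word.

```python
import re

def extract_between(text, keywords, end_word="end"):
    """Hypothetical helper: return (keyword, text) pairs for each
    keyword ... end_word span that contains no other keyword."""
    # Alternation of the escaped keywords, e.g. "alpha|bravo".
    keys = "|".join(re.escape(k) for k in keywords)
    pattern = re.compile(
        rf"\b({keys})\b"                 # a keyword as a whole word
        rf"((?:(?!\b(?:{keys})\b).)*?)"  # shortest text with no keyword inside
        rf"\b{re.escape(end_word)}\b",   # the end marker as a whole word
        re.DOTALL,                       # let the text span several lines
    )
    return [(m.group(1), m.group(2).strip()) for m in pattern.finditer(text)]

t = "alpha 111 bravo 222 alpha somethingA end, 333 bravo somethingB end 444 alpha 555 bravo"
print(extract_between(t, ["alpha", "bravo"]))
# [('alpha', 'somethingA'), ('bravo', 'somethingB')]
```

Because 'end' is only matched as a whole word, data such as "alpha please send this end" yields 'please send this' instead of being cut off at 'send'.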