Hello everyone. I am working on some code that takes a string and splits it using the re module. The string contains words between "<" and ">" (sometimes several words). The program splits the string on those two characters with re.split, which returns a list containing the delimiters along with the rest of the text. After that, a for loop takes each piece, including the groups of words that were between the delimiters, and splits it on whitespace.
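
For context, here is a small sketch of why the delimiters end up in the list: the parentheses in the pattern form a capturing group, and re.split keeps whatever a capturing group matches. The variable names text, without_group, and with_group are only for illustration.

```python
import re

text = "<this is> <a> test"

# Without a capturing group, the "<" and ">" delimiters are discarded.
without_group = re.split("<|>", text)

# With a capturing group, the matched delimiters are kept in the result.
with_group = re.split("(<|>)", text)

print(without_group)  # ['', 'this is', ' ', 'a', ' test']
print(with_group)     # ['', '<', 'this is', '>', ' ', '<', 'a', '>', ' test']
```

The empty string at the start appears because the input begins with a delimiter, so there is nothing in front of it.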

Here is my code:

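import re

def test():
    a = "<this is> <a> test"
    # The capturing group keeps the "<" and ">" delimiters in the result
    b = re.split("(<|>)", a)
    for item in b:
        c = item.split(' ')
        # Prints each small list separately, all on one line
        print(c, end=' ')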

#Output: [''] ['<'] ['this', 'is'] ['>'] ['', ''] ['<'] ['a'] ['>'] ['', 'test']

As you can see, I get a separate list for each item, but what I want is a single list as the output.

Like this:

['', '<', 'this', 'is', '>', '', '', '<', 'a', '>', '', 'test']

Thanks in advance.

All 2 Replies

I'm thinking you should just be "adding" the lists together instead of appending a list onto a list... if you know what I mean.

Like this:

import re

def test():
    a = "<this is> <a> test"
    b = re.split("(<|>)", a)
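    c = []  # one flat list that collects every word
    for item in b:
        # += extends c with the words instead of nesting a new list inside it
        c += item.split()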
    print(c)

if __name__ == '__main__':
    test()

""" My Output:
['<', 'this', 'is', '>', '<', 'a', '>', 'test']
"""

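Is that what you're looking for? Note that split() with no argument also drops the empty strings, so my output is shorter than the list you posted.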
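
If the empty strings from the expected output in the question are wanted, a minimal sketch along the same lines would split on a single space instead. The helper name split_keep_empty is made up for this example and is not from the thread.

```python
import re

def split_keep_empty(text):
    # Hypothetical helper name, used only for this illustration.
    pieces = re.split("(<|>)", text)
    result = []
    for piece in pieces:
        # split(' ') keeps empty strings, unlike split() with no argument,
        # and extend() adds the words to the one flat list.
        result.extend(piece.split(' '))
    return result

print(split_keep_empty("<this is> <a> test"))
# ['', '<', 'this', 'is', '>', '', '', '<', 'a', '>', '', 'test']
```

Whether the empty strings are useful depends on what happens to the list afterwards; if they are just noise, the split() version above is the simpler choice.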

Yeah, that fixed my problem. Thanks!
