I have a string like:
'par1=val1,par2=val2,par3="some text, again some text, again some text",par4="some text",par5=val5'

I need to split it into parts like:
par1=val1
par2=val2
par3="some text, again some text, again some text"
par4="some text"
par5=val5

I use this code:

a = 'par1=val1,par2=val2,par3="some text1, again some text2, again some text3",par4="some text",par5=val5'.split(',')
newList = []
for b in a:
    if '=' in b:
        # a piece containing '=' starts a new par=value item
        newList.append(b)
    else:
        # no '=': the comma was inside a quoted value, so rejoin with the previous item
        newList[-1] += ',' + b
print(newList)

I'm looking for a better solution. Can anybody suggest one?

Thank you in advance!

All 4 Replies

I could not parse it with csv, but if your string looks like Python code, you can use Python's own tokenizer:

# python 2 and 3
import sys
if sys.version_info < (3,):
    from cStringIO import StringIO
else:
    from io import StringIO
    xrange = range
from tokenize import generate_tokens


a = 'par1=val1,par2=val2,par3="some text1, again some text2, again some text3",par4="some text",par5=val5'

def parts(a):
    """Split a python-tokenizable expression on comma operators"""
    compos = [-1] # compos stores the positions of the relevant commas in the argument string
    compos.extend(t[2][1] for t in generate_tokens(StringIO(a).readline) if t[1] == ',')
    compos.append(len(a))
    return [ a[compos[i]+1:compos[i+1]] for i in xrange(len(compos)-1)]

print(parts(a))

""" my output -->
['par1=val1', 'par2=val2', 'par3="some text1, again some text2, again some text3"', 'par4="some text"', 'par5=val5']
"""
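
For anyone wondering about the csv remark above: as far as I can tell, the csv module only treats a quote as special when it is the first character of a field, so a value like par3="..." is still cut at the inner commas. A minimal check, assuming the default dialect (the variable names are only for illustration):

```python
import csv

a = 'par1=val1,par2=val2,par3="some text1, again some text2, again some text3",par4="some text",par5=val5'

# csv.reader accepts any iterable of lines; a one-element list is enough here.
fields = next(csv.reader([a]))

# The quotes do not start their fields, so csv keeps them as literal
# characters and still splits on the commas between them.
print(len(fields))
for field in fields:
    print(repr(field))
```

The count printed is larger than the five items wanted, which is why a tokenizer or a hand-written scan is needed here.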

Another alternative is to use regular expressions.
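
A minimal sketch of the regular-expression idea, assuming values are quoted only with double quotes, contain no escaped quotes, and that empty fields do not matter. The names FIELD and split_fields are made up for this example:

```python
import re

# Assumed pattern: a field is one or more of either
#   - a character that is neither a comma nor a double quote, or
#   - a complete double-quoted chunk (which may contain commas).
FIELD = re.compile(r'(?:[^,"]|"[^"]*")+')

def split_fields(data):
    """Return the comma-separated fields, keeping quoted commas intact."""
    return FIELD.findall(data)

a = 'par1=val1,par2=val2,par3="some text1, again some text2, again some text3",par4="some text",par5=val5'
print(split_fields(a))
```

Because findall only returns non-empty matches, two commas in a row produce no empty item, so this is not a drop-in replacement for split(',') in every case.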

Thank you, Gribouillis. I'll use your snippet.
Thank you too, griswolf. Your link will solve my other problem ;)

Here is a version using a regex. It should work even if the data string contains newlines.

# python 2 and 3
import re
regex = re.compile(r"\\.|[\"',]", re.DOTALL)

def parts(data):
    delimiter = ''
    compos = [-1]
    for match in regex.finditer(data):
        g = match.group(0)
        if delimiter == '':
            if g == ',':
                compos.append(match.start())
            elif g in "\"'":
                delimiter = g
        elif g == delimiter:
            delimiter = ''
    # you may uncomment the next line to catch errors
    #if delimiter: raise ValueError("Unterminated string in data")
    compos.append(len(data))
    return [ data[compos[i]+1:compos[i+1]] for i in range(len(compos)-1)]

if __name__ == "__main__":
    a = 'par1=val1,par2=val2,par3="some text1, again some text2, again some text3",par4="some text",par5=val5'
    print(parts(a))

""" my output -->
['par1=val1', 'par2=val2', 'par3="some text1, again some text2, again some text3"', 'par4="some text"', 'par5=val5']
"""
commented: Great, thank you for your snippet. +1