String Methods

Question

El Duke 0 Junior Poster in Training

14 Years Ago

Hi all,

I have this dilemma: I receive strings in the form:

<Ranking: AA (John)>
<Ranking: CA (Peter)>
<Ranking: TA-A (Samantha)>

I want to extract only the ranking from each string (i.e. AA, CA, TA-A).

How can I do this using string methods like split or strip, if those can do it?

Notice that the leading characters up to the ":" are common to all of them (<Ranking: ), yet the ranking length and the name length may differ.

Thanks in Advance.

python

7 Contributors
28 Replies
133 Views
1 Week Discussion Span
Latest Post 14 Years Ago Latest Post by vegaseat

All 28 Replies

jlm699 320 Veteran Poster

14 Years Ago

Personally I'd use regular expressions like so:

>>> import re
>>> regex_compiled = re.compile(r'^<Ranking: (.*) \((.*)\)>$')
>>> input_data = """<Ranking: AA (John)>
... <Ranking: CA (Peter)>
... <Ranking: TA-A (Samantha)>
... """
>>> for each_entry in input_data.split('\n'):
...     regex_match = regex_compiled.match(each_entry)
...     if regex_match:
...         print 'Ranking: % 5s  Name: %s' % (regex_match.group(1), regex_match.group(2))
...     
Ranking:    AA  Name: John
Ranking:    CA  Name: Peter
Ranking:  TA-A  Name: Samantha
>>>

If you need any specific part of that explained I'd be happy to do so.

If you're really in need of using string methods I'd do something like the following:

>>> for each_entry in input_data.split('\n'):
...     rank_search = '<Ranking: '
...     idx = each_entry.find(rank_search)
...     if idx != -1: # str.find returns -1 when not found
...         idx += len(rank_search) # Increase idx to where rank_search ends
...         end_idx = each_entry.find(' (', idx) # idx as starting point for find
...         print 'Ranking: % 5s  Name: %s' % (each_entry[idx:end_idx], each_entry[end_idx+2:-2])
...     
Ranking:    AA  Name: John
Ranking:    CA  Name: Peter
Ranking:  TA-A  Name: Samantha
>>>

sneekula 969 Nearly a Posting Maven

14 Years Ago

If there is no space in the rank or name you can do it this simple way:

mydata = """\
<Ranking: AA (John)>
<Ranking: CA (Peter)>
<Ranking: TA-A (Samantha)>
"""

rank_list = []
for line in mydata.split('\n'):
    print line, line.split()  # testing
    if line:
        rank_list.append(line.split()[1])

print
print rank_list
    
"""my result -->
<Ranking: AA (John)> ['<Ranking:', 'AA', '(John)>']
<Ranking: CA (Peter)> ['<Ranking:', 'CA', '(Peter)>']
<Ranking: TA-A (Samantha)> ['<Ranking:', 'TA-A', '(Samantha)>']
 []

['AA', 'CA', 'TA-A']
"""

vegaseat 1,735 DaniWeb's Hypocrite

14 Years Ago

You could use something simple like this ...

mydata = """\
<Ranking: AA (John)>
<Ranking: CA (Peter)>
<Ranking: TA-A (Samantha)>
"""

for line in mydata.split('\n'):
    if line:
        head, rank, name = line.split()
        # keep this test inside "if line:" so a blank line cannot reuse stale values
        if 'Peter' in name:
            print "Name: Peter  Rank:", rank  # Name: Peter  Rank: CA

vegaseat 1,735 DaniWeb's Hypocrite

14 Years Ago

To illustrate the basics ...

mystr = "abcdefg"
print type(mystr)
print mystr
print mystr[-2:-1]

print '-'*10

mylist = ['fred', 'peter', 'gustav', 'george']
print type(mylist)
print mylist
print mylist[-2:-1]

"""my output -->
<type 'str'>
abcdefg
f
----------
<type 'list'>
['fred', 'peter', 'gustav', 'george']
['gustav']
"""

woooee 814 Nearly a Posting Maven

14 Years Ago

What type of objects does the query return? You may have to convert to a list to access it in this way.

query = db.GqlQuery("SELECT *" .....etc.
print type(query)
ctr = 0
for q in query:
    if ctr < 2:
        print type(q)
    ctr += 1
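To illustrate the "convert to a list" suggestion without a real datastore, here is a sketch in Python 3 syntax. The generator fake_query is a hypothetical stand-in for whatever iterable the query returns, and the field names are assumptions, not the actual GQL API.

```python
def fake_query():
    """Hypothetical stand-in for a query result: yields records one at a time."""
    rows = [("CC", "Peter"), ("AA", "Samantha"), ("AC", "John"), ("CD", "Smith")]
    for rank, name in rows:
        yield {"rank": rank, "name": name}


results = fake_query()
print(type(results))      # a generator, which does not support results[-2]

records = list(results)   # materialise into a list so indexing works
second_to_last = records[-2] if len(records) >= 2 else None
print(second_to_last["rank"])
```

Once the results are in a list, negative indexing works no matter how many records came back, as long as there are at least two.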


El Duke 0 Junior Poster in Training · Answer 1 · 2009-12-16T03:37:54+00:00

Thank you very much for the fast response. I will test it as soon as I get home and I'll report back here with a rep.

Thanks again

El Duke 0 Junior Poster in Training · Answer 2 · 2009-12-16T08:46:09+00:00

Thanks jlm699 and sneekula, both ways did exactly what I wanted.

El Duke 0 Junior Poster in Training · Answer 3 · 2009-12-17T21:40:05+00:00

I am sorry to re-post in a solved thread, but I have a quick question related to the problem above:

How can I index the records of a certain field? That is:

In the sample code you provided, how can I print the ranking before the last one? Example:

If I go through the records as you did, the rankings print as:

Ranking:    AA  Name: John
Ranking:    CA  Name: Peter
Ranking:  TA-A  Name: Samantha

Now, how can I get only the ranking CA (Peter's, the one before the last)?

El Duke 0 Junior Poster in Training · Answer 4 · 2009-12-18T03:34:20+00:00

Thank you.
But what if I don't know the name in this case (i.e. Peter) and just want to print the ranking before the last one from the given data table?

And can this approach be applied to class objects as well?

Thanks

baki100 -1 Junior Poster in Training · Answer 5 · 2009-12-18T16:27:11+00:00

instead of

if 'Peter' in name:

have

if 'CA' in rank:
    print 'Name: %s Rank: %s'%(name, rank)

El Duke 0 Junior Poster in Training · Answer 6 · 2009-12-18T21:05:34+00:00

If I don't know the name, how could I know the ranking ? :p

How can I print the ranking before the last, literally ?

Lardmeister 461 Posting Virtuoso · Answer 7 · 2009-12-19T00:32:04+00:00

It will be easier to modify the jlm699 code:

import re

regex_compiled = re.compile('^<Ranking: (.*) \((.*)\)>$')

input_data = """<Ranking: AA (John)>
<Ranking: CA (Peter)>
<Ranking: TA-A (Samantha)>
"""

data_list = input_data.split('\n')
# remove any empty list item
data_list.remove("")
# slice into second to last item
data_list = data_list[len(data_list)-2:-1]
print data_list  # see what you got

# left the for loop in so you can look at larger slices
for each_entry in data_list:
    regex_match = regex_compiled.match(each_entry)
    if regex_match:
        sf = 'Ranking: % 5s  Name: %s'
        print sf % (regex_match.group(1), regex_match.group(2))

''' total output -->
['<Ranking: CA (Peter)>']
Ranking:    CA  Name: Peter
'''

El Duke 0 Junior Poster in Training · Answer 8 · 2009-12-19T04:04:20+00:00

Thanks Lard.

But this cannot be applied to tables as well, can it?
I ask because I got this error:

data_list.remove("")
ValueError: list.remove(x): x not in list

Or is there another way to handle data in tables, for example if we have the same data in the form of a table:

ID || rank || Name
------------------------------
1 || CC || Peter
2 || AA || Samantha
3 || AC || John
4 || CD || Smith

And I want to do the same here (printing or fetching the ranking of the record before the last).

vegaseat 1,735 DaniWeb's Hypocrite Team Colleague · Answer 11 · 2009-12-19T20:36:38+00:00

The way the test input_data is written it will introduce an empty list item when you split it. So change it to ...

input_data = """<Ranking: AA (John)>
<Ranking: CA (Peter)>
<Ranking: TA-A (Samantha)>"""

... and remove the line data_list.remove("")

Or you can use this ...

# remove any empty list item
if "" in data_list:
    data_list.remove("")

El Duke 0 Junior Poster in Training · Answer 12 · 2009-12-20T03:08:58+00:00

What about my question in the last reply, about the table ?

Lardmeister 461 Posting Virtuoso · Answer 13 · 2009-12-20T05:19:24+00:00

For nice looking tables you can use the % string formatting.

El Duke 0 Junior Poster in Training · Answer 14 · 2009-12-21T02:44:24+00:00

For nice looking tables you can use the % string formatting.

Thanks but my question wasn't about how to create a table, it was how to fetch from the table, fetch a value in the record before the last.

vegaseat 1,735 DaniWeb's Hypocrite Team Colleague · Answer 16 · 2009-12-21T19:22:35+00:00

Thanks but my question wasn't about how to create a table, it was how to fetch from the table, fetch a value in the record before the last.

You mean your data is a string in the form of a table?

El Duke 0 Junior Poster in Training · Answer 17 · 2009-12-21T19:37:04+00:00

You mean your data is a string in the form of a table?

Yes indeed, just like the example table I drew above.

vegaseat 1,735 DaniWeb's Hypocrite Team Colleague · Answer 18 · 2009-12-21T20:22:45+00:00

Well, you can do it like this ...

mystr = """\
ID || rank || Name
------------------------------
1 || CC || Peter
2 || AA || Samantha
3 || AC || John
4 || CD || Smith"""

mylist = mystr.split('\n')

# slice out the second to last item
print mylist[-2:-1]  # ['3 || AC || John']
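To get just the rank out of that second-to-last row, each line can be split again on the "||" separator. A minimal sketch in Python 3 syntax, assuming the first line is the header, the second is the dashed separator, and every data row has the same three fields:

```python
mystr = """\
ID || rank || Name
------------------------------
1 || CC || Peter
2 || AA || Samantha
3 || AC || John
4 || CD || Smith"""

lines = mystr.split('\n')
header = [field.strip() for field in lines[0].split('||')]

records = []
for line in lines[2:]:          # skip the header and the dashed separator
    if line.strip():
        fields = [field.strip() for field in line.split('||')]
        records.append(dict(zip(header, fields)))

second_to_last = records[-2]
print(second_to_last['rank'])
```

Storing each row as a dictionary keyed by the header names means the rank can be looked up by name rather than by column position.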

El Duke 0 Junior Poster in Training · Answer 19 · 2009-12-22T02:57:43+00:00

And how did you know that the wanted record is between 2 and the last? That's the point I mentioned earlier. Plus, you are treating the table as a list here; how about a real table?

vegaseat 1,735 DaniWeb's Hypocrite Team Colleague · Answer 20 · 2009-12-22T03:06:11+00:00

You keep asking about the second to last item. Slicing this way will always give you the second to last item no matter how many other items are before it.

Python does not have an object type called a table, you must be thinking about another language or maybe a database.

El Duke 0 Junior Poster in Training · Answer 21 · 2009-12-22T03:12:19+00:00

You keep asking about the second to last item.
Python does not have an object type called a table, you must be thinking about another language or maybe a database.

No, it is Python, used to fetch from GQL (Google's SQL-like language). Now this sounds confusing (to me as well): when I fetch the table using a GQL query and store the data in a variable, what would the type of the data be? A list? A string? I don't know, to be honest.

I have tried all the helpful samples you provided; however, printing the item before the last gives me one letter, and playing with the indexes gives the same.

El Duke 0 Junior Poster in Training · Answer 22 · 2009-12-23T03:12:55+00:00

Thanks a million guys. How did I miss the type command :-s

So far, as a test, I was able to check the type, length and contents of the list, and I got something quite confusing (for me):

When it gives me the length of the list, it is 42 for example, but when I counted "manually" it appears that 42 is the number of entries in the list (not characters); for example, it counted CC as 1, AB as another, and so on.

When I print the list I get something like:
AABBCDACADDACDDDBBCD

The item highlighted in red is the last item I added to the list (FILO). Now the question is: how can I always print the item in green (the one before the last)?

Thanks for your patience, you are really helpful.

vegaseat 1,735 DaniWeb's Hypocrite Team Colleague · Answer 23 · 2009-12-24T20:08:04+00:00

Again, you could use string slicing ...

s = "AABBCDACADDACDDDBBCD"

print s[2:4]  # BB
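If the position should be counted from the end rather than from the start, negative indices do that regardless of the string's length. A sketch in Python 3 syntax, assuming every code in the string is exactly two characters long (which would not hold for a rank like TA-A):

```python
s = "AABBCDACADDACDDDBBCD"

# cut the string into consecutive two-character codes
codes = [s[i:i + 2] for i in range(0, len(s), 2)]
print(codes)

# second-to-last code, counted from the end
print(codes[-2])

# the same piece taken directly from the string with a negative slice
print(s[-4:-2])
```

For this particular string both lookups give BB, which happens to match s[2:4] as well; with other data, the from-the-start and from-the-end versions would differ.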
