Hi all,

I have a dilemma. Suppose I receive strings of the form:

<Ranking: AA (John)>
<Ranking: CA (Peter)>
<Ranking: TA-A (Samantha)>

And I want to extract only the ranking from each string (i.e. AA, CA, TA-A).

How can I do this using string methods like split or strip, if those can do it?

Notice that the leading characters up to the ":" are common to all strings (<Ranking: ), yet the ranking length and the name length may differ.

Thanks in advance.

All 28 Replies

Personally I'd use regular expressions like so:

>>> import re
>>> regex_compiled = re.compile(r'^<Ranking: (.*) \((.*)\)>$')
>>> input_data = """<Ranking: AA (John)>
... <Ranking: CA (Peter)>
... <Ranking: TA-A (Samantha)>
... """
>>> for each_entry in input_data.split('\n'):
...     regex_match = regex_compiled.match(each_entry)
...     if regex_match:
...         print 'Ranking: % 5s  Name: %s' % (regex_match.group(1), regex_match.group(2))
...     
Ranking:    AA  Name: John
Ranking:    CA  Name: Peter
Ranking:  TA-A  Name: Samantha
>>>

If you need any specific part of that explained I'd be happy to do so.
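For anyone who wants the pattern broken down, here is a rough sketch of the same regex with named groups. It is written for Python 3 (the snippets in this thread are Python 2), and the names RANKING_RE and parse_entry are made up for illustration only.

```python
import re

# Same structure as the regex above; the group names are only labels.
#   ^<Ranking:        literal prefix at the start of the line
#   (?P<rank>.*)      the ranking, any characters
#    \(               a space and a literal opening parenthesis
#   (?P<name>.*)      the name
#   \)>$              literal ')>' at the end of the line
RANKING_RE = re.compile(r'^<Ranking: (?P<rank>.*) \((?P<name>.*)\)>$')

def parse_entry(line):
    """Return (rank, name) for a matching line, or None otherwise."""
    match = RANKING_RE.match(line)
    if match is None:
        return None
    return match.group('rank'), match.group('name')

print(parse_entry('<Ranking: TA-A (Samantha)>'))  # ('TA-A', 'Samantha')
print(parse_entry('not a ranking line'))          # None
```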

If you're really in need of using string methods I'd do something like the following:

>>> for each_entry in input_data.split('\n'):
...     rank_search = '<Ranking: '
...     idx = each_entry.find(rank_search)
...     if idx != -1: # string.find returns -1 when not found
...         idx += len(rank_search) # Increase idx to where rank_search ends
...         end_idx = each_entry.find(' (', idx) # idx as starting point for find
...         print 'Ranking: % 5s  Name: %s' % (each_entry[idx:end_idx], each_entry[end_idx+2:-2])
...     
Ranking:    AA  Name: John
Ranking:    CA  Name: Peter
Ranking:  TA-A  Name: Samantha
>>>
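Since the original question asked about split and strip specifically, a shorter string-method variant might look like the sketch below. It uses startswith and partition instead of find; extract_rank is a hypothetical helper name, and the code assumes every valid line starts with '<Ranking: ' and has ' (' right after the ranking.

```python
def extract_rank(line):
    """Return the ranking from '<Ranking: XX (Name)>', or None if the line does not fit."""
    prefix = '<Ranking: '
    if not line.startswith(prefix):
        return None
    rest = line[len(prefix):]              # e.g. 'TA-A (Samantha)>'
    rank, sep, _name = rest.partition(' (')
    if not sep:                            # no ' (' found after the prefix
        return None
    return rank.strip()

for entry in ['<Ranking: AA (John)>', '<Ranking: TA-A (Samantha)>', '']:
    print(extract_rank(entry))
```

Because partition splits at the first ' (', a ranking that itself contains spaces should still come out whole.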

Thank you very much for the fast response. I will test it as soon as I get home and I'll report back here with a rep.

Thanks again

If there is no space in the rank or name you can do it this simple way:

mydata = """\
<Ranking: AA (John)>
<Ranking: CA (Peter)>
<Ranking: TA-A (Samantha)>
"""

rank_list = []
for line in mydata.split('\n'):
    print line, line.split()  # testing
    if line:
        rank_list.append(line.split()[1])

print
print rank_list
    
"""my result -->
<Ranking: AA (John)> ['<Ranking:', 'AA', '(John)>']
<Ranking: CA (Peter)> ['<Ranking:', 'CA', '(Peter)>']
<Ranking: TA-A (Samantha)> ['<Ranking:', 'TA-A', '(Samantha)>']
 []

['AA', 'CA', 'TA-A']
"""

Thanks jlm699 and sneekula, both ways did exactly what I wanted.

I am sorry to re-post in a solved thread, but I have a quick and simple question related to the problem above:

How do I index the records of a certain field? For the sample code you provided, how can I print the ranking before the last one? For example, going through the records as you did, I printed the rankings as:

Ranking:    AA  Name: John
Ranking:    CA  Name: Peter
Ranking:  TA-A  Name: Samantha

Now how can I get only the ranking CA (Peter's, the one before the last)?

You could use something simple like this ...

mydata = """\
<Ranking: AA (John)>
<Ranking: CA (Peter)>
<Ranking: TA-A (Samantha)>
"""

for line in mydata.split('\n'):
    if line:
        head, rank, name = line.split()
        # test inside the block so empty lines are skipped
        if 'Peter' in name:
            print "Name: Peter  Rank:", rank  # Name: Peter  Rank: CA

Thank you.
But what if I don't know the name in this case (i.e. Peter) and just want to print the ranking before the last one from the given data table?

And can this approach be applied to class objects as well?

Thanks

Instead of

if 'Peter' in name:

use

if 'CA' in rank:
    print 'Name: %s Rank: %s' % (name, rank)

If I don't know the name, how could I know the ranking? :p

How can I print the ranking before the last one, literally?

It will be easier to modify the jlm699 code:

import re

regex_compiled = re.compile(r'^<Ranking: (.*) \((.*)\)>$')

input_data = """<Ranking: AA (John)>
<Ranking: CA (Peter)>
<Ranking: TA-A (Samantha)>
"""

data_list = input_data.split('\n')
# remove any empty list item
data_list.remove("")
# slice out the second to last item (a negative start counts from the end)
data_list = data_list[-2:-1]
print data_list  # see what you got

# left the for loop in so you can look at larger slices
for each_entry in data_list:
    regex_match = regex_compiled.match(each_entry)
    if regex_match:
        sf = 'Ranking: % 5s  Name: %s'
        print sf % (regex_match.group(1), regex_match.group(2))

''' total output -->
['<Ranking: CA (Peter)>']
Ranking:    CA  Name: Peter
'''
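If only the single ranking is wanted rather than a one-item slice, indexing with [-2] might be simpler. The following is a sketch in Python 3, not a drop-in for the code above; second_to_last_rank is a hypothetical name, and splitlines() is used so no empty trailing item needs removing.

```python
import re

RANKING_RE = re.compile(r'^<Ranking: (.*) \((.*)\)>$')

def second_to_last_rank(text):
    """Return the ranking of the record before the last, or None if there is none."""
    # keep only the lines that actually match the pattern
    matches = [RANKING_RE.match(line) for line in text.splitlines()]
    records = [m.groups() for m in matches if m]
    if len(records) < 2:
        return None
    rank, _name = records[-2]   # [-2] is the item before the last
    return rank

input_data = """<Ranking: AA (John)>
<Ranking: CA (Peter)>
<Ranking: TA-A (Samantha)>
"""
print(second_to_last_rank(input_data))  # CA
```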

Thanks Lard.


But this can't be applied to tables as well, can it?
I ask because I got this error:

data_list.remove("")
ValueError: list.remove(x): x not in list

Or is there another way to handle data in tables, for example if we have the same data in the form of a table?

ID || rank || Name
------------------------------
1 || CC || Peter
2 || AA || Samantha
3 || AC || John
4 || CD || Smith


And I want to do the same here (printing or fetching the ranking of the record before the last).

The test input_data ends with a newline, so splitting it introduces an empty list item. Your data apparently has no such empty item, which is why remove("") raises the ValueError. So change it to ...

input_data = """<Ranking: AA (John)>
<Ranking: CA (Peter)>
<Ranking: TA-A (Samantha)>"""

... and remove the line data_list.remove("")

Or you can use this ...

# remove any empty list item
if "" in data_list:
    data_list.remove("")
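Another option, sketched here in Python 3 with made-up sample data, is to filter with a list comprehension. list.remove("") deletes only the first empty string and raises ValueError when there is none, while the comprehension drops every empty item and never raises.

```python
# made-up sample with an empty line in the middle and a trailing newline
data_list = "<Ranking: AA (John)>\n\n<Ranking: CA (Peter)>\n".split('\n')

# keep only items that contain something other than whitespace
data_list = [item for item in data_list if item.strip()]

print(data_list)
```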

What about my question in the last reply, about the table ?

For nice looking tables you can use the % string formatting.

Thanks but my question wasn't about how to create a table, it was how to fetch from the table, fetch a value in the record before the last.

You mean your data is a string in the form of a table?

Yes indeed, just like the example table I drew above.

Well, you can do it like this ...

mystr = """\
ID || rank || Name
------------------------------
1 || CC || Peter
2 || AA || Samantha
3 || AC || John
4 || CD || Smith"""

mylist = mystr.split('\n')

# slice out the second to last item
print mylist[-2:-1]  # ['3 || AC || John']
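To get just the rank field out of that row, each line can be split on the '||' separator. This is a Python 3 sketch that assumes data rows have exactly three fields and a numeric ID, which is how the header and the dashed line get skipped.

```python
mystr = """\
ID || rank || Name
------------------------------
1 || CC || Peter
2 || AA || Samantha
3 || AC || John
4 || CD || Smith"""

rows = []
for line in mystr.splitlines():
    fields = [field.strip() for field in line.split('||')]
    # data rows have three fields and a numeric ID;
    # the header and the dashed separator fail this test
    if len(fields) == 3 and fields[0].isdigit():
        rows.append(fields)

if len(rows) >= 2:
    record_id, rank, name = rows[-2]   # the record before the last
    print(rank)  # AC
```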

And how did you know that the wanted record is between 2 and last? That's the point I mentioned earlier. Plus, you are treating the table as a list here; how about a real table?

You keep asking about the second to last item. Slicing this way will always give you the second to last item no matter how many other items are before it.

Python does not have an object type called a table, you must be thinking about another language or maybe a database.

No, it is Python, used to fetch from GQL (Google's SQL-like query language). This sounds confusing (to me as well): when I fetch the table using a GQL query and store the data in a variable, what type would the data be? A list? A string? I don't know, to be honest.

I have tried all the helpful samples you provided; however, printing the item before the last gives me one letter, and playing with the indexes gives the same.

To illustrate the basics ...

mystr = "abcdefg"
print type(mystr)
print mystr
print mystr[-2:-1]

print '-'*10

mylist = ['fred', 'peter', 'gustav', 'george']
print type(mylist)
print mylist
print mylist[-2:-1]

"""my output -->
<type 'str'>
abcdefg
f
----------
<type 'list'>
['fred', 'peter', 'gustav', 'george']
['gustav']
"""

What type of objects does the query return? You may have to convert to a list to access it in this way.

query = db.GqlQuery("SELECT *" .....etc.
print type(query)
ctr = 0
for q in query:
    if ctr < 2:
        print type(q)
    ctr += 1
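If the result can be looped over, as the snippet above assumes, converting it with list() should make negative indexing possible. The sketch below (Python 3) uses a plain generator as a stand-in, since I can't show the real query here; fake_query is hypothetical and is not part of the GQL API.

```python
def fake_query():
    """Hypothetical stand-in for an iterable query result; not the GQL API."""
    for rank in ['CC', 'AA', 'AC', 'CD']:
        yield rank

# a generator cannot be indexed, so materialize it as a list first
results = list(fake_query())
print(type(results))

if len(results) >= 2:
    print(results[-2])  # AC
```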

Thanks a million guys, how did I miss the type command :-s


So far, as a test, I was able to check the type, length and contents of the list, and I got something quite confusing (for me):

The reported length of the list is 42, for example, and when I counted "manually" it turned out that 42 is the number of entries in the list (not characters): it counted CC as 1, AB as another, and so on.


When I print the list I get something like:
AABBCDACADDACDDDBBCD


Where the red highlighted item is the last one I entered into the list (FILO). Now the question is: how can I always print the item in the green font (the one before the last)?

Thanks for your patience, you are really helpful.

Again, you could use string slicing ...

s = "AABBCDACADDACDDDBBCD"

print s[2:4]  # BB
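If the data really is one long string, a fixed slice like that only works while the position never changes. A possible alternative, assuming every ranking code is exactly two characters long, is to cut the string into a list of codes and index that list (Python 3 sketch):

```python
s = "AABBCDACADDACDDDBBCD"

# assumption: every ranking code is exactly two characters long
codes = [s[i:i + 2] for i in range(0, len(s), 2)]
print(codes)

print(codes[1])    # second item counted from the front
print(codes[-2])   # second item counted from the back
```

Which of the two indexes is "the one before the last" depends on whether new entries end up at the front or at the back. If the object is already a list of codes, as the reported length suggests, the indexing works directly without the chunking step.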