I have a text that I want to search for all its uppercase letters, then present those letters unique and sorted. I welcome any suggestions.

Here's an example:

alphabet = "abcdefghijklmnopqrstuvwxyz"
text1 = "LQZYdMDHPEdWOAVUCBDdsfTEFgdfGIRKwerJMSONPX"
text2 = ""
upper_letters = []

# Collect each distinct uppercase letter from text1
for letter in text1:
    # Skip the letter if it is already in upper_letters
    if letter in upper_letters:
        continue
    # If letter is uppercase, add it to upper_letters
    if letter in alphabet.upper():
        upper_letters.append(letter)
# Put list in alphabetical order
upper_letters.sort()
# Put list into string with nothing between the letters
text2 = "".join(upper_letters)
# Print the original string and the resulting string.
print "This is the text1 string: '%s'" % (text1)
print "This string contains only the uppercase letters in text1: '%s'" % (text2)

This takes a string of random letters, puts each uppercase letter in a list (unless it's already in there), sorts the list, joins it into a string, and then prints it out.

I hope that helped.

You can use Python's regular expression module re. It is very powerful for text processing, but it has a somewhat steep learning curve!

# exploring Python's regular expression module re

import re

# find all upper case letters in a text:
text = "This text has Upper and Lower Case Letters"
all_uppers = re.findall("[A-Z]", text)

# make elements unique and sorted
all_uppers = sorted(list(set(all_uppers)))
print all_uppers  # ['C', 'L', 'T', 'U']
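
One caveat worth noting: the pattern "[A-Z]" matches only the 26 ASCII capitals, so accented capitals are skipped. The sketch below compares it with str.isupper(), assuming Python 3, where strings are Unicode. The sample sentence is made up for illustration.

```python
import re

text = "Émile met Zoë in Århus"

# "[A-Z]" only knows the ASCII range A..Z
ascii_only = sorted(set(re.findall("[A-Z]", text)))

# str.isupper() also recognises non-ASCII uppercase characters
unicode_aware = sorted(set(ch for ch in text if ch.isupper()))

print(ascii_only)      # ['Z']
print(unicode_aware)   # ['Z', 'Å', 'É']
```

Sorting here is by code point, which is why 'Z' comes before the accented letters. For plain English text the two approaches give the same result.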

Or if regular expressions are too painful, some kind of compromise:

def count_caps(s):
    d = {}
    caps = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    for i in caps:
        d[i] = s.count(i)
    return d

def print_caps(s):
    d = count_caps(s)

    # not 'for i in d' because dictionaries don't have guaranteed order
    for i in "ABCDEFGHIJKLMNOPQRSTUVWXYZ":
        if d[i]:
            print i + ":" + str(d[i]),

print_caps("ASDHbajksfhHHLASD")
A:2 D:2 H:3 L:1 S:2

Jeff
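
The same counting idea can probably be written more compactly with collections.Counter (available since Python 2.7). This is only a sketch of an alternative, not a replacement for the code above. The helper name format_caps is invented for this example, and it returns the string instead of printing it so the result is easy to check.

```python
from collections import Counter


def count_caps(s):
    """Count each uppercase ASCII letter that occurs in s."""
    return Counter(ch for ch in s if "A" <= ch <= "Z")


def format_caps(s):
    """Return the counts as 'A:2 D:2 ...', ordered by letter."""
    counts = count_caps(s)
    # sorted(counts) yields the letters in alphabetical order
    return " ".join("%s:%d" % (ch, counts[ch]) for ch in sorted(counts))


print(format_caps("ASDHbajksfhHHLASD"))  # A:2 D:2 H:3 L:1 S:2
```

Unlike the dictionary version, the Counter only holds letters that actually occur, so no check for zero counts is needed.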

Thanks Jeff, that is even fancier than I needed. Will come in handy though!

The re module seems nice, and I don't have much trouble following the syntax, but I have a little trouble understanding the one-liner that makes a list unique and sorted.

My humble contribution using a list comprehension:

# create a unique sorted list of all upper case letters in text
text = "This text has Upper and Lower Case Letters"
unique_list = []
[unique_list.append(c) for c in text if c.isupper() and c not in unique_list]
print sorted(unique_list)   # ['C', 'L', 'T', 'U']
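
A side note on the list comprehension above: it is used only for its side effect on unique_list, and the list it builds (all None values) is thrown away. A minimal sketch of a variant without the side effect, assuming Python 2.7 or later for the set comprehension syntax:

```python
text = "This text has Upper and Lower Case Letters"

# The set comprehension keeps one copy of each uppercase character;
# sorted() then returns them as an ordered list.
unique_sorted = sorted({c for c in text if c.isupper()})

print(unique_sorted)  # ['C', 'L', 'T', 'U']
```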

Here is the breakdown of vegaseat's one liner as I see it:

import re

text = "This text has Upper and Lower Case Letters"

uppers_raw = re.findall("[A-Z]", text)  # ['T', 'U', 'L', 'C', 'L']
uppers_set = set(uppers_raw)            # set(['C', 'U', 'T', 'L'])
uppers_list = list(uppers_set)          # ['C', 'U', 'T', 'L']
uppers_sorted = sorted(uppers_list)     # ['C', 'L', 'T', 'U']

So set() removes duplicates?

Yes, going from a list to a set and then back to a list removes duplicates from the list, but it may change the order of the elements. So use this little trick only when order does not matter.
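
When the original order does matter, one common approach is to remember what has been seen so far and keep only first occurrences. A minimal sketch; the function name unique_in_order is made up for this example:

```python
def unique_in_order(items):
    """Return items without duplicates, keeping first-seen order."""
    seen = set()      # fast membership test
    result = []       # preserves the order of first appearance
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result


uppers = ['T', 'U', 'L', 'C', 'L']
print(unique_in_order(uppers))  # ['T', 'U', 'L', 'C']
```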

Looks like this code comes in handy too:

# make elements in the list all_uppers unique and sorted
all_uppers = sorted(list(set(all_uppers)))

Thanks to all who helped!

That's a lot of work. If I set the following:

a = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
b = 'A Rat In The House Might Eat The Ice Cream'

Then the following returns the result the original poster wanted:

>>> sorted(set(ch for ch in b if ch in a))
['A', 'C', 'E', 'H', 'I', 'M', 'R', 'T']
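
Typing the alphabet by hand is easy to get wrong. As a hedged alternative, the standard library's string.ascii_uppercase constant holds the same 26 letters, and a set intersection then picks out the capitals. The variable names caps and result are just for this sketch.

```python
import string

b = 'A Rat In The House Might Eat The Ice Cream'

caps = set(string.ascii_uppercase)   # the 26 ASCII capitals
result = sorted(set(b) & caps)       # characters of b that are capitals

print(result)  # ['A', 'C', 'E', 'H', 'I', 'M', 'R', 'T']
```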
