Hello everyone!
I need a bit of help ...
////
County/Region 2007Q3 2008Q3 Yr/Yr%

Los Angeles 13,583 17,073 25.70%
Orange 3,882 5,692 46.60%
San Diego 5,673 7,062 24.50%
Riverside 9,250 11,714 26.60%
San Bernardino 7,038 9,110 29.40%
Ventura 1,377 1,676 21.70%
Imperial 259 568 119.30%
San Francisco 252 353 40.10%
////

This is a section from a text document that I have to work with ...
What I am supposed to do is take each city and use it as a dictionary key, with the second set of numbers as the value ... so basically

{"los angeles" : "17,073" , "orange" : "5,692" ...etc }

The list is just a .txt document, separated by \t and \n.

How can I create a dictionary out of this?
I tried splitting the .txt, but I ran into a problem ...
"Los Angeles" was split in two ... and I can't have that if I want this to work correctly, because this .txt document is a HUGE list ...

Please help?

All 5 Replies

separated by \t and \n

If the entries are split by tabs then you can use line.split('\t') on each line to separate those values. If they're only split by one space then it'll take some concerted effort to identify the numbers and then read backwards to get the city name.
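If the file really is tab-separated, a minimal sketch of the whole job could look like the following. The sample text, the function name build_dict, and the assumption that the first line is a header are mine, not taken from the real file, so adjust them to match your data. It is written for Python 3.

```python
# Sample text laid out the way the question describes: fields separated
# by tabs, records separated by newlines (an assumption about the real file).
text = (
    "County/Region\t2007Q3\t2008Q3\tYr/Yr%\n"
    "\n"
    "Los Angeles\t13,583\t17,073\t25.70%\n"
    "Orange\t3,882\t5,692\t46.60%\n"
    "San Diego\t5,673\t7,062\t24.50%\n"
)

def build_dict(text):
    """Map each county name to its 2008Q3 value (the third tab-separated field)."""
    result = {}
    lines = text.split("\n")
    for line in lines[1:]:              # skip the header row
        fields = line.split("\t")
        if len(fields) < 3:             # skip blank or malformed lines
            continue
        result[fields[0].strip()] = fields[2].strip()
    return result

counties = build_dict(text)
print(counties["Los Angeles"])
```

Because the split happens only on tabs, the space inside "Los Angeles" is left alone and the name stays in one piece.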

You might try adapting something like this, coded off the cuff:

>>> line
'San Diego 5,673 7,062 24.50%'
>>> L = line.split(" ")
>>> for i in range(len(L)):
...     if L[i][0].isdigit():   # first field that starts with a digit
...         break
... else:
...     print("Bad input line: %s" % line)
...     # Whatever else you do to ignore this line
...
>>> city = " ".join(L[0:i])     # everything before the first number
>>> col2 = L[i+1]               # second numeric column (2008Q3)
>>> print(city, col2)
San Diego 7,062
>>>

The above assumes that single spaces separate the fields, as in the line.split(" ") call. This code should work even for long city names like "Salt Lake City" and "Ciudad de dos Hermanos Perez".

You could also split the line and subtract 3 (for the three trailing fields) from the length of the resulting list, which gives the number of elements to join for the city name, provided every record has exactly the same layout.
Or you can index from the other end:
yr=substrs[-1]
q3_2008=substrs[-2]
q3_2007=substrs[-3]

I played with it a little:

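line = "Some City Name 17,000 23 24.5%"
data = line.split(" ")

# Everything except the last three fields is the city name;
# data[-2] is the second of the three numeric columns.
print(" ".join(data[:-3]), data[-2])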

I believe that is what you were seeking.
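To go from one line to the whole dictionary, the same indexing from the end can be wrapped in a loop. This is only a sketch for Python 3: the function name county_to_q3_2008 and the rule used to skip the header (the value must be digits once commas are removed) are my own assumptions.

```python
def county_to_q3_2008(lines):
    """Build {county: 2008Q3 value} by indexing fields from the end of each line."""
    result = {}
    for line in lines:
        fields = line.split()            # split on any run of whitespace
        if len(fields) < 4:              # need a name plus three data fields
            continue
        value = fields[-2]               # second of the three trailing columns
        if not value.replace(",", "").isdigit():
            continue                     # skips the header row ("2008Q3")
        result[" ".join(fields[:-3])] = value
    return result

sample = [
    "County/Region 2007Q3 2008Q3 Yr/Yr%",
    "",
    "Los Angeles 13,583 17,073 25.70%",
    "San Bernardino 7,038 9,110 29.40%",
    "Imperial 259 568 119.30%",
]
result = county_to_q3_2008(sample)
print(result)
```

Since iterating over an open file yields one line at a time, the same function should also accept a file object directly in place of the list.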

Here's another variation:

lineList = ['Los Angeles 13,583 17,073 25.70%',
            'Orange 3,882 5,692 46.60%',
            'San Diego 5,673 7,062 24.50%',
            'Riverside 9,250 11,714 26.60%',
            'San Bernardino 7,038 9,110 29.40%',
            'Ventura 1,377 1,676 21.70%',
            'Imperial 259 568 119.30%',
            'San Francisco 252 353 40.10%']

dd = {}
for line in lineList:
    for i, letter in enumerate(line):
        if letter.isdigit():
            # line[:i] is the name; the second number after it is the 2008Q3 value
            dd[line[:i].strip()] = line[i:].split()[1]
            break
print(dd)

Output:

>>> {'Los Angeles': '17,073', 'Orange': '5,692', 'San Diego': '7,062', 'Riverside': '11,714', 'San Bernardino': '9,110', 'Ventura': '1,676', 'Imperial': '568', 'San Francisco': '353'}
>>>

Of course, the city name must not contain any digits.
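If a name might contain a digit, one way around that restriction is to split from the right instead of scanning for the first digit. Here is a small Python 3 sketch using str.rsplit with a maximum of three splits. The function name split_record and the "District 9" record are made up for illustration and do not come from the data above.

```python
def split_record(line):
    """Split a record into (name, q3_2007, q3_2008, change), working from the right."""
    # rsplit(None, 3) splits on whitespace at most three times, starting
    # from the end, so whatever is left over on the left is the name.
    name, q3_2007, q3_2008, change = line.rsplit(None, 3)
    return name.strip(), q3_2007, q3_2008, change

# Hypothetical record whose name contains a digit.
name, _, q3_2008, _ = split_record("District 9 1,204 1,310 8.80%")
print(name, q3_2008)
```

A line with fewer than four fields makes the unpacking raise ValueError, so wrap the call in try/except if the file has blank or header lines.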
