
Using nested list with unicode data without it showing as such

Context

I'm retrieving data from Google Analytics (via the python-googleanalytics library, as Google's API is way too complex for me right now) and putting that data into a table using an HTML library.

Python experience: very low.

Problem

I have this:

[u'string', u'string'] [integer]
[u'string', u'string'] [integer]
[u'string', u'string'] [integer]
[u'string', u'string'] [integer]
[u'string', u'string'] [integer]

The strings end up in a table column together and the integers in a table column. That's what I want. However, I don't want it to show any of the punctuation marks or the unicode u, just the bare strings and integers, but still in separate table columns.

What I'm trying to do is iterate over the list and the lists within the list to take all the values and encode them as ascii, but this cannot be done on integers. Maybe I should iterate over the strings and integer lists separately, but how should I separate them?

clean = []
for rows in top10:
    for x in rows:
        for i in x:
            i = i.encode('ascii','ignore')
            i = str(i)
            clean.append(i)


Gives:

Traceback (most recent call last):
  File "C:\Python26\googlescrape.py", line 31, in <module>
    i = i.encode('ascii','ignore')
AttributeError: 'int' object has no attribute 'encode'

I would be grateful if someone could help me get to the next step.

fingerpainting

Did you try:

clean = []
for rows in top10:
    for x in rows:
        for i in x:
            i = str(i)
            i = i.encode('ascii','ignore')
            clean.append(i)
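
Calling str() first does avoid the AttributeError, but on Python 2 it raises UnicodeEncodeError as soon as a string contains a non-ASCII character, and appending everything to one flat list loses the grouping into rows. A minimal sketch of an alternative, assuming each row looks like [[strings...], [integers...]]: check the type and only encode the text values. The sample data and the helper name to_plain are made up for illustration.

```python
# Hypothetical sample data shaped like the rows in the question.
top10 = [
    [[u'abcde', u'fghij'], [1]],
    [[u'klmno', u'pqrst'], [2]],
]

def to_plain(value):
    """Return integers unchanged and text as plain ASCII text."""
    if isinstance(value, int):
        return value
    # Drop any non-ASCII characters, then convert to the native str type
    # (a byte string on Python 2, so the repr no longer shows a u prefix).
    return str(value.encode('ascii', 'ignore').decode('ascii'))

# Rebuild every row with the same nesting instead of flattening it.
clean = [[[to_plain(i) for i in x] for x in rows] for rows in top10]
```

Here clean keeps one entry per row, so the strings and the integer of a row stay together when they are handed to the table code.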
pyTony

Why not?

top10 = [
        [[u'abcde', u'fghij'], [1]],
        [[u'abcde', u'fghij'], [2]],
        [[u'abcde', u'fghij'], [3]],
        [[u'abcde', u'fghij'], [4]],
        [[u'abcde', u'fghij'], [5]]]

top10 = [[[str(a), str(b)], x] for [a, b], x in top10]

for item in top10:
    print item
Output:
[['abcde', 'fghij'], [1]]
[['abcde', 'fghij'], [2]]
[['abcde', 'fghij'], [3]]
[['abcde', 'fghij'], [4]]
[['abcde', 'fghij'], [5]]


Cheers and Happy coding

Beat_Slayer

Thanks for your help tonyjv and beat slayer.

When I do the str() first, each character of the strings ends up in a separate table column, and each string and integer in a separate table row. However, the list 'clean' does return the values I want, so I'll try to tweak the HTML output.

When I use the list comprehension, the initial output is perfectly represented without unicode. Here too I will probably just have to tweak the HTML output so it doesn't show the list brackets.
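
The brackets and quotes only appear when a whole list is converted to text; joining the items gives the bare values. A rough sketch of that idea, building the table by hand rather than with the HTML library from the original post (which is not named here). The function names row_to_cells and rows_to_html and the sample data are hypothetical, and the values are assumed to contain no characters that need HTML escaping.

```python
# Hypothetical sample data, shaped like the cleaned rows above.
top10 = [
    [['abcde', 'fghij'], [1]],
    [['klmno', 'pqrst'], [2]],
]

def row_to_cells(row):
    """Turn [[strings...], [integers...]] into two display strings."""
    strings, numbers = row
    # join() yields the bare text: no brackets, quotes or u prefix.
    return [', '.join(strings), ', '.join(str(n) for n in numbers)]

def rows_to_html(rows):
    """Build a minimal two-column HTML table as a string."""
    lines = ['<table>']
    for row in rows:
        cells = ''.join('<td>%s</td>' % cell for cell in row_to_cells(row))
        lines.append('<tr>%s</tr>' % cells)
    lines.append('</table>')
    return '\n'.join(lines)

html = rows_to_html(top10)
```

With an HTML library, the same row_to_cells result could probably be passed in as the cell values for each row, one list per table row.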

Thanks! Will mark this as solved.

fingerpainting

This question has already been solved
