Hi. I have an external file that holds the contents of a dictionary:

google.com {'facebook.com': 230, 'yahoo.com': 9, 'fifa.org': 67, 'msn.com': 3}
yahoo.com {'apps.facebook.com': 13, 'msn.com': 9, 'myp2p.eu': 2}

The results show the search engines, the links that were clicked from each search engine, and the number of times each link was visited or clicked from that search engine. How can I plot this? Would it be easier to change the format of the data first and then plot it? Any help, suggestions, or advice is highly appreciated.

Any idea how to change the format to something like this?

google.com       facebook.com 230
                 yahoo.com 9
                 fifa.org 67
                 msn.com 3

yahoo.com        apps.facebook.com 13
                 msn.com 9
                 myp2p.eu 2

You can use the new string formatting, which has a lot of powerful features and was new in Python 2.6:
http://docs.python.org/library/string.html#formatstrings
An example:

google = {
    'facebook.com': 230,
    'yahoo.com': 9,
    'fifa.org': 67,
    'msn.com': 3}

print('google.com')
for name in sorted(google):
    print("\t\t{name:11}{score:>10.4f}".format(name=name, score=google[name]))

"""Out-->
google.com
	facebook.com  230.0000
	fifa.org      67.0000
	msn.com        3.0000
	yahoo.com      9.0000
"""

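The same idea can be stretched to the whole nested dictionary to get roughly the layout asked for earlier in the thread. This is only a sketch, assuming Python 3 and that the data is already loaded as a dict of dicts; `format_nested` and the column width of 17 are illustrative choices, not something from the original posts.

```python
def format_nested(data, width=17):
    """Return lines with the search engine in the first column and one
    'link count' pair per line, sorted by link name."""
    lines = []
    for engine in sorted(data):
        links = sorted(data[engine].items())
        for i, (link, count) in enumerate(links):
            # Show the search engine only on the first row of its group.
            first_col = engine if i == 0 else ""
            lines.append("{0:<{w}}{1} {2}".format(first_col, link, count, w=width))
        lines.append("")  # blank line between groups
    return lines


data = {
    'google.com': {'facebook.com': 230, 'yahoo.com': 9, 'fifa.org': 67, 'msn.com': 3},
    'yahoo.com': {'apps.facebook.com': 13, 'msn.com': 9, 'myp2p.eu': 2},
}
lines = format_nested(data)
print("\n".join(lines))
```

The width is passed as a nested format field so the first column can be widened if longer host names show up.
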
If you saved the data from your own program, I would change the file output to use the cPickle or JSON format. That leads to cleaner, more maintainable code.
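
To illustrate the JSON suggestion, here is a minimal round trip using the standard `json` module. It assumes Python 3 and that the data is a dict of dicts with string keys and integer counts, as in the first post; the variable names are made up for the example.

```python
import json

clicks = {
    'google.com': {'facebook.com': 230, 'yahoo.com': 9, 'fifa.org': 67, 'msn.com': 3},
    'yahoo.com': {'apps.facebook.com': 13, 'msn.com': 9, 'myp2p.eu': 2},
}

# Serialize the nested dict to a JSON string.
text = json.dumps(clicks, indent=2, sort_keys=True)

# Parsing the string gives back an equal dict, with no hand-written parser.
restored = json.loads(text)
print(restored == clicks)
```

For files, `json.dump(obj, fp)` and `json.load(fp)` do the same thing with an open file object.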

The file is a .txt file saved directly from the program, with about 4 thousand entries.

Actually, the result I have is a dictionary inside a dictionary, as you can see from the first post.

Use YAML or JSON.

You said the data is in a file, so it is just a bunch of characters until it is parsed into the correct data. The data type does not matter, because that information is only present in the grammar of the data. If you want to hard-code things, that can be done, but it is better to use the right format instead of fixing a bad one. Do you have the possibility to change how the data is saved or not? What is the purpose? Hobby, work, or homework?

It is from work. Actually, I could change the format, but the problem is that the data set is huge and it would take many hours to run the changes again.

OK, as it is not homework and not so difficult, here is the reading solution:

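import ast

data = """google.com {'facebook.com': 230, 'yahoo.com': 9, 'fifa.org': 67, 'msn.com': 3}
yahoo.com {'apps.facebook.com': 13, 'msn.com': 9, 'myp2p.eu': 2}"""

constructed = dict()
for line in data.splitlines():
    # split at the first whitespace: key first, dict literal after it
    key, content = line.split(None,1)
    # literal_eval parses only literals, so it is safer than eval here
    constructed[key] = ast.literal_eval(content)
print data
print
print

# print each key once, followed by its sorted "link count" pairs
for item in sorted(constructed):
    print item,'\t',
    print '\n\t\t'.join('%s %s' % info for info in sorted(constructed[item].items()))
    print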
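
Building on the reading step above, the existing text file could be converted to JSON once, so the original program would not need to be re-run. This is a sketch under assumptions: Python 3, every non-blank line has the form `key {dict literal}` as in the first post, and `ast.literal_eval` is used in place of `eval`. The names `parse_line` and `convert_to_json` and any file names passed to them are hypothetical.

```python
import ast
import json


def parse_line(line):
    """Split 'google.com {...}' into (key, dict)."""
    key, literal = line.strip().split(None, 1)
    # literal_eval accepts only Python literals, so it cannot run arbitrary code.
    return key, ast.literal_eval(literal)


def convert_to_json(src_path, dst_path):
    """Read the text dump line by line and write a single JSON document."""
    result = {}
    with open(src_path) as src:
        for line in src:
            if line.strip():  # skip blank lines
                key, links = parse_line(line)
                result[key] = links
    with open(dst_path, "w") as dst:
        json.dump(result, dst, indent=2, sort_keys=True)
    return result
```

Reading line by line keeps only one input line in memory at a time, though the combined dictionary is still built in full before it is written out.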

It's probably easier to pull the data you need for plotting out of a list:

raw_data = """\
google.com {'facebook.com': 230, 'yahoo.com': 9, 'fifa.org': 67, 'msn.com': 3}
yahoo.com {'apps.facebook.com': 13, 'msn.com': 9, 'myp2p.eu': 2}"""

filename = "aadata.txt"
# save to test file
fout = open(filename, "w")
fout.write(raw_data)
fout.close()
# read the test file line by line and process
# save each line list in data_lines list
data_lines = []
for line in open(filename):
    line = line.rstrip()
    # remove the following characters
    line = line.replace('{', '')
    line = line.replace('}', '')
    line = line.replace(':', '')
    line = line.replace(',', '')
    line = line.replace('\'', '')
    
    #print line, type(line)  # test
    print line.split()
    
    data_lines.append(line.split())
    
'''my result -->
['google.com', 'facebook.com', '230', 'yahoo.com', '9', 'fifa.org', '67', 'msn.com', '3']
['yahoo.com', 'apps.facebook.com', '13', 'msn.com', '9', 'myp2p.eu', '2']
'''
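
The token lists above still hold the counts as strings. One way to get them ready for plotting is to pair each link with its integer count. The following is a sketch, assuming Python 3 and token lists shaped like the output above; `to_series` and `text_bars` are illustrative helpers, and the text bar chart is only a dependency-free stand-in for a real plotting library.

```python
def to_series(tokens):
    """['google.com', 'a.com', '3', ...] -> ('google.com', [('a.com', 3), ...])"""
    engine, rest = tokens[0], tokens[1:]
    # After the engine name, tokens alternate: link name, then its count.
    pairs = [(name, int(count)) for name, count in zip(rest[0::2], rest[1::2])]
    return engine, pairs


def text_bars(pairs, max_width=40):
    """Render one '#' bar per link, scaled to the largest count."""
    biggest = max(count for _, count in pairs)
    rows = []
    for name, count in sorted(pairs, key=lambda p: p[1], reverse=True):
        # Every link gets at least one '#', even with a tiny count.
        bar = "#" * max(1, count * max_width // biggest)
        rows.append("{0:<20}{1} {2}".format(name, bar, count))
    return rows


tokens = ['google.com', 'facebook.com', '230', 'yahoo.com', '9',
          'fifa.org', '67', 'msn.com', '3']
engine, pairs = to_series(tokens)
rows = text_bars(pairs)
print(engine)
print("\n".join(rows))
```

The `pairs` list can also be split into a list of labels and a list of heights and passed to the bar-chart function of whichever plotting library is available.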

If I were you, I would just convert the data and use cPickle or anydbm.

Thank you all. tonyjv's solution did the trick. Thanks again for all the help.
