DaniWeb IT Discussion Community

ssDimensionss May 8th, 2008 11:00 pm
Finding top 10 values in csv file
 
Hi, I'm learning to use Python at the moment, and I came across a question that gives me a large CSV file with the names of companies and how much they are earning. I was asked to find the top 10 companies. I originally did this:

import urllib
import csv
temp_url = urllib.urlopen("http://app.lms.unimelb.edu.au/bbcswebdav/courses/600151_2008_1/datasets/finance/asx_2007.csv")
data = csv.reader(temp_url)
header = data.next()
max_company=''
max_earnings=0.0
num = 0
for row in data:
    entry = row[6]
    num += 1
    if float(entry) > max_earnings:
        max_earnings = float(entry)
        max_company = row[0]

However, it only seems to give me the top company. I also tried to use a while loop, but it didn't turn out right. Is there a way to iterate through every row again, without the top company, another 9 times? Please help! Thanks.

jrcagle May 9th, 2008 12:43 am
Re: Finding top 10 values in csv file
 
I'm a little confused about the use of the "num" variable.

I would probably do this:

import urllib
import csv
temp_url = urllib.urlopen("http://app.lms.unimelb.edu.au/bbcswebdav/courses/600151_2008_1/datasets/finance/asx_2007.csv")
data = csv.reader(temp_url)
header = data.next()
earnings = [(row[0], float(row[6])) for row in data]
earnings.sort(key = lambda x: x[1])
earnings.reverse()
earnings = earnings[:10]

Line 6 generates a list of the companies and their earnings. Line 7 sorts these using the earnings as the key. Line 8 puts the highest earnings first. And line 9 trims the list to the top 10.
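For reference, here is a minimal sketch of the same steps written for Python 3, where `data.next()` becomes `next(data)`. The sample data and the `top_n` helper are hypothetical stand-ins so the example runs without the download; only the column positions (0 for the company, 6 for the earnings) come from the code above. Passing `reverse=True` to `sorted` folds the sort and the reverse into one call, which should give the same result apart from the order of tied entries.

```python
import csv
import io

# Hypothetical sample data standing in for the downloaded file.
# Column 0 is the company name and column 6 the earnings, as in the thread.
SAMPLE = """name,c1,c2,c3,c4,c5,earnings
Alpha,,,,,,12.5
Beta,,,,,,40.0
Gamma,,,,,,7.25
Delta,,,,,,99.0
"""


def top_n(rows, n=10):
    """Return the n (company, earnings) pairs with the highest earnings."""
    earnings = [(row[0], float(row[6])) for row in rows]
    # reverse=True sorts highest first, so no separate reverse() is needed.
    return sorted(earnings, key=lambda pair: pair[1], reverse=True)[:n]


reader = csv.reader(io.StringIO(SAMPLE))
next(reader)  # skip the header row
top = top_n(reader, n=3)
for company, amount in top:
    print(company, amount)
```

If only the top few rows of a very large file are needed, the standard library's `heapq.nlargest` is another option, but the sort above is usually simple enough.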

HTH,
Jeff

ssDimensionss May 9th, 2008 1:11 am
Re: Finding top 10 values in csv file
 
Hey, thanks for the help. Also, can you tell me if I can do it like this?

import urllib
import csv
temp_url = urllib.urlopen("http://app.lms.unimelb.edu.au/bbcswebdav/courses/600151_2008_1/datasets/finance/asx_2007.csv")
data = csv.reader(temp_url)
header = data.next()
max_company=''
max_earnings=0.0
num = 0
count = 0
while num < 10:
    for row in data:
        count += 1
        entry = row[6]
        if float(entry) > max_earnings:
            max_earnings = float(entry)
            max_company = row[0]
           
    num+=1
    print max_company

It still doesn't work, but if I can find a way to exclude the row of data that holds the max company, then on the next pass it should print the second largest, and so on. Thanks again!
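A minimal sketch of that repeat-and-exclude idea, written for Python 3. One detail matters here: a `csv.reader` can only be iterated once, so in the loop above every pass after the first sees no rows and the same company is printed each time. The sketch therefore copies the rows into a list first. The function name `top_n_by_repeated_max` and the sample rows are hypothetical; only the column positions come from the thread.

```python
def top_n_by_repeated_max(rows, n=10):
    """Find the largest row n times, removing each winner before the next pass."""
    remaining = list(rows)  # a real list can be scanned many times
    winners = []
    while len(winners) < n and remaining:
        best = remaining[0]
        for row in remaining:
            if float(row[6]) > float(best[6]):
                best = row
        winners.append((best[0], float(best[6])))
        remaining.remove(best)  # exclude this row from later passes
    return winners


# Hypothetical rows shaped like the CSV data: name in column 0, earnings in column 6.
rows = [
    ["Alpha", "", "", "", "", "", "12.5"],
    ["Beta", "", "", "", "", "", "40.0"],
    ["Gamma", "", "", "", "", "", "7.25"],
    ["Delta", "", "", "", "", "", "99.0"],
]
result = top_n_by_repeated_max(rows, n=2)
for company, amount in result:
    print(company, amount)
```

With the real file, `rows` would be the `csv.reader` object after the header has been skipped. Starting each pass from `remaining[0]` rather than from `0.0` also keeps the search correct if any earnings are negative.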

jrcagle May 9th, 2008 8:05 am
Re: Finding top 10 values in csv file
 
I don't really recommend that iterative approach, because all it accomplishes is printing the top ten to the screen (if you can get it to work!), whereas the real prize is to have the top ten in a list somewhere so that you can print it, sort it, etc.

Jeff

ssDimensionss May 10th, 2008 4:41 am
Re: Finding top 10 values in csv file
 
Hey, the thing is we are learning loops and iteration right now, and I think we're supposed to use iteration to do it if possible. Does anyone know how to exclude a line of data from the CSV file? I think it should be something similar to excluding the header.

jrcagle May 10th, 2008 1:13 pm
Re: Finding top 10 values in csv file
 
Oh, well if you must iterate, then here's the basic idea:

* create an empty list.
* run through the data by rows.
* if the list has fewer than 10 items, or the current row's earnings are greater than the smallest earnings in your list:
--- add the current row's earnings and name of company to the list.
--- sort the list.
--- trim the list to 10 items
* Voila!

I'll leave the coding to you.

Jeff
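The steps above can be sketched roughly as follows, in Python 3. This is one possible reading, not the only way to code it. The name `top_n_running` and the sample rows are hypothetical, and one addition is assumed: rows are also accepted while the list holds fewer than n items, since an empty list has no smallest entry to compare against.

```python
def top_n_running(rows, n=10):
    """Keep a running list of the n largest (earnings, company) pairs."""
    top = []  # kept sorted with the largest earnings first
    for row in rows:
        amount = float(row[6])
        # Accept the row while the list is still short, or when it beats
        # the smallest entry currently kept (the last one after sorting).
        if len(top) < n or amount > top[-1][0]:
            top.append((amount, row[0]))
            top.sort(reverse=True)
            del top[n:]  # trim back to at most n items
    return top


# Hypothetical rows shaped like the CSV data: name in column 0, earnings in column 6.
rows = [
    ["Alpha", "", "", "", "", "", "12.5"],
    ["Beta", "", "", "", "", "", "40.0"],
    ["Gamma", "", "", "", "", "", "7.25"],
    ["Delta", "", "", "", "", "", "99.0"],
]
top_two = top_n_running(rows, n=2)
for amount, company in top_two:
    print(company, amount)
```

Because the list never grows past n items, this makes a single pass over the data, so it works directly on a `csv.reader` without copying the rows first.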


All times are GMT -4.