ssDimensionss

Finding top 10 values in csv file

  #1  
May 8th, 2008
Hi, I'm learning Python at the moment and came across a question that gives me a large CSV file with company names and how much each company earns, and asks me to find the top 10 companies. I originally did this:

import urllib
import csv
temp_url = urllib.urlopen("http://app.lms.unimelb.edu.au/bbcswebdav/courses/600151_2008_1/datasets/finance/asx_2007.csv")
data = csv.reader(temp_url)
header = data.next()
max_company=''
max_earnings=0.0
num = 0
for row in data:
    entry = row[6]
    num += 1
    if float(entry) > max_earnings:
        max_earnings = float(entry)
        max_company = row[0]




However, it only gives me the top company. I also tried a while loop, but it didn't turn out right. Is there a way to iterate through every row again, skipping the top company, another 9 times? Please help! Thanks.
jrcagle

  #2  
May 9th, 2008
I'm a little confused about the use of the "num" variable.

I would probably do this:

  1. import urllib
  2. import csv
  3. temp_url = urllib.urlopen("http://app.lms.unimelb.edu.au/bbcswebdav/courses/600151_2008_1/datasets/finance/asx_2007.csv")
  4. data = csv.reader(temp_url)
  5. header = data.next()
  6. earnings = [(row[0], float(row[6])) for row in data]
  7. earnings.sort(key = lambda x: x[1])
  8. earnings.reverse()
  9. earnings = earnings[:10]

Line 6 builds a list of (company, earnings) pairs. Line 7 sorts the list using the earnings as the key, which puts the smallest earnings first. Line 8 reverses it so the highest earnings come first. And line 9 trims the list to the top 10.
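
For anyone reading this on Python 3, here is a minimal sketch of the same idea. The data below is a made-up sample, not the course file; only the layout (company name in column 0, earnings in column 6) is taken from the code above. The function name `top_n` is hypothetical.

```python
import csv
import io

# Made-up sample standing in for asx_2007.csv. Only the column layout
# (name in column 0, earnings in column 6) follows the thread.
SAMPLE = """name,c1,c2,c3,c4,c5,earnings
Alpha,0,0,0,0,0,12.5
Bravo,0,0,0,0,0,99.0
Charlie,0,0,0,0,0,47.25
Delta,0,0,0,0,0,3.0
"""

def top_n(csv_file, n=10):
    """Return the n (company, earnings) pairs with the highest earnings."""
    reader = csv.reader(csv_file)
    next(reader)  # skip the header row
    earnings = [(row[0], float(row[6])) for row in reader]
    # reverse=True sorts largest first, so no separate reverse() is needed
    return sorted(earnings, key=lambda pair: pair[1], reverse=True)[:n]

top = top_n(io.StringIO(SAMPLE), n=3)
for company, amount in top:
    print(company, amount)
```

Passing `reverse=True` folds lines 7 and 8 into one step. The standard library's `heapq.nlargest(n, earnings, key=...)` is another option that avoids sorting the whole list.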

HTH,
Jeff
ssDimensionss

  #3  
May 9th, 2008
Hey, thanks for the help. Also, can you tell me if I can do it like this?

import urllib
import csv
temp_url = urllib.urlopen("http://app.lms.unimelb.edu.au/bbcswebdav/courses/600151_2008_1/datasets/finance/asx_2007.csv")
data = csv.reader(temp_url)
header = data.next()
max_company=''
max_earnings=0.0
num = 0
count = 0
while num < 10:
    for row in data:
        count += 1
        entry = row[6]
        if float(entry) > max_earnings:
            max_earnings = float(entry)
            max_company = row[0]
            
    num+=1
    print max_company

It still doesn't work, but if I can find a way to exclude the row for the max company, then on the next pass it should print the second largest, and so on. Thanks again!
jrcagle

  #4  
May 9th, 2008
I don't really recommend that iterative approach, because all it accomplishes is printing the top ten to the screen (if you can get it to work!), whereas the real prize is to have the top ten in a list somewhere so that you can print it, sort it, etc.
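
For what it's worth, the while-loop version prints the same company every time because a `csv.reader` can only be walked once: after the first pass the inner `for` loop has nothing left to read. A rough Python 3 sketch of how the repeated-pass idea could be made to work is below. It assumes the rows are first copied into a list, uses a made-up sample instead of the course file, and the function name is hypothetical.

```python
import csv
import io

# Made-up sample standing in for the course file (name in column 0,
# earnings in column 6, as in the thread).
SAMPLE = """name,c1,c2,c3,c4,c5,earnings
Alpha,0,0,0,0,0,12.5
Bravo,0,0,0,0,0,99.0
Charlie,0,0,0,0,0,47.25
Delta,0,0,0,0,0,3.0
"""

def top_n_by_repeated_passes(csv_file, n=10):
    """Find the top n rows by scanning the data once per result."""
    reader = csv.reader(csv_file)
    next(reader)          # skip the header row
    rows = list(reader)   # a list can be scanned again; the reader cannot
    chosen = []
    for _ in range(min(n, len(rows))):
        best = None
        for row in rows:
            if best is None or float(row[6]) > float(best[6]):
                best = row
        chosen.append((best[0], float(best[6])))
        rows.remove(best)  # exclude this row from the next pass
    return chosen

result = top_n_by_repeated_passes(io.StringIO(SAMPLE), n=3)
for company, amount in result:
    print(company, amount)
```

Collecting the winners in `chosen` rather than printing them inside the loop also leaves a list to work with afterwards.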

Jeff
ssDimensionss

  #5  
May 10th, 2008
Hey, the thing is we are learning loops and iteration right now, and I think we're supposed to use iteration to do it if possible. Does anyone know how to exclude a row from the CSV file? I think it should be something similar to excluding the header.
jrcagle

  #6  
May 10th, 2008
Oh, well if you must iterate, then here's the basic idea:

* create an empty list.
* run through the data by rows.
* if the current row's earnings are greater than the smallest earnings in your list:
--- add the current row's earnings and name of company to the list.
--- sort the list.
--- trim the list to 10 items
* Voila!

I'll leave the coding to you.
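
For later readers who want something to compare their own attempt against, the steps above might look roughly like this in Python 3. This is only a sketch under assumptions: the sample data is made up, the function name is hypothetical, and the extra `len(top) < n` check is added so the list can fill up before there is a "smallest" entry to compare against.

```python
import csv
import io

# Made-up sample standing in for the course file (name in column 0,
# earnings in column 6, as in the thread).
SAMPLE = """name,c1,c2,c3,c4,c5,earnings
Alpha,0,0,0,0,0,12.5
Bravo,0,0,0,0,0,99.0
Charlie,0,0,0,0,0,47.25
Delta,0,0,0,0,0,3.0
"""

def top_n_running(csv_file, n=10):
    """Keep a running list of the n largest earnings in a single pass."""
    reader = csv.reader(csv_file)
    next(reader)  # skip the header row
    top = []      # (earnings, company) pairs, kept sorted largest first
    for row in reader:
        entry = (float(row[6]), row[0])
        # top[-1] is the smallest entry currently in the list
        if len(top) < n or entry[0] > top[-1][0]:
            top.append(entry)
            top.sort(reverse=True)
            del top[n:]  # trim back to n items
    return [(company, amount) for amount, company in top]

running = top_n_running(io.StringIO(SAMPLE), n=3)
for company, amount in running:
    print(company, amount)
```

Only one pass over the data is needed, and the list never holds more than `n + 1` items at a time.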

Jeff