Name the airline with the most total passengers

Question

medsibouh 0 Newbie Poster

5 Years Ago

Hello guys, I'm trying to write code that returns the name of the airline with the highest total number of passengers, read from a file like this:

Alitalia Rome 180
Alitalia Pisa 82
Germanwings Munich 96
Germanwings Frankfurt 163
NorwegianAir Bergen 202
Wizzair London 184
Wizzair Frankfurt 83
Wizzair Lisbon 198

and the output should be:

Wizzair 465     # because this airline has 184 + 83 + 198 = 465 passengers, which is the highest total.

here is the code :

# importing pandas as pd 
import pandas as pd
from collections import counter

# Creating the dataframe 
df = pd.read_csv("airlines.csv")

most_total = 0
counts = 0

for iter, n in enumerate(df.split('\n')):
    most_total += counts + 1
    print(df) 
    print(most_total)

and I got this error:

ImportError: cannot import name 'counter' from 'collections' (C:\Python\Python38-32\lib\collections\__init__.py)

Any ideas, guys?

c c++ python

2 Contributors
14 Replies
878 Views
3 Days Discussion Span
Latest Post 5 Years Ago by Reverend Jim

All 14 Replies

Reverend Jim 5,242 Hi, I'm Jim, one of DaniWeb's moderators.

5 Years Ago

I've never used pandas or collections but I noticed that although

from collections import counter

does not work,

from collections import Counter

does (upper case C in Counter). Perhaps that will help. I also noticed that the data you posted is not comma delimited.

Reverend Jim 5,242 Hi, I'm Jim, one of DaniWeb's moderators.

5 Years Ago

I don't know why you are importing collections when you aren't using it, and I think you are over-complicating things by using pandas dataframes for such a simple application. I suggest you write your process as pseudo-code before you try to write actual code. Debug your pseudo-code with pencil and paper first, using a small example dataset.

From what I can see you need to add a header line to your CSV to name the columns. For example, if your file looks like

Airline,Flights
Alitalia Rome,180
Alitalia Pisa,82
Germanwings Munich,96
Germanwings Frankfurt,163
NorwegianAir Bergen,202
Wizzair London,184
Wizzair Frankfurt,83
Wizzair Lisbon,198

then to iterate through the records you could do

for airline,flights in zip(df['Airline'],df['Flights']):
    print(airline,flights)

There are likely other ways to do this but I got this from a brief look at the 'getting started' guide. Let me know how it goes and where you are stuck and we can take it from there. I'll check back in a couple of hours.

Edited 5 Years Ago by Reverend Jim

medsibouh commented: Good +0

Reverend Jim 5,242 Hi, I'm Jim, one of DaniWeb's moderators.

5 Years Ago

That's sort of pseudo-code but still too code-ish. Try to write it like you are telling a person how to do it step-wise with pencil and paper. Your pseudo-code shouldn't have terms like 'open file', 'string', 'integer', or whatever a 'boucle' is.

There is a very useful feature in Python called a dictionary. A dictionary uses key-value pairs. Unlike a list, which you index by an integer, a dictionary is indexed by a key, which can be any hashable type (a string, for example). If you use the airline name as the key and the number of flights as the value you can keep a running total for each airline. The only gotcha is that you have to check whether the key already exists before you add to it. Start with

totals = {}     # create an empty dictionary

If you split each line into three fields named airline, route and flights then you can do

if airline not in totals: totals[airline] = 0
totals[airline] += int(flights)     # fields from split() are strings

So take that and try to figure out the rest.
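
To make the running-total idea above concrete, here is a minimal sketch. It assumes the lines have already been read into a list (the `lines` list below simply copies the sample data from the question), and it keeps the name `flights` used above for the number in the last column. The `int()` conversion is there because fields that come out of `split()` are strings.

```python
# Sample data copied from the question; in practice these lines
# would be read from the data file.
lines = [
    "Alitalia Rome 180",
    "Alitalia Pisa 82",
    "Germanwings Munich 96",
    "Germanwings Frankfurt 163",
    "NorwegianAir Bergen 202",
    "Wizzair London 184",
    "Wizzair Frankfurt 83",
    "Wizzair Lisbon 198",
]

totals = {}  # airline name -> running total

for line in lines:
    # Assumes exactly three whitespace-separated fields per line.
    airline, route, flights = line.split()
    if airline not in totals:
        totals[airline] = 0
    totals[airline] += int(flights)  # convert the string before adding

print(totals)
```

This only builds the per-airline totals; picking the largest one is a separate step.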

Reverend Jim 5,242 Hi, I'm Jim, one of DaniWeb's moderators.

5 Years Ago

If you are going to use pandas then forget the standard file open and just do

df = pd.read_csv("airlines.txt")

and keep in mind that in your for loop, Airline will contain both the airline name and the route name. For example, one line of

Alitalia Rome      180

will give you "Alitalia Rome". You might want to try

for route,flights in zip(df['Airline'],df['Flights']):

and then separate the airline portion by

airline,*rest = route.split()

If you are unfamiliar with this expression, what it does is split the string on whitespace. The first token (the airline) goes into airline and the remaining tokens go into rest as a list. This saves you from an error if the route has more than one word.
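
A small sketch of how that starred assignment behaves. The strings are illustrative only, and "Wizzair New York" is a made-up route used to show the multi-word case; it is not in the posted data.

```python
# One-word route: rest is a list holding the single remaining token.
airline, *rest = "Alitalia Rome".split()
print(airline, rest)      # Alitalia ['Rome']

# Hypothetical multi-word route: the extra tokens all land in the list.
airline2, *rest2 = "Wizzair New York".split()
print(airline2, rest2)    # Wizzair ['New', 'York']

# Without the star, the same input cannot be unpacked into two names.
try:
    airline3, route3 = "Wizzair New York".split()
except ValueError as err:
    print("unpacking failed:", err)
```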

Once you have the dictionary built you'll still need to iterate through it to find the entry with the largest number of flights.
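
As a rough sketch of that last step, assuming the dictionary already holds the per-airline totals worked out from the sample data in the question (the variable names `best_airline` and `best_total` are just illustrative):

```python
# Totals as they come out of summing the sample data from the question.
totals = {"Alitalia": 262, "Germanwings": 259, "NorwegianAir": 202, "Wizzair": 465}

# Explicit loop: remember the best entry seen so far.
best_airline, best_total = None, -1
for airline, total in totals.items():
    if total > best_total:
        best_airline, best_total = airline, total

# The built-in max() can do the same: iterate the keys, compare by value.
same_airline = max(totals, key=totals.get)

print(best_airline, best_total)   # Wizzair 465
print(same_airline)               # Wizzair
```

Either form works; the loop is easier to follow when you are starting out, and the `max()` call is the shorter version once the idea is clear.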

Edited 5 Years Ago by Reverend Jim

medsibouh 0 Newbie Poster · Answer 1 · 2020-04-13T02:31:57+00:00

@Reverend Jim thank you so much, dude, you're really keeping my hopes up.

I saw an example that used pandas and thought the exercise would be easier with it, but you're right. I'm still searching; if you find something else, please share it with me, bro. Thanks a lot.

score 0 · Answer 2 · 2020-04-13T06:00:06+00:00

Reverend Jim 5,242 Hi, I'm Jim, one of DaniWeb's moderators.

5 Years Ago

Let's have a look at what you have now following my suggestions first.

medsibouh 0 Newbie Poster · Answer 3 · 2020-04-13T06:08:09+00:00

All right, I'll try them. Then I'll share what I got here. Thank you

medsibouh 0 Newbie Poster · Answer 4 · 2020-04-13T07:03:23+00:00

Here is my pseudo code:

Open the data file
Read the file with all its (strings, integers)

Initiate the most_total number of passengers

Make a boucle 'for' flights, most_total
    Check the number of passengers in each flight

Find the greatest sum according to the same flight

Print (the flight, the sum of all passengers of this flight)

End

Here it is, Mr. Jim, could you check it out?
My problem is how to add up the numbers that belong to the same name. I don't have any idea how to collect numbers according to strings.

medsibouh 0 Newbie Poster · Answer 5 · 2020-04-13T16:45:14+00:00

All right sir. I'll try that.
Thank you so much for your time.

Edited 5 Years Ago by medsibouh

medsibouh 0 Newbie Poster · Answer 6 · 2020-04-14T02:08:18+00:00

Hello sir Jim, I tried pseudo-code with pencil and paper as you advised me, and I arrived at this code:

with open("airlines.txt", "r") as file:
    content = file.readlines()
    content = sorted(content)

df = 0
df = str(content)
totals = {}    #empty dic
Airline = ''   # initialize the strings
Flights = ''
for Airline, Flights in zip(df['Airline'],df['Flights']):

    if airline not in totals:
        totals[airline] = 0
        totals[airline] += flights

        print(airline,flights)

I got this error. I feel close to the right result, but I don't know exactly what I'm missing here:

for Airline, Flights in zip(df['Airline'],df['Flights']):
TypeError: string indices must be integers

I tried to convert some indices in the code but without any success.

medsibouh 0 Newbie Poster · Answer 7 · 2020-04-14T02:52:05+00:00

Okay, I'll try it now. Thank you.

medsibouh 0 Newbie Poster · Answer 8 · 2020-04-14T04:26:51+00:00

Finally I tried this with the help of a friend:

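from collections import defaultdict

totals = defaultdict(int)   # missing airlines start at 0
with open('f.txt') as data:
    for line in data:
        fields = line.split()
        if len(fields) != 3:    # skip blank or malformed lines
            continue
        airline, _route, passengers = fields
        totals[airline] += int(passengers)
# compare by total (pair[1]); pair[0] would pick the alphabetically last name
print("%s  %d" % max(totals.items(), key=lambda pair: pair[1]))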

I had a problem with pandas: it's not compatible with my Python installation.
Thank you, Mr. Jim, for your time and your stack trace. This is the final version of the code, and it works perfectly.

score 0 · Answer 9 · 2020-04-14T08:24:45+00:00

I'm glad we finally got there. I seem to have had a blind spot about the meaning of the last column though. I hope that didn't cause too much confusion.

score 1 · Answer 10 · 2020-04-16T16:43:47+00:00

If you are interested, here is one implementation using pandas. Note that each data line must be comma delimited (for example Alitalia Rome,180) and the first line in the data file must be

Routes,Passengers

import pandas

df = pandas.read_csv("airlines.csv")

# calculate total passengers per airline

totals = {}

for route,passengers in zip(df['Routes'],df['Passengers']):
    airline,*rest = route.split()   # first word is the airline name
    totals[airline] = totals.get(airline,0) + passengers

# print out the totals for each airline and find max total

maxairline = None   # stays None if the file has no data rows
maxpassengers = 0

for airline,passengers in totals.items():
    print(airline, passengers)
    if passengers > maxpassengers:
        maxpassengers = passengers
        maxairline = airline

print("\nmax airline is",maxairline,'with',maxpassengers,'passengers')
