Issue
So, I am studying pandas and one of the tasks was to sort values of the table by borough and holidays(data in this column is only "Y"(if it is holiday) and "N"(if not). Also I avereged pickups(as task asked), but I wanna show data specifically on "Y".
import pandas as pd
df = pd.read_csv("2_taxi_nyc.csv")
df.groupby(["borough", "hday"]).agg({"pickups": 'mean'})type here
Result:
I tried using df.query, but it didnt work (error was smth like exp is not supposed to be a "class bool" and in another try a "list"). After that I
ve been stuck and still can`t find a solution or even an idea how to solve this in the Net. I hope someone will help me. I apologize for my poor english(not a first language). Thanks in advance.
Solution
You have 2 ways to do that:
- filter results after aggregation
- filter data before aggregation
Case 1:
>>> (df.groupby(['borough', 'hday'], as_index=False) # <- group labels as columns
.agg({'pickups': 'mean'}).query("hday == 'Y'")) # <- filter with query
borough hday pickups
1 Bronx Y 48.065868
3 Brooklyn Y 527.011976
5 EWR Y 0.041916
7 Manhattan Y 2035.928144
9 Queens Y 320.730539
11 Staten Island Y 1.497006
Case 2:
>>> df[df['hday'] == 'Y'].groupby('borough', as_index=False)['pickups'].mean()
borough pickups
0 Bronx 48.065868
1 Brooklyn 527.011976
2 EWR 0.041916
3 Manhattan 2035.928144
4 Queens 320.730539
5 Staten Island 1.497006
Source dataset: uber_nyc_enriched.csv
Answered By - Corralien
0 comments:
Post a Comment
Note: Only a member of this blog may post a comment.