Issue
I am just trying to get some data and re-arrange it. Here is my dataset showing foods and the scores they received in different years. What I want to do is find the foods which had the lowest and highest scores on average and track their scores across the years.
The next part is where I am a little stuck: I'd need to display the max and min foods from the original dataset that would show all the columns - Food, year, Score. This is what I have tried, but it doesn't work:
menu[menu.Food == Max & menu.Food == Min]
Basically I want it to display something like the below in a dataframe, so I can plot some graphs (i.e. I then want to make a line plot with the years on the x-axis and the scores on the y-axis, showing the lowest scoring food and the top scoring food):
If you guys know any other ways of doing this, please let me know!
Any help would be appreciated
Solution
You can select the first and last rows per year with Series.duplicated: invert each mask with ~, chain the two with | for bitwise OR, and filter with boolean indexing:
df1 = df[~df['year'].duplicated() | ~df['year'].duplicated(keep='last')]
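To see what the two masks do, here is a minimal sketch on made-up data. The sample values are hypothetical; only the column names (Food, year, Score) come from the question.

```python
import pandas as pd

# Hypothetical sample data; column names follow the question.
df = pd.DataFrame({
    "Food": ["pizza", "salad", "soup", "pizza", "salad", "soup"],
    "year": [2019, 2019, 2019, 2020, 2020, 2020],
    "Score": [8, 5, 6, 9, 4, 7],
})

# duplicated() flags every repeat of a year after its first occurrence,
# so the inverted mask is True only on the first row of each year.
is_first = ~df["year"].duplicated()

# keep='last' flags every occurrence except the last one,
# so the inverted mask is True only on the last row of each year.
is_last = ~df["year"].duplicated(keep="last")

# Bitwise OR keeps a row if it is first or last within its year.
df1 = df[is_first | is_last]
print(df1)
```

Note that "first" and "last" here refer to row order only, so the result depends on how the frame is sorted.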
Solution with groupby:
df1 = df.groupby('year').agg(['first','last']).stack(1).droplevel(1).reset_index()
If you need the rows with the minimal and maximal Score per year:
df = df.sort_values(['year','Score'])
df2 = df[~df['year'].duplicated() | ~df['year'].duplicated(keep='last')]
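A small sketch of the sort-then-mask idea, assuming the sort is on year and then Score so that the first row of each year holds the lowest score and the last row the highest. The data is hypothetical.

```python
import pandas as pd

# Hypothetical sample data; column names follow the question.
df = pd.DataFrame({
    "Food": ["pizza", "salad", "soup", "pizza", "salad", "soup"],
    "year": [2019, 2019, 2019, 2020, 2020, 2020],
    "Score": [8, 5, 6, 9, 4, 7],
})

# Within each year, order the rows from lowest to highest Score.
df_sorted = df.sort_values(["year", "Score"])

# First row per year = lowest Score, last row per year = highest Score.
mask = (~df_sorted["year"].duplicated()
        | ~df_sorted["year"].duplicated(keep="last"))
df2 = df_sorted[mask]
print(df2)
```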
Solution with groupby:
df2 = df.loc[df.groupby('year')['Score'].agg(['idxmax','idxmin']).stack()]
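The snippets above work per year. For the question as originally asked (the foods with the lowest and highest average score, with all of their rows), one possible approach is sketched below. It is not part of the answer above; the data is hypothetical and the name menu is taken from the question.

```python
import pandas as pd

# Hypothetical sample data; column names and the name `menu` follow the question.
menu = pd.DataFrame({
    "Food": ["pizza", "salad", "soup", "pizza", "salad", "soup"],
    "year": [2019, 2019, 2019, 2020, 2020, 2020],
    "Score": [8, 5, 6, 9, 4, 7],
})

# Average score per food across all years.
mean_scores = menu.groupby("Food")["Score"].mean()

# Names of the foods with the highest and lowest average.
top_food = mean_scores.idxmax()
bottom_food = mean_scores.idxmin()

# Keep every row (Food, year, Score) belonging to either food.
selected = menu[menu["Food"].isin([top_food, bottom_food])]

# One column per food, indexed by year: a convenient shape for a line plot.
wide = selected.pivot(index="year", columns="Food", values="Score")
print(wide)
```

The attempt in the question fails because a single row cannot equal both Max and Min at once, and & binds tighter than ==, so isin (or two parenthesised comparisons joined with |) is needed. If matplotlib is installed, wide.plot() should draw one line per food with the years on the x-axis.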
Answered By - jezrael