Issue
Select the first row for each first-level index value of a MultiIndex pandas DataFrame.
grouped = ecommerce[["category_id", "brand", "price"]].groupby(by=["category_id", "brand"]).mean()
grouped_sort = grouped.sort_values(by=["category_id", "price"], ascending=False)
Now, from this DataFrame, I want to select, for each category, only the brand with the highest price.
Can anyone help me?
Solution
The following code can help:
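# Group the sorted frame by the first index level (category_id).
gsgb = grouped_sort.copy()
gsgb = gsgb.groupby(level=0)
print(type(gsgb))
# For each category, sort its brands by price and show the top row.
# display() is available in Jupyter/IPython; use print() elsewhere.
for cat, df in gsgb:
    display(df.sort_values(by=["price"], ascending=False).reset_index().iloc[0])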
Working:
It loops over every category in the grouped DataFrame, sorts that category's rows by price in descending order, resets the index, and then takes the first row, which is the brand with the highest price.
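As an alternative to the explicit loop, the same result can likely be obtained without iterating. The sketch below is not from the original answer. It uses a small made-up `ecommerce` frame (the real data is not shown in the question) and two standard pandas approaches: `groupby(...).head(1)` on a price-sorted frame, and `idxmax` to look up the label of the highest price in each category. The variable names `top_by_head` and `top_by_idxmax` are illustrative only.

```python
import pandas as pd

# Hypothetical sample data standing in for the asker's `ecommerce` frame.
ecommerce = pd.DataFrame({
    "category_id": [1, 1, 1, 2, 2, 2],
    "brand": ["a", "a", "b", "c", "d", "d"],
    "price": [10.0, 20.0, 30.0, 5.0, 1.0, 3.0],
})

# Same aggregation as in the question: mean price per (category_id, brand).
grouped = (
    ecommerce[["category_id", "brand", "price"]]
    .groupby(by=["category_id", "brand"])
    .mean()
)

# Option 1: sort by price (highest first), then keep the first row per category.
top_by_head = (
    grouped.sort_values(by="price", ascending=False)
    .groupby(level="category_id")
    .head(1)
    .sort_index()
)

# Option 2: find the (category_id, brand) label of the max price per category,
# then select those rows from the grouped frame.
best_labels = grouped.groupby(level="category_id")["price"].idxmax().tolist()
top_by_idxmax = grouped.loc[best_labels]

print(top_by_head)
```

Both options keep the (category_id, brand) MultiIndex, so the winning brand stays visible in the index. If two brands tie for the highest price in a category, each option returns only one of them.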
Answered By - sotmot