Sunday, June 5, 2022

[FIXED] Pandas - sort and head inside groupby

Labels: dataframe, group-by, pandas, python

Issue

I have the following DataFrame:

                       uniq_id    value
2016-12-26 11:03:10        001      342
2016-12-26 11:03:13        004        5
2016-12-26 12:03:13        005       14
2016-12-26 12:03:13        008      114
2016-12-27 11:03:10        009      343
2016-12-27 11:03:13        013        5
2016-12-27 12:03:13        016      124
2016-12-27 12:03:13        018      114

I need to get the top N records for each day, sorted by value. Something like this (for N=2):

2016-12-26   001   342
             008   114
2016-12-27   009   343
             016   124

Please suggest the right way to do that in pandas 0.19.x.

Solution

Unfortunately, there is no DataFrameGroupBy.nlargest() method yet, which would allow us to do the following:

df.groupby(...).nlargest(2, columns=['value'])

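So here is a slightly ugly but working solution. It replaces the index with its normalized (midnight) dates, sorts by day ascending and value descending, and then takes the first two rows of each day: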

In [73]: df.set_index(df.index.normalize()).reset_index().sort_values(['index','value'], ascending=[1,0]).groupby('index').head(2)
Out[73]:
       index  uniq_id  value
0 2016-12-26        1    342
3 2016-12-26        8    114
4 2016-12-27        9    343
6 2016-12-27       16    124

PS: I think there must be a better way...
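The sort-then-head idea above can also be written as a small standalone script. This is only a sketch: the helper name top_n_per_day and the "day" column are made up for illustration, and the frame is rebuilt by hand from the sample data in the question (including the duplicated timestamps).

```python
import pandas as pd

# Rebuild the sample frame from the question (duplicate timestamps included).
index = pd.to_datetime([
    "2016-12-26 11:03:10", "2016-12-26 11:03:13",
    "2016-12-26 12:03:13", "2016-12-26 12:03:13",
    "2016-12-27 11:03:10", "2016-12-27 11:03:13",
    "2016-12-27 12:03:13", "2016-12-27 12:03:13",
])
df = pd.DataFrame(
    {
        "uniq_id": [1, 4, 5, 8, 9, 13, 16, 18],
        "value": [342, 5, 14, 114, 343, 5, 124, 114],
    },
    index=index,
)


def top_n_per_day(frame, n, column="value"):
    """Hypothetical helper: n rows with the largest `column` per calendar day."""
    out = frame.copy()
    # Truncate each timestamp to midnight so rows of one day share a key.
    out["day"] = out.index.normalize()
    # Day ascending, value descending: the best rows come first within a day.
    out = out.sort_values(["day", column], ascending=[True, False])
    # head(n) keeps the first n rows of every group, in sorted order.
    return out.groupby("day").head(n)


top2 = top_n_per_day(df, 2)
print(top2)
```

Because nothing here relies on the index being unique, this variant should cope with the repeated timestamps in the original data.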

UPDATE: if your DataFrame has no duplicated index values, the following solution should work as well:

In [117]: df
Out[117]:
                     uniq_id  value
2016-12-26 11:03:10        1    342
2016-12-26 11:03:13        4      5
2016-12-26 12:03:13        5     14
2016-12-26 12:33:13        8    114    # <-- I've intentionally changed this index value
2016-12-27 11:03:10        9    343
2016-12-27 11:03:13       13      5
2016-12-27 12:03:13       16    124
2016-12-27 12:33:13       18    114    # <-- I've intentionally changed this index value

In [118]: df.groupby(pd.TimeGrouper('D')).apply(lambda x: x.nlargest(2, 'value')).reset_index(level=1, drop=1)
Out[118]:
            uniq_id  value
2016-12-26        1    342
2016-12-26        8    114
2016-12-27        9    343
2016-12-27       16    124
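Note that pd.TimeGrouper was deprecated and then removed in later pandas releases; pd.Grouper(freq='D') is its replacement and was already available in 0.19.x. A minimal sketch of the same approach with pd.Grouper, assuming the modified frame above with unique timestamps (the frame is rebuilt by hand here, and group_keys=True is spelled out only to make the intent explicit):

```python
import pandas as pd

# Sample frame with unique timestamps, as in the UPDATE above.
index = pd.to_datetime([
    "2016-12-26 11:03:10", "2016-12-26 11:03:13",
    "2016-12-26 12:03:13", "2016-12-26 12:33:13",
    "2016-12-27 11:03:10", "2016-12-27 11:03:13",
    "2016-12-27 12:03:13", "2016-12-27 12:33:13",
])
df = pd.DataFrame(
    {
        "uniq_id": [1, 4, 5, 8, 9, 13, 16, 18],
        "value": [342, 5, 14, 114, 343, 5, 124, 114],
    },
    index=index,
)

top2 = (
    # Bin the DatetimeIndex into calendar days.
    df.groupby(pd.Grouper(freq="D"), group_keys=True)
    # Within each day, keep the two rows with the largest `value`.
    .apply(lambda g: g.nlargest(2, "value"))
    # Drop the original timestamp level; only the day level remains.
    .reset_index(level=1, drop=True)
)
print(top2)
```

The result is indexed by day, with two rows per day ordered from largest to smallest value.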

Answered By - MaxU - stop genocide of UA

This Answer collected from stackoverflow and tested by PythonFixing community admins, is licensed under cc by-sa 2.5 , cc by-sa 3.0 and cc by-sa 4.0
