Monday, November 14, 2022

[FIXED] Groupby sum and count on multiple columns in python

November 14, 2022 pandas, pandas-groupby, python, python-2.7, python-3.x No comments

Issue

I have a pandas dataframe that looks like this

ID     country   month   revenue  profit   ebit
234    USA       201409   10        5       3
344    USA       201409    9        7       2
532    UK        201410    20       10      5
129    Canada    201411    15       10      5

I want to group by ID, country, month and count the IDs per month and country and sum the revenue, profit, ebit. The output for the above data would be:

 country   month    revenue   profit  ebit   count
   USA     201409     19        12      5      2
   UK      201409     20        10      5      1
   Canada  201411     15        10      5      1

I have tried different variations of groupby, sum and count functions of pandas but I am unable to figure out how to apply groupby sum and count all together to give the result as shown. Please share any ideas that you might have. Thanks!

Solution

It can be done using pivot_table this way:

>>> df1=pd.pivot_table(df, index=['country','month'],values=['revenue','profit','ebit'],aggfunc=np.sum)
>>> df1 
                ebit  profit  revenue
country month                        
Canada  201411     5      10       15
UK      201410     5      10       20
USA     201409     5      12       19

>>> df2=pd.pivot_table(df, index=['country','month'], values='ID',aggfunc=len).rename('count')
>>> df2

country  month 
Canada   201411    1
UK       201410    1
USA      201409    2

>>> pd.concat([df1,df2],axis=1)

                ebit  profit  revenue  count
country month                               
Canada  201411     5      10       15      1
UK      201410     5      10       20      1
USA     201409     5      12       19      2

UPDATE

It can be done in one-line using pivot_table and providing a dict of functions to apply to each column in the aggfunc argument:

pd.pivot_table(
   df,
   index=['country','month'],
   aggfunc={'revenue': np.sum, 'profit': np.sum, 'ebit': np.sum, 'ID': len}
).rename(columns={'ID': 'count'})

                count  ebit  profit  revenue
country month                               
Canada  201411      1     5      10       15
UK      201410      1     5      10       20
USA     201409      2     5      12       19

Answered By - Mabel Villalba

This Answer collected from stackoverflow and tested by PythonFixing community admins, is licensed under cc by-sa 2.5 , cc by-sa 3.0 and cc by-sa 4.0

Monday, November 14, 2022

[FIXED] Groupby sum and count on multiple columns in python

Issue

Solution

0 comments:

Post a Comment

Popular Posts

Labels