Issue
I have a pandas dataframe that looks like this:
ID   country  month   revenue  profit  ebit
234  USA      201409  10       5       3
344  USA      201409  9        7       2
532  UK       201410  20       10      5
129  Canada   201411  15       10      5
I want to group by country and month, count the IDs in each group, and sum the revenue, profit, and ebit. The output for the above data would be:
country  month   revenue  profit  ebit  count
USA      201409  19       12      5     2
UK       201410  20       10      5     1
Canada   201411  15       10      5     1
I have tried different variations of the groupby, sum, and count functions in pandas, but I can't figure out how to apply a grouped sum and count together to get the result shown. Please share any ideas you might have. Thanks!
Solution
It can be done using pivot_table this way:
>>> import numpy as np
>>> import pandas as pd
>>> df1=pd.pivot_table(df, index=['country','month'],values=['revenue','profit','ebit'],aggfunc=np.sum)
>>> df1
                ebit  profit  revenue
country month
Canada  201411     5      10       15
UK      201410     5      10       20
USA     201409     5      12       19
>>> df2=pd.pivot_table(df, index=['country','month'], values=['ID'],aggfunc=len).rename(columns={'ID': 'count'})
>>> df2
                count
country month
Canada  201411      1
UK      201410      1
USA     201409      2
>>> pd.concat([df1,df2],axis=1)
                ebit  profit  revenue  count
country month
Canada  201411     5      10       15      1
UK      201410     5      10       20      1
USA     201409     5      12       19      2
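For anyone who wants to run the two-step approach end to end, here is a minimal sketch. It assumes the sample data from the question, rebuilt by hand as a DataFrame, and it uses the string aggregators 'sum' and 'count' in place of np.sum and len, which is a substitution and not part of the original answer. The names sums, counts, and result are only illustrative.

```python
import pandas as pd

# Sample data copied from the question (assumed to be integer-typed).
df = pd.DataFrame({
    "ID": [234, 344, 532, 129],
    "country": ["USA", "USA", "UK", "Canada"],
    "month": [201409, 201409, 201410, 201411],
    "revenue": [10, 9, 20, 15],
    "profit": [5, 7, 10, 10],
    "ebit": [3, 2, 5, 5],
})

# Step 1: sum the numeric columns per (country, month).
sums = pd.pivot_table(
    df,
    index=["country", "month"],
    values=["revenue", "profit", "ebit"],
    aggfunc="sum",
)

# Step 2: count the IDs per (country, month) and label the column "count".
# Passing values as a list keeps the result a DataFrame.
counts = pd.pivot_table(
    df,
    index=["country", "month"],
    values=["ID"],
    aggfunc="count",
).rename(columns={"ID": "count"})

# Step 3: both frames share the same MultiIndex, so they align on concat.
result = pd.concat([sums, counts], axis=1)
print(result)
```

Note that 'count' skips missing IDs while len counts every row; on this data the two agree.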
UPDATE
It can be done in one line using pivot_table and providing a dict of functions to apply to each column in the aggfunc argument:
pd.pivot_table(
    df,
    index=['country','month'],
    aggfunc={'revenue': np.sum, 'profit': np.sum, 'ebit': np.sum, 'ID': len}
).rename(columns={'ID': 'count'})
                count  ebit  profit  revenue
country month
Canada  201411      1     5      10       15
UK      201410      1     5      10       20
USA     201409      2     5      12       19
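An alternative sketch, not taken from the answer above: since the question asks about groupby, the same table can likely also be produced with groupby and named aggregation (available in pandas 0.25 and later). It assumes the sample data from the question, and the variable name summary is only illustrative.

```python
import pandas as pd

# Sample data copied from the question (assumed to be integer-typed).
df = pd.DataFrame({
    "ID": [234, 344, 532, 129],
    "country": ["USA", "USA", "UK", "Canada"],
    "month": [201409, 201409, 201410, 201411],
    "revenue": [10, 9, 20, 15],
    "profit": [5, 7, 10, 10],
    "ebit": [3, 2, 5, 5],
})

# Named aggregation: each keyword is an output column,
# given as a (source column, aggregation) pair.
summary = (
    df.groupby(["country", "month"], as_index=False)
      .agg(
          revenue=("revenue", "sum"),
          profit=("profit", "sum"),
          ebit=("ebit", "sum"),
          count=("ID", "count"),
      )
)
print(summary)
```

With as_index=False the grouping keys stay as ordinary columns, so the column order follows the layout requested in the question (country, month, revenue, profit, ebit, count).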
Answered By - Mabel Villalba