Issue
My DataFrame has four columns: col_A, col_B, col_C, and col_D. I need to group by col_A, col_B, and col_C and compute the mean of col_D. The snippet below works:
df.groupby(['col_A','col_B','col_C']).agg({'col_D':'mean'}).reset_index()
In addition to the mean, I also need the row count for each (col_A, col_B, col_C) group in the same result. Any help would be appreciated.
Solution
Using Named Aggregation:
result = (
    df.groupby(['col_A', 'col_B', 'col_C'], as_index=False)
      .agg(mean=('col_D', 'mean'), count=('col_D', 'count'))
)
For the count column, you have two choices of aggregate function:
count=('col_D', 'count') counts only the non-NaN values in col_D.
count=('col_D', 'size') counts every row in the group, including rows where col_D is NaN.
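To see how the two functions differ, here is a minimal sketch on made-up data. The sample values and the output column names n_non_null and n_rows are illustrative assumptions, not part of the original question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: one group has a NaN in col_D.
df = pd.DataFrame({
    "col_A": ["x", "x", "x", "y"],
    "col_B": [1, 1, 1, 2],
    "col_C": ["p", "p", "p", "q"],
    "col_D": [10.0, np.nan, 30.0, 5.0],
})

result = (
    df.groupby(["col_A", "col_B", "col_C"], as_index=False)
      .agg(
          mean=("col_D", "mean"),         # mean skips NaN by default
          n_non_null=("col_D", "count"),  # non-NaN values only
          n_rows=("col_D", "size"),       # all rows in the group
      )
)
print(result)
```

For the (x, 1, p) group, n_non_null should be 2 while n_rows is 3, because one of its three rows has NaN in col_D. If col_D never contains NaN, both functions give the same number.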
Answered By - Code Different