Issue
I know I can calculate the unique count, mean, median, kurtosis, and skewness individually and merge the results into one dataframe, but that is a lot of steps compared to R's data.table, where all of these can be calculated in one step. Is there a way to do a groupby and calculate all of these in one step in Python as well?
df <- data[, .(ItemCount  = uniqueN(Item),
               Median_val = median(Value),
               Avg_val    = mean(Value),
               Skew_val   = skewness(Value),
               Kurt_val   = kurtosis(Value)),
           .(Year, Category)][order(Year, Category)]
Solution
With the reference link RootTwo provided in the comment section, I was able to answer my own question. Note that for kurtosis we cannot use aggfunc="kurt"; it returns the error: 'SeriesGroupBy' object has no attribute 'kurt'. A callable has to be passed instead.
Below is my solution:
import pandas as pd

df = (data.groupby(['Year', 'Category'], as_index=False)
          .agg(ItemCount=pd.NamedAgg(column="Item", aggfunc="nunique"),
               mean=pd.NamedAgg(column="Value", aggfunc="mean"),
               median=pd.NamedAgg(column="Value", aggfunc="median"),
               skew=pd.NamedAgg(column="Value", aggfunc="skew"),
               # "kurt" is not accepted as a string here, so pass a callable;
               # each group's values arrive as a Series, hence pd.Series.kurt
               kurt=pd.NamedAgg(column="Value", aggfunc=pd.Series.kurt))
     )
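As a self-contained sketch of the same idea, the example below runs the grouped aggregation on a small made-up dataset. The sample values are invented for illustration, the output column names are borrowed from the R snippet in the question, and the tuple shorthand `(column, aggfunc)` is used in place of `pd.NamedAgg`. It assumes `pd.Series.kurt` is an acceptable kurtosis callable for your data.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
data = pd.DataFrame({
    "Year": [2020] * 5 + [2021] * 5,
    "Category": ["A"] * 10,
    "Item": ["x", "y", "x", "z", "y", "x", "x", "y", "y", "x"],
    "Value": [1.0, 2.0, 3.0, 4.0, 10.0, 2.0, 4.0, 4.0, 6.0, 9.0],
})

# One groupby, several named aggregations. Each keyword becomes an output
# column, and each tuple is (source column, aggregation function).
summary = (
    data.groupby(["Year", "Category"], as_index=False)
        .agg(
            ItemCount=("Item", "nunique"),
            Median_val=("Value", "median"),
            Avg_val=("Value", "mean"),
            Skew_val=("Value", "skew"),
            # Kurtosis is passed as a callable rather than the string "kurt".
            Kurt_val=("Value", pd.Series.kurt),
        )
)

print(summary)
```

groupby sorts by the group keys by default, so the result is already ordered by Year and Category, like the `order(Year, Category)` step in the R version. Keep in mind that pandas' `kurt` reports excess kurtosis (a normal distribution gives 0), so the numbers may differ from the R output depending on which package supplied `kurtosis` there.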
Answered By - Jiamei