Issue
Get the count of transactions by region where customers had sales greater than $10,000 and sales less than $10,000. (Hint: create two columns for counting transaction IDs, one where customers had sales greater than $10,000 and another where customers had sales less than $10,000.)
I am having trouble figuring out how to approach this problem, since transaction_id has all unique values. How do I group by region in pandas?
df_3 = dataset.groupby(['region', 'transaction_id'], as_index=False)['sales'].sum()
df_3
The code above gives one summed sales value per region and transaction_id.
From df_3 I then got the sales values above and below 10,000, but I don't know how to get the count of transactions by region.
Solution
I hope this is the solution you are looking for. Do upvote and accept the solution if it helps.
# Flag each row: 1 if sales are at least 10,000, otherwise 0.
dataset.loc[dataset["sales"] < 10000, "10k_above"] = 0
dataset.loc[dataset["sales"] >= 10000, "10k_above"] = 1
# Aggregate per region: total transactions, plus the counts on each side of the threshold.
df_results = dataset.groupby(by=["region"], as_index=False).agg(
    transaction_count=("transaction_id", "count"),
    above_10k_count=("10k_above", "sum"),
    below_10k_count=("10k_above", lambda x: (x == 0).sum()),
)
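To see the idea end to end, here is a minimal self-contained sketch. The sample data is hypothetical (only the column names region, transaction_id and sales come from the question), and it uses two boolean flag columns instead of a single 0/1 column, which is a variation on the approach above rather than the same code. It also uses strict comparisons, so a sale of exactly 10,000 would land in neither count; adjust the operators if that boundary should be included.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
dataset = pd.DataFrame({
    "region": ["East", "East", "East", "West", "West"],
    "transaction_id": [1, 2, 3, 4, 5],
    "sales": [12000, 500, 15000, 9000, 20000],
})

# Boolean flags per row; True counts as 1 when summed.
flags = dataset.assign(
    above_10k=dataset["sales"] > 10000,
    below_10k=dataset["sales"] < 10000,
)

# Group by region only, so each region collapses to a single row.
df_results = flags.groupby("region", as_index=False).agg(
    transaction_count=("transaction_id", "count"),
    above_10k_count=("above_10k", "sum"),
    below_10k_count=("below_10k", "sum"),
)
print(df_results)
```

Grouping by region alone is the key step: including transaction_id in the groupby keys keeps one row per transaction, because every transaction_id is unique.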
Answered By - Raymond Toh