Issue
In a pandas DataFrame, I'm confused about how to use groupby() over consecutive
duplicates in one column while summing the values in another column.
Let's say I have the following DataFrame df:
index  type    value
0      profit     11
1      profit     10
2      loss       -5
3      profit     50
4      profit     15
5      loss      -30
6      loss      -25
7      loss      -10
What I'm looking for is:
index  type    grand
0      profit     21   # total of 11 + 10 = 21
1      loss       -5   # unchanged: this row has no consecutive duplicate
2      profit     65   # total of 50 + 15 = 65
3      loss      -65   # total of -30 - 25 - 10 = -65
What I tried to do:
df['grand'] = df.groupby(df['type'].ne(df['type'].shift()).cumsum()).cumcount()
but that only gives me a count of the consecutive duplicates, not their sum.
I also tried iterating through the rows in several ways, but none of them worked.
Thanks so much!
Solution
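Your grouping key is already right: it labels each run of consecutive equal types. Instead of .cumcount(), which only numbers the rows within each run, aggregate each run with sum: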
out = (
    df.groupby(df["type"].ne(df["type"].shift()).cumsum(), as_index=False)
    .agg({"type": "first", "value": "sum"})
    .rename(columns={"value": "grand"})
)
print(out)
Prints:
     type  grand
0  profit     21
1    loss     -5
2  profit     65
3    loss    -65
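To see why the grouping key works, it may help to build it step by step. The following is a minimal, self-contained sketch using the data from the question. The names starts_new_run and run_id are illustrative and do not come from the answer above, and the sketch uses named aggregation plus reset_index(drop=True) rather than as_index=False, which is a stylistic substitution that should produce the same table.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "type": ["profit", "profit", "loss", "profit", "profit", "loss", "loss", "loss"],
    "value": [11, 10, -5, 50, 15, -30, -25, -10],
})

# True wherever a row's type differs from the row above it.
# The first row compares against NaN, so it is always True.
starts_new_run = df["type"].ne(df["type"].shift())

# Running total of those flags: all rows in the same consecutive run
# end up sharing one id (illustrative name, not part of the original answer).
run_id = starts_new_run.cumsum().rename("run_id")

# Aggregate each run: keep its type and sum its values.
out = (
    df.groupby(run_id)
    .agg(type=("type", "first"), grand=("value", "sum"))
    .reset_index(drop=True)
)
print(out)
```

Here run_id comes out as 1, 1, 2, 3, 3, 4, 4, 4, so the two separate profit runs stay apart instead of being merged, which is what a plain groupby("type") would do.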
Answered By - Andrej Kesely