Wednesday, January 26, 2022

[FIXED] How to reset cumulative sum per group when a certain column is 0 in pandas

January 26, 2022 dataframe, numpy, pandas, pandas-groupby, python No comments

Issue

I have following dataframe in which I want to make a cumulative sum on a certain column (Value), together with a group by on field Group, but reset that cumulative sum to 0 when a value in another column (Quantity) is 0.

Group	Quantity	Value	Cumulative_sum
A	10	200	200
B	5	300	300
A	1	50	250
A	0	100	0
C	5	400	400
A	10	300	300
B	10	200	500
A	15	350	650

I have tried working with blocks like in the code below, but the issue is that it doesn't do the cumulative sum correctly:

blocks = df['Quantity'].eq(0)[::-1].cumsum()[::-1]

df['temp_field'] = (df.groupby(['Group', blocks])
   ['Value'].cumsum()
   .where(df['Quantity']!=0,df['Value'])
)

df['Cumulative_sum'] = np.where(df['Quantity'] == 0, 0, df['temp_field'])

Could anyone help with this?

Solution

For the given resetting condition, use groupby.cumsum to create a Reset grouper that tells us when Quantity hits 0 within each Group:

condition = df.Quantity.eq(0)
df['Reset'] = condition.groupby(df.Group).cumsum()

#   Group  Quantity  Value  Cumulative_sum  Reset
# 0     A        10    200             200      0
# 1     B         5    300             300      0
# 2     A         1     50             250      0
# 3     A         0    100               0      1
# 4     C         5    400             400      0
# 5     A        10    300             300      1
# 6     B        10    200             500      0
# 7     A        15    350             650      1

mask the Value column whenever the resetting condition is met and use another groupby.cumsum on both Group and Reset:

df['Cumul'] = df.Value.mask(condition, 0).groupby([df.Group, df.Reset]).cumsum()

#   Group  Quantity  Value  Cumulative_sum  Reset  Cumul
# 0     A        10    200             200      0    200
# 1     B         5    300             300      0    300
# 2     A         1     50             250      0    250
# 3     A         0    100               0      1      0
# 4     C         5    400             400      0    400
# 5     A        10    300             300      1    300
# 6     B        10    200             500      0    500
# 7     A        15    350             650      1    650

Answered By - tdy

This Answer collected from stackoverflow and tested by PythonFixing community admins, is licensed under cc by-sa 2.5 , cc by-sa 3.0 and cc by-sa 4.0

Wednesday, January 26, 2022

[FIXED] How to reset cumulative sum per group when a certain column is 0 in pandas

Issue

Solution

0 comments:

Post a Comment

Popular Posts

Labels