Issue
I have a DataFrame df as follows:
dates values
2020-01-01 00:15:00 87.321
2020-01-01 00:30:00 87.818
2020-01-01 00:45:00 88.514
2020-01-01 01:00:00 89.608
2020-01-01 01:15:00 90.802
2020-01-01 01:30:00 91.896
2020-01-01 01:45:00 92.393
2020-01-01 02:00:00 91.995
2020-01-01 02:15:00 90.504
2020-01-01 02:30:00 88.216
2020-01-01 02:45:00 85.929
2020-01-01 03:00:00 84.238
I want to keep only the hourly rows (where the minute is 00), and each kept row should hold the sum of the values in the hour leading up to it, including the row itself.
Example: To find the value at 2020-01-01 01:00:00, the values from 2020-01-01 00:15:00 to 2020-01-01 01:00:00 should be added (87.321 + 87.818 + 88.514 + 89.608 = 353.261). Similarly, to find the value at 2020-01-01 02:00:00, the values from 2020-01-01 01:15:00 to 2020-01-01 02:00:00 should be added (90.802 + 91.896 + 92.393 + 91.995 = 367.086).
Desired output
dates values
2020-01-01 01:00:00 353.261
2020-01-01 02:00:00 367.086
2020-01-01 03:00:00 348.887
I used df['dates'].dt.minute.eq(0) to obtain a boolean mask for the hourly rows, but I could not find a way to sum the values leading up to each of them.
Solution
hourly = (
    df.set_index('dates')  # Set the dates as index
      .resample('1H', closed='right', label='right')  # One bin per hour, closed and labelled on its right edge
      .sum()  # Sum the values that fall in each bin
)
hourly = hourly.reset_index()  # If you want to have the dates as column again
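As a quick check, here is a minimal, self-contained sketch that rebuilds the sample data from the question and applies the same resample call. The DataFrame construction and the name by_ceil are assumptions added for illustration. The frequency is written as "60min" rather than '1H' only because pandas 2.2 deprecated the uppercase 'H' alias in favor of 'h'; the binning is the same. The groupby on dt.ceil is not part of the original answer. It is a different technique included only as a cross-check.

```python
import pandas as pd

# Rebuild the sample data from the question: 12 quarter-hour readings.
df = pd.DataFrame({
    "dates": pd.date_range("2020-01-01 00:15:00", periods=12, freq="15min"),
    "values": [87.321, 87.818, 88.514, 89.608,
               90.802, 91.896, 92.393, 91.995,
               90.504, 88.216, 85.929, 84.238],
})

# Hourly bins of the form (HH:00, HH+1:00], labelled by their right edge,
# so the 01:00 row sums the readings from 00:15 through 01:00.
hourly = (
    df.set_index("dates")
      .resample("60min", closed="right", label="right")
      .sum()
      .reset_index()
)

# Cross-check (illustrative alternative): round every timestamp up to the
# next full hour and sum per group. A timestamp already on the hour stays put.
by_ceil = (
    df.groupby(df["dates"].dt.ceil("60min"))["values"]
      .sum()
      .reset_index()
)

print(hourly.round(3))
```

With closed='right' the reading at HH:00 belongs to the hour that ends there, and label='right' stamps the bin with that same timestamp. With the defaults (closed='left', label='left'), the 01:00 reading would instead be counted in the bin labelled 01:00 together with 01:15 through 01:45.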
Answered By - flurble