Issue
I have some time series data in a pandas data frame like this:
begin | end | mw_values |
---|---|---|
2021-09-14 11:16:00 | 2021-09-14 11:27:11 | 0 |
2021-09-14 11:27:11 | 2021-09-14 11:30:00 | 100 |
2021-09-14 11:30:00 | 2021-09-14 11:33:59 | 1200 |
2021-09-14 11:33:59 | 2021-09-14 11:39:42 | 600 |
2021-09-14 11:39:42 | 2021-09-14 11:59:59 | 400 |
I need the sum of the mw_values distributed into 15 minutes time slots like this:
time_slots_15_min | sum_mw_values |
---|---|
2021-09-14 11:00 | 0 |
2021-09-14 11:15 | 100 |
2021-09-14 11:30 | 2200 |
2021-09-14 11:45 | 0 |
2021-09-14 12:00 | 0 |
Does someone have any idea how I can achieve this?
Note that the intervals between begin and end may overlap 2 time slots. Then the value must be involved in the sum of the time slot where it begins; e.g. the mw_value of 400 in the example from above.
Solution
You can resample your dataframe so sum the data in 15 minute bins. Then you can reindex that frame so it matches your desired start/end/frequency times.
freq = "15min"
new_index = pd.date_range(
"2021-09-14 11:00:00", "2021-09-14 12:00:00", freq=freq
)
out = (
df.resample(freq, on="begin")["mw_values"]
.sum()
.reindex(new_index, fill_value=0)
.to_frame("sum_mw_values")
)
print(out)
sum_mw_values
2021-09-14 11:00:00 0
2021-09-14 11:15:00 100
2021-09-14 11:30:00 2200
2021-09-14 11:45:00 0
2021-09-14 12:00:00 0
Answered By - Cameron Riddell
0 comments:
Post a Comment
Note: Only a member of this blog may post a comment.