Issue
I have the following dataframe (sample):
import pandas as pd
data = [['A', '2022-09-01 10:00:00', False, 2], ['A', '2022-09-01 14:00:00', False, 3],
['B', '2022-09-01 13:00:00', False, 1], ['B', '2022-09-01 16:00:00', True, 4]]
df = pd.DataFrame(data = data, columns = ['group', 'date', 'indicator', 'value'])
group date indicator value
0 A 2022-09-01 10:00:00 False 2
1 A 2022-09-01 14:00:00 False 3
2 B 2022-09-01 13:00:00 False 1
3 B 2022-09-01 16:00:00 True 4
I would like to fill in the missing dates between dates hourly. So each hour that is missing between dates should be filled and the values should be the same as the previous data. Here is the desired output:
data = [['A', '2022-09-01 10:00:00', False, 2], ['A', '2022-09-01 11:00:00', False, 2],
['A', '2022-09-01 12:00:00', False, 2], ['A', '2022-09-01 13:00:00', False, 2],
['A', '2022-09-01 14:00:00', False, 3],
['B', '2022-09-01 13:00:00', False, 1], ['B', '2022-09-01 14:00:00', False, 1],
['B', '2022-09-01 15:00:00', False, 1], ['B', '2022-09-01 16:00:00', True, 4]]
df_desired = pd.DataFrame(data = data, columns = ['group', 'date', 'indicator', 'value'])
group date indicator value
0 A 2022-09-01 10:00:00 False 2
1 A 2022-09-01 11:00:00 False 2
2 A 2022-09-01 12:00:00 False 2
3 A 2022-09-01 13:00:00 False 2
4 A 2022-09-01 14:00:00 False 3
5 B 2022-09-01 13:00:00 False 1
6 B 2022-09-01 14:00:00 False 1
7 B 2022-09-01 15:00:00 False 1
8 B 2022-09-01 16:00:00 True 4
So I was wondering if it is possible to fill the missing dates hourly per group with the previous value in column value using Pandas
?
Solution
You can use:
df['date'] = pd.to_datetime(df['date'])
out = (df
.groupby('group', as_index=False, group_keys=False)
.apply(lambda g: g.set_index('date')
.reindex(pd.date_range(g['date'].min(),
g['date'].max(),
freq='H'))
.ffill(downcast='infer').reset_index()
)
.reset_index(drop=True)
)
output:
index group indicator value
0 2022-09-01 10:00:00 A False 2
1 2022-09-01 11:00:00 A False 2
2 2022-09-01 12:00:00 A False 2
3 2022-09-01 13:00:00 A False 2
4 2022-09-01 14:00:00 A False 3
5 2022-09-01 13:00:00 B False 1
6 2022-09-01 14:00:00 B False 1
7 2022-09-01 15:00:00 B False 1
8 2022-09-01 16:00:00 B True 4
Answered By - mozway
0 comments:
Post a Comment
Note: Only a member of this blog may post a comment.