Issue
I have two data frames:
df1 = pd.DataFrame({'ID':['A', 'A', 'B', 'B'],
'Date':['31.08.2023', '12.09.2023', '13.09.2023', '20.08.2023']})
df2 = pd.DataFrame({'ID':['A', 'A', 'A', 'B', 'B'],
'Date':['30.08.2023', '14.09.2023', '10.09.2023', '28.09.2023', '19.08.2023']})
I want to add two new columns to df1 with the nearest higher and lower date in every ID, so the desire output in this example will look like this:
ID | DATE | DATE_UP | DATE_DOWN |
---|---|---|---|
A | 31.08.2023 | 10.09.2023 | 30.08.2023 |
A | 12.09.2023 | 14.09.2023 | 10.09.2023 |
B | 13.09.2023 | 28.09.2023 | 19.08.2023 |
B | 20.08.2023 | 28.09.2023 | 19.08.2023 |
I know there is a pandas function merge_asof(), but it work only for the nearest merge. I will be greatfull for any ideas how to do it efficiently.
Solution
Code
you can solve it by merge_asof
. merge_asof doesnt work only for the nearest merge. Even nearest is not the default, the default is backward.
at first make datetime column named 'Date1' and sort
df1['Date1'] = pd.to_datetime(df1['Date'], dayfirst=True)
df1 = df1.sort_values('Date1')
df2['Date1'] = pd.to_datetime(df2['Date'], dayfirst=True)
df2 = df2.sort_values('Date1')
df1:
ID Date Date1
3 B 20.08.2023 2023-08-20
0 A 31.08.2023 2023-08-31
1 A 12.09.2023 2023-09-12
2 B 13.09.2023 2023-09-13
df2:
ID Date Date1
4 B 19.08.2023 2023-08-19
0 A 30.08.2023 2023-08-30
2 A 10.09.2023 2023-09-10
1 A 14.09.2023 2023-09-14
3 B 28.09.2023 2023-09-28
next use merge_asof
forward & backward(default)
tmp = pd.merge_asof(df1.reset_index(), df2, on='Date1', by='ID',
direction='forward', suffixes=['', '_up'])
out = pd.merge_asof(tmp, df2, on='Date1', by='ID', suffixes=['', '_down'])\
.drop('Date1', axis=1).set_index('index').sort_index().rename_axis('')
out:
ID Date Date_up Date_down
0 A 31.08.2023 10.09.2023 30.08.2023
1 A 12.09.2023 14.09.2023 10.09.2023
2 B 13.09.2023 28.09.2023 19.08.2023
3 B 20.08.2023 28.09.2023 19.08.2023
Answered By - Panda Kim
0 comments:
Post a Comment
Note: Only a member of this blog may post a comment.