Issue
I have two dataframes. For each Active date in Dataframe 1, I need to find the nearest upcoming Expiry date in Dataframe 2 for the same Category, so that I can pick up the correct Value.
This is a sample. The original data contains thousands of rows.
Dataframe 1
df_1 = pd.DataFrame({'Category': ['A','B'],
'Active date': ['2021-06-20','2021-06-25']})
Dataframe 2
df_2 = pd.DataFrame({'Category': ['A','A','A','A','A','B','B','B'],
'Expiry date': ['2021-05-22','2021-06-23','2021-06-24','2021-06-28','2021-07-26','2021-06-27','2021-06-28','2021-08-29'],
'Value': [20,21,23,45,12,34,17,34]})
Final Output -
The code I was trying -
df = pd.merge(df_1, df_2, on='Category', how='inner')
#Removed all the dates which are less than Active date
df = df.loc[(df_1['Active Date'] <= df_2['Expiry Date'])]
Solution
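I believe this solution keeps most of your existing code and will accomplish what you are looking for. Two things needed fixing in your filter: it has to compare columns of the merged df rather than df_1 and df_2, and column names are case-sensitive ('Active date', not 'Active Date'). After filtering, keep the row with the earliest remaining expiry date in each Category.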
import pandas as pd

df_1 = pd.DataFrame({'Category': ['A','B'],
                     'Active date': ['2021-06-20','2021-06-25']})
df_2 = pd.DataFrame({'Category': ['A','A','A','A','A','B','B','B'],
                     'Expiry date': ['2021-05-22','2021-06-23','2021-06-24','2021-06-28','2021-07-26','2021-06-27','2021-06-28','2021-08-29'],
                     'Value': [20,21,23,45,12,34,17,34]})

# Pair every Active date with every Expiry date of the same Category
df = pd.merge(df_1, df_2, on='Category', how='inner')
# Remove all expiry dates earlier than the Active date.
# The dates are ISO strings (YYYY-MM-DD), so string comparison orders them correctly.
df = df.loc[df['Active date'] <= df['Expiry date']]
df = df.rename(columns={'Expiry date': 'Next Expiry Date'})
# Within each Category, keep the row with the earliest remaining expiry date
df = df.loc[df['Next Expiry Date'] == df.groupby('Category')['Next Expiry Date'].transform('min')]
Output:
  Category Active date Next Expiry Date  Value
1        A  2021-06-20       2021-06-23     21
5        B  2021-06-25       2021-06-27     34
Answered By - dporth
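Since the original data has thousands of rows, the full merge on Category can create a large intermediate frame. As an alternative sketch (not part of the answer above), pandas' merge_asof with direction='forward' can look up the first Expiry date on or after each Active date directly. It assumes the date columns are converted to datetimes and both frames are sorted by their date column; the names `left`, `right`, and `result` are illustrative.

```python
import pandas as pd

df_1 = pd.DataFrame({'Category': ['A', 'B'],
                     'Active date': ['2021-06-20', '2021-06-25']})
df_2 = pd.DataFrame({'Category': ['A', 'A', 'A', 'A', 'A', 'B', 'B', 'B'],
                     'Expiry date': ['2021-05-22', '2021-06-23', '2021-06-24', '2021-06-28',
                                     '2021-07-26', '2021-06-27', '2021-06-28', '2021-08-29'],
                     'Value': [20, 21, 23, 45, 12, 34, 17, 34]})

# merge_asof needs datetime (or numeric) keys, sorted ascending in both frames
left = df_1.assign(**{'Active date': pd.to_datetime(df_1['Active date'])}).sort_values('Active date')
right = df_2.assign(**{'Expiry date': pd.to_datetime(df_2['Expiry date'])}).sort_values('Expiry date')

# direction='forward' picks, per Category, the first Expiry date >= Active date
result = pd.merge_asof(left, right,
                       left_on='Active date', right_on='Expiry date',
                       by='Category', direction='forward')
print(result)
```

With the sample data this selects Value 21 for Category A and 34 for Category B, the same rows as the merge-and-filter approach. If a Category has no Expiry date on or after its Active date, merge_asof keeps the row with missing values instead of dropping it.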