Issue
I have two dataframes, df1 and df2, example:
df1
:
Date ID Value
1/2/2013 001 1
2/5/2013 002 15
3/4/2013 001 2
1/1/2014 005 17
2/5/2014 004 1
7/1/2016 002 2
7/1/2016 001 4
8/1/2016 007 4
df2
:
Year ID
2013 001
2014 005
2014 004
2016 001
I want to get a dataframe where I remove the rows from df1 that has Year and ID combo in df2. For example, there is 2013 and 001 in df2, I would like to remove all values from df1 that has a date in 2013 and an ID of 001.
The resulting df3 would be:
df3
:
Date ID Value
2/5/2013 002 15
7/1/2016 002 2
8/1/2016 007 4
Solution
You can use a merge
with indicator=True
after Extracting the year from Date
:
out = (df1
.merge(df2, left_on=[pd.to_datetime(df1['Date']).dt.year, 'ID'],
right_on=['Year', 'ID'], how='left',
suffixes=(None, '_'), indicator=True)
.query('_merge == "left_only"')[df1.columns]
)
Output:
Date ID Value
1 2/5/2013 002 15
5 7/1/2016 002 2
7 8/1/2016 007 4
Intermediate without .query('_merge == "left_only"')[df1.columns]
:
Date ID Value Year _merge
0 1/2/2013 001 1 2013 both
1 2/5/2013 002 15 2013 left_only
2 3/4/2013 001 2 2013 both
3 1/1/2014 005 17 2014 both
4 2/5/2014 004 1 2014 both
5 7/1/2016 002 2 2016 left_only
6 7/1/2016 001 4 2016 both
7 8/1/2016 007 4 2016 left_only
Answered By - mozway
0 comments:
Post a Comment
Note: Only a member of this blog may post a comment.