Issue
I've run into a problem that I haven't been able to solve so far. Any help with it would be much appreciated.
I have 2 dataframes:
first_df:
   A  B  C   D
1  1  a  q  zz
2  2  b  w  xx
3  3  c  e  yy
4  4  d  r  vv
second_df:
   C1 C2
1  10  a
2  20  b
3  70  g
4  80  h
I want to mark rows in first_df based on values in second_df. A row should be marked when both of these column comparisons hold:
(first_df['A'] * 10 == second_df['C1']) & (first_df['B'] == second_df['C2'])
Expected output should be like this:
compared_df:
   A  B  C   D  Match
1  1  a  q  zz   True
2  2  b  w  xx   True
3  3  c  e  yy  False
4  4  d  r  vv  False
Could you please point me in the right direction? Which pandas methods should I use to achieve this? I've tried many things, but I'm a pandas beginner, so it's hard to tell whether those attempts were correct.
Solution
First create a MultiIndex from each dataframe (scaling column A by 10 so that it is comparable with C1), then use MultiIndex.isin
to test whether each index value of the first dataframe occurs in the index of the second dataframe. The result is the boolean flag:
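# (A * 10, B) pairs from first_df
i1 = first_df.set_index([first_df['A'] * 10, 'B']).index
# (C1, C2) pairs from second_df
i2 = second_df.set_index(['C1', 'C2']).index
# True where a pair from first_df also occurs in second_df
first_df['Match'] = i1.isin(i2)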
Result
print(first_df)
   A  B  C   D  Match
1  1  a  q  zz   True
2  2  b  w  xx   True
3  3  c  e  yy  False
4  4  d  r  vv  False
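For anyone who wants to reproduce this end to end, here is a minimal self-contained sketch. The dataframe construction is an assumption reconstructed from the tables in the question (including the 1..4 index). The merge-based cross-check at the end is not part of the answer above; it is just one alternative way to get the same flag, and the names merged and match_via_merge are made up for illustration.

```python
import pandas as pd

# Sample frames rebuilt from the tables in the question (assumed construction).
first_df = pd.DataFrame(
    {"A": [1, 2, 3, 4], "B": list("abcd"), "C": list("qwer"), "D": ["zz", "xx", "yy", "vv"]},
    index=[1, 2, 3, 4],
)
second_df = pd.DataFrame(
    {"C1": [10, 20, 70, 80], "C2": list("abgh")},
    index=[1, 2, 3, 4],
)

# Same approach as the answer: build (A * 10, B) and (C1, C2) pairs,
# then test membership of the first set of pairs in the second.
i1 = first_df.set_index([first_df["A"] * 10, "B"]).index
i2 = second_df.set_index(["C1", "C2"]).index
first_df["Match"] = i1.isin(i2)
print(first_df)

# Alternative (not from the answer): a left merge with indicator=True.
# Rows found in both frames are labelled "both" in the _merge column.
merged = first_df.assign(C1=first_df["A"] * 10).merge(
    second_df.rename(columns={"C2": "B"}).drop_duplicates(),
    on=["C1", "B"],
    how="left",
    indicator=True,
)
match_via_merge = (merged["_merge"] == "both").to_numpy()
```

Note that both variants test whether a pair occurs anywhere in second_df, not only in the row at the same position. For the sample data the two interpretations give the same result.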
Answered By - Shubham Sharma