Issue
I have a filtered dataframe called correct_df
and a raw dataframe example_df
.
example_df = pd.DataFrame({'Test': ['Test_1', 'Test_1', 'Test_1', 'Test_2', 'Test_2', 'Test_2', 'Test_3', 'Test_3', 'Test_3'], 'A': [1, 2, 3, 1, 2, 3, 1, 2, 3]})
other_df = pd.DataFrame({'Test': ['Test_1', 'Test_1', 'Test_3', 'Test_3'], 'A': [1, 2, 1, 3]})
Desired result:
I want the indexes of example_df
where 'Test' and 'A' column values both match the combined row value of correct_df
.
I have tried:
result = example_df.loc[ (example_df['Test'].isin(other_df['Test'])) & (example_df['A'].isin(other_df['A'])) ]
result
Test A
0 Test_1 1
1 Test_1 2
2 Test_1 3
6 Test_3 1
7 Test_3 2
8 Test_3 3
But as these two conditions are separated, the resulting value only applies the conditions on single columns without chaining them i.e. A then also B, not A and B. How do I get the .loc of both column conditions?
Solution
Use DataFrame.reset_index
for avoid lost indices and then DataFrame.merge
:
result = example_df.reset_index().merge(other_df, on=['Test','A'])
print (result)
index Test A
0 0 Test_1 1
1 1 Test_1 2
2 6 Test_3 1
3 8 Test_3 3
result = (example_df.reset_index()
.merge(other_df, on=['Test','A'])
.set_index('index')
.rename_axis(None))
print (result)
Test A
0 Test_1 1
1 Test_1 2
6 Test_3 1
8 Test_3 3
Another idea with MultiIndex
with Index.isin
and filtering in boolean indexing
:
result = example_df[example_df.set_index(['Test','A']).index
.isin(other_df.set_index(['Test','A']).index)]
print (result)
Test A
0 Test_1 1
1 Test_1 2
6 Test_3 1
8 Test_3 3
Answered By - jezrael
0 comments:
Post a Comment
Note: Only a member of this blog may post a comment.