Issue
I have subsetted a dataframe whose index is present as values (strings) in another dataframe as follows:
df = df1[df1.index.isin(df2['column_name'])]
This works without issue; however, the order of the index in df is different from that of df2['column_name'].
This is understandable and also fine, as I don't care about the order of the new df. However, as a sanity check, I would like to be sure that the index values of the new dataframe exactly match the values in df2['column_name'] (again, not the order, just that the subsetting worked correctly).
Unfortunately, df.index.equals(df2['column_name'])
returns False, as it expects the order to be the same as well.
Is there a way of checking that values match without worrying about the order?
Reproducible example:
import numpy as np
import pandas as pd

df1 = pd.DataFrame(np.array([1, 2, 3, 4, 5, 6]), index=['a', 'b', 'c', 'd', 'e', 'f'], columns=['values'])
df2 = pd.DataFrame(np.array(['a', 'b', 'c']), index=range(3), columns=['column_name'])
df = df1[df1.index.isin(df2['column_name'])]
thank you
Solution
You can test whether the values form a subset, ignoring order, with either of the following:
print(set(df2['column_name']).issubset(df1.index))
True
print(df2['column_name'].isin(df1.index).all())
True
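Both checks above confirm that every value in df2['column_name'] occurs in df1.index. If an exact, order-independent match against the subsetted df is wanted, one possible sketch is to compare sets or sorted lists. The shuffled order in df2 and the variable names same_values, same_sorted and order_sensitive are assumptions made for illustration. Note also that Index.equals returns False for any argument that is not itself an Index, such as a Series, so the column is wrapped in pd.Index for the order-sensitive comparison.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame(np.array([1, 2, 3, 4, 5, 6]),
                   index=['a', 'b', 'c', 'd', 'e', 'f'], columns=['values'])
# Values deliberately shuffled (an assumption for this sketch) so order differs from df1
df2 = pd.DataFrame(np.array(['c', 'a', 'b']), index=range(3), columns=['column_name'])
df = df1[df1.index.isin(df2['column_name'])]

# Index.equals compares element by element, so a different order gives False
order_sensitive = df.index.equals(pd.Index(df2['column_name']))

# Set equality ignores both order and duplicates
same_values = set(df.index) == set(df2['column_name'])

# Comparing sorted lists ignores order but still detects duplicates
same_sorted = sorted(df.index) == sorted(df2['column_name'])

print(order_sensitive, same_values, same_sorted)
```

The set comparison is probably enough for a sanity check when the values are unique. The sorted comparison is stricter, since a value repeated in df2['column_name'] would make the lengths differ.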
Answered By - jezrael