Issue
I have a dataset like the following:
import pandas as pd
dict1 = {
"idx": [1, 2, 9, 1, 1, 6, 1, 3, 2],
"value": ["a1", "b1", "c1", "t1", "t1", "f1", "r1", "l1", "b1"]
}
df = pd.DataFrame(dict1)
Here, a1
, t1
, r1
always occur with 1
. To verify this, I used the following code,
def one_to_one(df, col1, col2):
first = df.groupby(col1)[col2].count().max()
second = df.groupby(col2)[col1].count().max()
return first + second == 2
one_to_one(df, 'idx', 'value')
But it returned None
even though a1
, t1
, r1
always occur with 1
and others do the same. How to verify that, a1
for example, only occurs with idx
= 1
?
Solution
You can just drop_duplicates
based idx
+ value
combination. If each value
has only one idx
, then the result should contain the same number of records as the number of unique values in value
column:
len(df[['idx', 'value']].drop_duplicates()) == df['value'].nunique()
# True
Answered By - Psidom
0 comments:
Post a Comment
Note: Only a member of this blog may post a comment.