Issue
How can I remove rows in which the same pair of teams appears again with Team1 and Team2 swapped? For example, Australia and England meet in both match 0 and match 1, only with the columns swapped, and the Winner is the same in both rows.
Match  Team1      Team2      Winner
0      Australia  England    Australia
1      England    Australia  Australia
2      India      Australia  Australia
3      England    India      England
Solution
You can use np.sort with axis=1 to sort the values within each row, so that the order of Team1 and Team2 no longer matters, and then call duplicated() on the result. Use ~ to invert it, so that only the non-duplicated rows are True, and use that as a boolean mask:
import numpy as np
import pandas as pd
m = pd.DataFrame(np.sort(df[['Team1', 'Team2', 'Winner']], axis=1)).duplicated()
df[~m]
   Match      Team1      Team2     Winner
0      0  Australia    England  Australia
2      2      India  Australia  Australia
3      3    England      India    England
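For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same idea. The DataFrame is rebuilt by hand from the table in the question, and the names sorted_rows, mask and result are only illustrative. Passing index=df.index is an addition that is not in the answer above; it is assumed here so the mask still lines up with df when df does not have the default 0..n-1 index.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the table in the question.
df = pd.DataFrame({
    "Match": [0, 1, 2, 3],
    "Team1": ["Australia", "England", "India", "England"],
    "Team2": ["England", "Australia", "Australia", "India"],
    "Winner": ["Australia", "Australia", "Australia", "England"],
})

# Sort the values inside each row so that (Australia, England) and
# (England, Australia) become identical rows.
sorted_rows = pd.DataFrame(
    np.sort(df[["Team1", "Team2", "Winner"]].to_numpy(), axis=1),
    index=df.index,  # assumption: keep df's index so the mask aligns
)

# True for every row that repeats an earlier row (first occurrence is kept).
mask = sorted_rows.duplicated()

# Invert the mask to keep only the first occurrence of each pairing.
result = df[~mask]
print(result)
```

Because duplicated() keeps the first occurrence by default, match 0 stays and match 1 is dropped. If the last occurrence should be kept instead, duplicated(keep='last') could be used.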
Answered By - anky