Issue
I have a DataFrame with duplicate Name values. For each duplicated name, I only want to keep the row whose Team value is "TOT":
Name Team Games
Trevor SAC 32
Trevor TOT 50
Trevor POR 18
Kyle MEM 59
LeMarcus SAS 43
Jordan TOT 50
Jordan MIN 35
Jordan ATL 15
Will DEN 53
How do I delete a duplicate value in one column based on a string value in another column?
I want output like this:
Name Team Games
Trevor TOT 50
Kyle MEM 59
LeMarcus SAS 43
Jordan TOT 50
Will DEN 53
Solution
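You can use duplicated with keep=False, which flags every row of a repeated Name, and then drop the flagged rows whose Team is not "TOT":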
df.loc[~(df.Name.duplicated(keep=False) & df.Team.ne('TOT'))]
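To make the one-liner above easier to follow, here is a minimal self-contained sketch that rebuilds the sample data from the question and splits the mask into two named steps. The variable names is_dup, not_tot and result are only illustrative and not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Name": ["Trevor", "Trevor", "Trevor", "Kyle", "LeMarcus",
             "Jordan", "Jordan", "Jordan", "Will"],
    "Team": ["SAC", "TOT", "POR", "MEM", "SAS",
             "TOT", "MIN", "ATL", "DEN"],
    "Games": [32, 50, 18, 59, 43, 50, 35, 15, 53],
})

# True for every row whose Name occurs more than once
# (keep=False marks all copies, not just the later ones).
is_dup = df["Name"].duplicated(keep=False)

# True for every row whose Team is anything other than "TOT".
not_tot = df["Team"].ne("TOT")

# Drop rows that are both duplicated and not "TOT"; keep everything else.
result = df.loc[~(is_dup & not_tot)].reset_index(drop=True)
print(result)
```

One thing to watch for: a Name that is duplicated but has no "TOT" row would be dropped entirely by this mask, so it assumes every duplicated name has a "TOT" entry, as in the sample data.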
Or you can group by Name and, within each group, keep the row as-is if it is the only one, otherwise keep just the "TOT" row:
(
    df.groupby('Name', sort=False)
      .apply(lambda x: x if len(x) == 1 else x.loc[x.Team.eq('TOT')])
      .reset_index(drop=True)
)
Name Team Games
0 Trevor TOT 50
1 Kyle MEM 59
2 LeMarcus SAS 43
3 Jordan TOT 50
4 Will DEN 53
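As a variation on the group-based idea, the sketch below gets the same rows without apply by computing each row's group size with transform. This is not from the original answer. It is an alternative I am suggesting, and it may be worth considering because the handling of the grouping column inside groupby.apply appears to differ between pandas versions. The names group_size and result are illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Name": ["Trevor", "Trevor", "Trevor", "Kyle", "LeMarcus",
             "Jordan", "Jordan", "Jordan", "Will"],
    "Team": ["SAC", "TOT", "POR", "MEM", "SAS",
             "TOT", "MIN", "ATL", "DEN"],
    "Games": [32, 50, 18, 59, 43, 50, 35, 15, 53],
})

# For each row, the number of rows sharing its Name,
# aligned to the original index.
group_size = df.groupby("Name")["Name"].transform("size")

# Keep rows that are the only one for their Name, or that are the "TOT" row.
result = df[(group_size == 1) | df["Team"].eq("TOT")].reset_index(drop=True)
print(result)
```

Because this is a plain boolean filter, the original row order is preserved without needing sort=False.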
Answered By - Allen