Issue
I have a dataframe like this, but the real one has about ten thousand rows.
import pandas as pd
import numpy as np
data = {'gameId': [1, 1, 1, 1, 1, 2, 2, 2, 2, 2], 'eventId': [1, 2, 3, 4, 5, 1, 2, 3, 4, 5], 'player': ['A', 'B', 'C', 'D', 'E', 'A', 'B', 'C', 'D', 'E'], 'related_eventId': [2, 1, 4, 3, np.nan, 2, 1, 4, 3, np.nan]}
df = pd.DataFrame(data)
I need to create a column "related_player" that contains the player from the row whose eventId equals related_eventId.
If I didn't have the gameId column, I could do it with a merge:
result = df.merge(df[['eventId', 'player']], left_on='related_eventId', right_on='eventId', how='left', suffixes=('', '_related'))
result.rename(columns={'player_related': 'related_player', 'eventId_related': 'related_eventId'}, inplace=True)
result = result[['eventId', 'player', 'related_eventId', 'related_player']]
But the output is not correct, because the matching has to happen within each gameId. In R it is pretty simple, but I don't understand how to do it correctly in Python.
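For a concrete check of what goes wrong, comparing row counts before and after the merge above shows the over-matching (a quick sketch using the sample df defined above):
# The single-key merge over-matches: every related_eventId finds an
# eventId in both games, so the 10 input rows become 18 merged rows.
print(len(df), len(result))   # 10 18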
My expected output should be like this:
gameId | eventId | player | related_eventId | related_player |
---|---|---|---|---|
1 | 1 | A | 2 | B |
1 | 2 | B | 1 | A |
1 | 3 | C | 4 | D |
1 | 4 | D | 3 | C |
1 | 5 | E | NaN | NaN |
2 | 1 | A | 2 | B |
2 | 2 | B | 1 | A |
2 | 3 | C | 4 | D |
2 | 4 | D | 3 | C |
2 | 5 | E | NaN | NaN |
Solution
Add gameId to the list of selected columns ['eventId', 'player', 'gameId'] and to the left_on and right_on parameters, so the merge matches events within each game:
result = df.merge(df[['gameId', 'eventId', 'player']],
                  left_on=['gameId', 'related_eventId'],
                  right_on=['gameId', 'eventId'],
                  how='left',
                  suffixes=('', '_related'))
result = result.rename(columns={'player_related': 'related_player'})
result = result[['gameId', 'eventId', 'player', 'related_eventId', 'related_player']]
print(result)
   gameId  eventId player  related_eventId related_player
0       1        1      A              2.0              B
1       1        2      B              1.0              A
2       1        3      C              4.0              D
3       1        4      D              3.0              C
4       1        5      E              NaN            NaN
5       2        1      A              2.0              B
6       2        2      B              1.0              A
7       2        3      C              4.0              D
8       2        4      D              3.0              C
9       2        5      E              NaN            NaN
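If you prefer an explicit per-game lookup instead of a merge, a minimal alternative sketch (using the sample df above; the lookup dictionary is just an illustration) maps each (gameId, related_eventId) pair through a (gameId, eventId) -> player dictionary:
# Build a (gameId, eventId) -> player mapping, then look up each
# (gameId, related_eventId) pair; rows without a link stay NaN.
lookup = df.set_index(['gameId', 'eventId'])['player'].to_dict()
df['related_player'] = [
    lookup.get((game, int(rel))) if pd.notna(rel) else np.nan
    for game, rel in zip(df['gameId'], df['related_eventId'])
]
print(df)
For a frame of ten thousand rows either approach is fast; the two-key merge is the more idiomatic pandas solution.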
Answered By - jezrael