Issue
I have a tennis dataset that looks like the following:
tourney_id = ['French Open 2018','French Open 2018','Wimbledon 2018','Wimbledon 2018','Australian Open 2019','Australian Open 2019','US Open 2019','US Open 2019']
player_name = ['Novak Djokovic','Roger Federer','Andy Murray','Rafael Nadal','John Isner','Novak Djokovic','Andy Murray','Roger Federer']
match_num = [103, 103, 217, 217, 104, 104, 243, 243]
df = pd.DataFrame(list(zip(tourney_id, player_name, match_num)),
columns =['TournamentID','Name','MatchID'])
I want to create a dictionary where the keys are the players and the items are also the players (opponents). So it would look like the following based on my dataset:
{'Novak Djokovic': ['Roger Federer','John Isner'],
'Roger Federer': ['Novak Djokovic','Andy Murray'],
'Andy Murray': ['Rafael Nadal','Roger Federer'],
'Rafael Nadal': ['Andy Murray'],
'John Isner': ['Novak Djokovic']}
I want to identify players who have played each other when they have the same values for both TournamentID and MatchID.
The last thing I tried was: df.set_index(['TournamentID','MatchID'])['Name'].to_dict()
but that's not quite what I'm looking for.
Can someone please help point me in the right direction?
Thank you!
Solution
Here's how you can do it:
import pandas as pd
tourney_id = ['French Open 2018','French Open 2018','Wimbledon 2018','Wimbledon 2018','Australian Open 2019','Australian Open 2019','US Open 2019','US Open 2019']
player_name = ['Novak Djokovic','Roger Federer','Andy Murray','Rafael Nadal','John Isner','Novak Djokovic','Andy Murray','Roger Federer']
match_num = [103, 103, 217, 217, 104, 104, 243, 243]
df = pd.DataFrame(list(zip(tourney_id, player_name, match_num)),
columns=['TournamentID', 'Name', 'MatchID'])
# Create a DataFrame with pairs of players in each match
df_pairs = pd.merge(df, df, how='inner', on=['TournamentID', 'MatchID'])
df_pairs = df_pairs[df_pairs['Name_x'] != df_pairs['Name_y']] # Remove rows where a player is paired with themselves
# Group by each player and aggregate the opponents into a list
opponents_dict = df_pairs.groupby('Name_x')['Name_y'].agg(list).to_dict()
print(opponents_dict)
Output:
{
"Andy Murray": [
"Rafael Nadal",
"Roger Federer"
],
"John Isner": [
"Novak Djokovic"
],
"Novak Djokovic": [
"Roger Federer",
"John Isner"
],
"Rafael Nadal": [
"Andy Murray"
],
"Roger Federer": [
"Novak Djokovic",
"Andy Murray"
]
}
Answered By - Hamid Rasti
0 comments:
Post a Comment
Note: Only a member of this blog may post a comment.