Issue
I am building projects for my portfolio. Currently, I am analysing football transfers. I would like to filter for the top 5 leagues in the world, then analyse and visualise the data.
Firstly, I prepared lists with these leagues and countries:
League = ["Bundesliga","Premier League","LaLiga","Serie A","Ligue 1"]
Country = ["Germany" , "England", "Spain" , "Italy", "France"]
Then I passed these lists to isin() to get new, filtered DataFrames:
Leagues = df_transfer_players[df_transfer_players['League Destination'].isin(League)]
Countries = df_transfer_players[df_transfer_players['Country Destination'].isin(Country)]
Lastly, I checked whether each assigned variable holds the expected rows. Luckily, it works.
Finally, I would like to merge these two into one DataFrame so that I can group it and do the analysis and visualisations.
Following this website https://www.geeksforgeeks.org/python-pandas-dataframe-isin/, I tried to do it like this:
df_transfer_players[Leagues&Countries]
Unfortunately, this output was displayed:
TypeError: ufunc 'bitwise_and' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''
I tried to solve this problem. I browsed a lot of articles, but I could not find any advice on how to tackle it.
Would you be so kind as to give me some advice on how to obtain the data with both filters applied, so that both conditions are mandatory?
Solution
The bitwise & operator works element-wise on boolean arrays. Since Leagues and Countries are DataFrames made up of non-boolean columns (not boolean arrays or Series), Python can't evaluate the expression Leagues & Countries. If you convert each DataFrame to a boolean type, then you can use a bitwise operator on it. So
Leagues.astype(bool) & Countries.astype(bool)
works. But if you want to filter df_transfer_players for rows that satisfy both conditions, you can use a boolean mask:
msk = df_transfer_players['League Destination'].isin(League) & df_transfer_players['Country Destination'].isin(Country)
out = df_transfer_players[msk]
Here, both the left-hand and right-hand sides of the & operator are boolean Series (created by isin), so the operation can be performed element-wise. This keeps the rows where League Destination is one of the leagues in League and Country Destination is one of the countries in Country.
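To see the mask at work, here is a minimal sketch on a tiny made-up DataFrame. Only the two column names and the League and Country lists come from the question; the Player column and the four sample rows are assumptions for illustration.

```python
import pandas as pd

# Hypothetical sample rows; only the two "... Destination" column names
# are taken from the question.
df_transfer_players = pd.DataFrame({
    "Player": ["A", "B", "C", "D"],
    "League Destination": ["Bundesliga", "Premier League", "Eredivisie", "2. Bundesliga"],
    "Country Destination": ["Germany", "England", "Netherlands", "Germany"],
})

League = ["Bundesliga", "Premier League", "LaLiga", "Serie A", "Ligue 1"]
Country = ["Germany", "England", "Spain", "Italy", "France"]

# Each isin() call returns a boolean Series aligned with the DataFrame's index.
league_mask = df_transfer_players["League Destination"].isin(League)
country_mask = df_transfer_players["Country Destination"].isin(Country)

# & combines the two boolean Series element-wise; a row is kept only
# when both conditions are True for it.
out = df_transfer_players[league_mask & country_mask]
print(out)
```

In this sample, player D satisfies the country condition but not the league condition, so the combined mask drops that row.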
That being said, considering that the Bundesliga is in Germany, the Premier League is in England, etc., Leagues is entirely contained in Countries, so I think there's no need for any concatenation here. Simply use the Leagues DataFrame for your analysis. Note that the Countries DataFrame also includes data from the Austrian Bundesliga, so they are not exactly the same.
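If you want to check on your own data how the two filtered DataFrames relate, one option is to compare their index labels. This is only a sketch: the sample rows below are made up, and it assumes both DataFrames were sliced from the same df_transfer_players, so they share its index.

```python
import pandas as pd

# Hypothetical sample rows; only the column names come from the question.
df_transfer_players = pd.DataFrame({
    "League Destination": ["Bundesliga", "Premier League", "2. Bundesliga", "Eredivisie"],
    "Country Destination": ["Germany", "England", "Germany", "Netherlands"],
})

League = ["Bundesliga", "Premier League", "LaLiga", "Serie A", "Ligue 1"]
Country = ["Germany", "England", "Spain", "Italy", "France"]

Leagues = df_transfer_players[df_transfer_players["League Destination"].isin(League)]
Countries = df_transfer_players[df_transfer_players["Country Destination"].isin(Country)]

# True if every row of Leagues (identified by index label) is also in Countries.
leagues_in_countries = bool(Leagues.index.isin(Countries.index).all())

# Rows that pass the country filter but not the league filter.
only_in_countries = Countries.loc[~Countries.index.isin(Leagues.index)]

print(leagues_in_countries)
print(only_in_countries)
```

Running the same two checks on the real data shows whether the league filter alone is enough or whether the combined mask gives a different result.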
Answered By - enke