Issue
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
## the following is the data set
gm = pd.read_csv('https://raw.githubusercontent.com/gerberl/6G7V0026-2223/main/datasets/gapminder.tsv', sep='\t')
This is the command I have been using, which counts each country multiple times:
sns.countplot(x=gm.continent)
plt.show()
I can get the plot by creating a new DataFrame, but there must be a way to get it without one.
The bars should show the total number of countries in each continent, with the continents on the x-axis.
Solution
- The most direct way is to use pandas to get the number of unique countries for each continent, and then plot directly with pandas.DataFrame.plot. pandas uses matplotlib as the default plotting backend, and seaborn is just an API for matplotlib.
- This answer shows how to use pd.DataFrame.pivot_table to get the number of unique values for each group. gm.groupby('continent')['country'].nunique() can also be used.
- If the link to the Gapminder data no longer works, it can also be found here.
import pandas as pd
# load the dataset
gm = pd.read_csv('https://raw.githubusercontent.com/gerberl/6G7V0026-2223/main/datasets/gapminder.tsv', sep='\t')
# create a pivot table with continent and the number of unique countries
pt = gm.pivot_table(index='continent', values='country', aggfunc='nunique')
# plot the bar chart
ax = pt.plot(kind='bar', rot=0, ylabel='Number of Countries', xlabel='Continent', legend=False)
pt DataFrame
           country
continent
Africa          52
Americas        25
Asia            33
Europe          30
Oceania          2
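The groupby alternative mentioned in the answer can be sketched as follows. This is a minimal illustration, not the answerer's own code: `gm_sample` is a small hypothetical stand-in for the Gapminder data (one row per country per year), used so the snippet runs without a network connection. With the real `gm` DataFrame, the same `groupby(...).nunique()` call should apply unchanged.

```python
import pandas as pd

# hypothetical stand-in for the Gapminder data: countries repeat across years
gm_sample = pd.DataFrame({
    'country': ['Kenya', 'Kenya', 'Ghana', 'Japan', 'Japan', 'France'],
    'continent': ['Africa', 'Africa', 'Africa', 'Asia', 'Asia', 'Europe'],
    'year': [2002, 2007, 2007, 2002, 2007, 2007],
})

# count distinct countries per continent; the result is a Series indexed by continent
counts = gm_sample.groupby('continent')['country'].nunique()
print(counts)

# to draw the bars (requires matplotlib), uncomment the next line
# ax = counts.plot(kind='bar', rot=0, xlabel='Continent', ylabel='Number of Countries')
```

Because the result is a Series rather than a one-column DataFrame, no legend is drawn by default, so `legend=False` is not needed here.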
Answered By - Trenton McKinney