Issue
I am trying to add a row to the bottom of a grouped DataFrame in pandas, using the following code:
import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_csv('manufacturers.csv')
gkk = df.groupby(['Country'])['Country'].count().reset_index(name='byCountry')
gkk.loc['Total by Country']= gkk['byCountry'].sum()
print(gkk)
but I do not want to display the total number under the Country column. Instead I want to skip the first column and output it like:
How can I adapt the line gkk.loc['Total by Country'] = gkk['byCountry'].sum() to achieve this?
Solution
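You call reset_index, which turns your Series into a DataFrame, so assigning a single scalar to a new row fills every column of that row with the total.
One option is to reset the index only after you have added the wanted row: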
gkk = df.groupby(['Country'])['Country'].count()
gkk.loc['Total by Country'] = gkk.sum()
gkk = gkk.reset_index(name='byCountry')
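To try this without manufacturers.csv, here is a minimal sketch of the same steps on a small in-memory DataFrame. The sample country values are made up for illustration and are not from the original question.

```python
import pandas as pd

# Hypothetical sample data standing in for manufacturers.csv
df = pd.DataFrame(
    {"Country": ["Japan", "Germany", "Japan", "USA", "Germany", "Japan"]}
)

# Count rows per country; the result is still a Series indexed by Country
gkk = df.groupby(["Country"])["Country"].count()

# Add the total as a new index label while gkk is still a Series
gkk.loc["Total by Country"] = gkk.sum()

# Only now move the index into a regular column
gkk = gkk.reset_index(name="byCountry")

print(gkk)
```

With this sample, the last row has 'Total by Country' in the Country column and 6 in byCountry, since the label was added to the index before it became a column.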
Or keep the reset_index and assign with a dictionary:
gkk = df.groupby(['Country'])['Country'].count().reset_index(name='byCountry')
gkk.loc[''] = {'Country': 'Total by Country',
               'byCountry': gkk['byCountry'].sum()}
Answered By - mozway