Saturday, December 30, 2023

[FIXED] Calculate % in Pandas pivot_table?

December 30, 2023 dataframe, pandas, python No comments

Issue

I have a table with data for who viewed a page and who clicked on it. The following code gets me the pivot table below:

# users who clicked, by hour and weekday
pct = df.pivot_table(columns=['weekday'],index=['hour'], values=['users_who_clicked','users_who_viewed'], aggfunc= sum, fill_value=0, margins=True)
pct

pivot table

I've been googling and searching here for a solid hour but I can't seem to find how to just display the rate: i.e. 'users_who_clicked' divided by 'users_who_viewed'. I don't need to see the other two series.

I feel like if I just create a new column across the whole data set dividing the two it won't aggregate correctly, so the aggregation needs to happen within the table itself. It's easy in Excel, just use a calculated field, so I imagine it must be fairly straightforward in pandas as well. Any thoughts?

Thanks!

p.s. Tried using a lambda function but got a "TypeError: unsupported operand type(s) for +: 'int' and 'str'" error. Tried putting the two tables directly in aggfunc and no luck.

Solution

Use DataFrame.xs for select by MultiIndex, rename first level and divide, last append to original DataFrame:

d = {('users_who_clicked','a'): [5, 6, 7, 8], 
     ('users_who_clicked','b'): [9, 10, 11, 12], 
     ('users_who_viewed','a'): [4, 1, 3, 7], 
     ('users_who_viewed','b'): [1, 3, 7, 3]}

df = pd.DataFrame(data=d)
  
df1 = (df.xs('users_who_clicked', axis=1, level=0, drop_level=False)
         .rename(columns={'users_who_clicked':'%'}, level=0))
df2 = (df.xs('users_who_viewed', axis=1, level=0, drop_level=False)
         .rename(columns={'users_who_viewed':'%'}, level=0))

out = pd.concat([df, df1.div(df2)], axis=1)  
print (out)
  users_who_clicked     users_who_viewed            %          
                  a   b                a  b         a         b
0                 5   9                4  1  1.250000  9.000000
1                 6  10                1  3  6.000000  3.333333
2                 7  11                3  7  2.333333  1.571429
3                 8  12                7  3  1.142857  4.000000

Answered By - jezrael

This Answer collected from stackoverflow and tested by PythonFixing community admins, is licensed under cc by-sa 2.5 , cc by-sa 3.0 and cc by-sa 4.0

Saturday, December 30, 2023

[FIXED] Calculate % in Pandas pivot_table?

Issue

Solution

0 comments:

Post a Comment

Popular Posts

Labels