Issue
I am working on a survey and the data looks like this:
ID Q1 Q2 Q3 Gender Age Dept
001 Y N Y F 22 IT
002 N Y Y M 35 HR
003 Y N N F 20 IT
004 Y N Y M 54 OPRE
005 Y N Y M 42 OPRE
The codes are:
out = df.pivot_table(index='Q1', columns=['Gender'], values=['ID'], aggfunc='count', fill_value=0)
out = (out.join(out[['ID']].div(out['ID'].sum(axis=1).values, axis=0)
.mul(100)
.rename(columns={'ID':'%Respondents'})))
out
And the table I created is like this:
Q1 #Res %Rep
M F M F
Y 2 2 50 50
N 1 0 100 0
But I'd like all the Questions results be at one execution, like
Q1 #Res %Rep
M F M F
Y 2 2 50 50
N 1 0 100 0
Q2 #Res %Rep
M F M F
Y 1 0 100 0
N 2 2 50 50
Q3 #Res %Rep
M F M F
Y 2 2 50 50
N 0 1 0 100
I want to write a function of creating the tables, and use a for loop to go over the questions. Can anyone help?
Solution
You could use this function:
def aggregate_over_questions( df ):
data = pd.DataFrame()
#get the number of questions
questions = df.loc[:,df.columns.str.contains("^Q")].columns.tolist()
for q in questions:
temp = df.pivot_table(index=f"{q}", columns=['Gender'], values['ID'], aggfunc='count', fill_value=0)
temp = (temp.join(out[['ID']].div(temp['ID'].sum(axis=1).values, axis=0)
.mul(100)
.rename(columns={'ID':'%Respondents'})))
data = pd.concat( [data, temp], axis=0 )
return data
Answered By - sushant patil
0 comments:
Post a Comment
Note: Only a member of this blog may post a comment.