Issue
How can I generate a crossed table from the following dataframe:
import pandas as pd
dat = pd.read_csv('data.txt', sep=',')
dat.head(6)
Factor1 Factor2
0 A X
1 B X
2 A X|Y
3 B X|Y
4 A X|Y|Z
5 B X|Y|Z
dat['Factor2'] = dat['Factor2'].str.split('|')  # same result as the applymap/lambda version; applymap is deprecated since pandas 2.1
dat.head(6)
Factor1 Factor2
0 A [X]
1 B [X]
2 A [X, Y]
3 B [X, Y]
4 A [X, Y, Z]
5 B [X, Y, Z]
The resulting pd.crosstab() should look like this:
X Y Z
A 3 2 1
B 3 2 1
Solution
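We can use str.get_dummies to convert the Factor2 column to indicator variables, then group the indicator variables by Factor1 and aggregate with sum. Note that this operates on the original pipe-delimited strings in Factor2, not on the lists produced by the split: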
dat['Factor2'].str.get_dummies('|').groupby(dat['Factor1']).sum()
X Y Z
Factor1
A 3 2 1
B 3 2 1
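For anyone who wants to try this without data.txt, here is a minimal self-contained sketch. The inline DataFrame is an assumption that simply reproduces the six rows shown in the question. It also shows a possible alternative (not part of the original answer) that keeps pd.crosstab in the picture by splitting and exploding Factor2 first.

```python
import pandas as pd

# Assumption: these six rows stand in for the contents of data.txt.
dat = pd.DataFrame({
    "Factor1": ["A", "B", "A", "B", "A", "B"],
    "Factor2": ["X", "X", "X|Y", "X|Y", "X|Y|Z", "X|Y|Z"],
})

# Approach from the answer: one 0/1 indicator column per label,
# then sum the indicators within each Factor1 group.
counts = dat["Factor2"].str.get_dummies("|").groupby(dat["Factor1"]).sum()

# Alternative sketch: split into lists, explode to one row per label,
# and let pd.crosstab count the (Factor1, Factor2) pairs.
long = dat.assign(Factor2=dat["Factor2"].str.split("|")).explode("Factor2")
table = pd.crosstab(long["Factor1"], long["Factor2"])

print(counts)
print(table)
```

On this data both approaches give the same counts. They can differ if a label is repeated within a single cell (for example X|X): get_dummies produces 0/1 indicators and counts it once, while explode followed by crosstab counts every occurrence.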
Answered By - Shubham Sharma