Issue
I have the following dataframe:
    name     labels
0   test  ['a','b']
1  test1  ['a','c']
2  test2  ['a','d']
I have an array of all labels, and I want my dataframe to look like this:
    name  a  b  c  d
0   test  1  1  0  0
1  test1  1  0  1  0
2  test2  1  0  0  1
Solution
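You can use str.get_dummies. Join each list into a single '|'-delimited string first, since '|' is the default separator that str.get_dummies splits on: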
df = df.join(df.pop('labels').str.join('|').str.get_dummies())
Or, if you don't want to modify df in place:
df2 = (df.drop(columns='labels')
         .join(df['labels'].str.join('|').str.get_dummies())
      )
Output:
    name  a  b  c  d
0   test  1  1  0  0
1  test1  1  0  1  0
2  test2  1  0  0  1
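The question mentions an array of all labels. str.get_dummies only creates columns for labels that actually occur in the data, so if that array may contain labels no row uses, one option is to reindex the dummy columns against it. The following is a minimal sketch, not part of the original answer; the name all_labels and the extra label 'e' are assumptions made for illustration.

```python
import pandas as pd

df = pd.DataFrame({'name': ['test', 'test1', 'test2'],
                   'labels': [['a', 'b'], ['a', 'c'], ['a', 'd']]})

# Hypothetical full label array; 'e' does not occur in any row
all_labels = ['a', 'b', 'c', 'd', 'e']

# Build the indicator columns, then force one column per known label,
# filling labels that never occur with 0
dummies = (df['labels'].str.join('|').str.get_dummies()
             .reindex(columns=all_labels, fill_value=0))

out = df.drop(columns='labels').join(dummies)
print(out)
```

This also fixes the column order to that of all_labels. Note that reindexing this way would drop any label found in the data but missing from all_labels.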
Used input:
import pandas as pd
df = pd.DataFrame({'name': ['test', 'test1', 'test2'],
                   'labels': [['a', 'b'], ['a', 'c'], ['a', 'd']]})
Answered By - mozway