Issue
I have a DataFrame and would like to check, for each row, which of my conditions are true. If several are true, I would like to return all of the matching choices with np.select. How can I do this?
import numpy as np
import pandas as pd

df = pd.DataFrame({'cond1': [True, True, False, True],
                   'cond2': [False, False, True, True],
                   'cond3': [True, False, False, True],
                   'value': [1, 3, 3, 6]})
conditions = [df['cond1'] & (df['value'] > 4),
              df['cond2'],
              df['cond2'] & (df['value'] > 2),
              df['cond3'] & df['cond2']]
choices = ['1', '2', '3', '4']
df["class"] = np.select(conditions, choices, default=np.nan)
I get this, because np.select keeps only the first matching choice for each row:
   cond1  cond2  cond3  value class
0   True  False   True      1   nan
1   True  False  False      3   nan
2  False   True  False      3     2
3   True   True   True      6     1
but would like to get this
   cond1  cond2  cond3  value                class
0   True  False   True      1                  nan
1   True  False  False      3                  nan
2  False   True  False      3              2 and 3
3   True   True   True      6  1 and 2 and 3 and 4
Solution
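You can use this trick with DataFrame.dot. Multiplying a boolean by a string label gives the label for True and an empty string for False, so the dot product concatenates, for each row, the labels of the conditions that hold:
# rows follow df.index, columns are the choices
df1 = pd.DataFrame(conditions, columns=df.index, index=choices).T
# str.strip(' and ') strips the characters ' ', 'a', 'n', 'd' from both ends,
# which is safe here because the choices are digits
df['class'] = df1.dot(df1.columns + ' and ').str.strip(' and ')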
Output (rows where no condition holds get an empty string rather than nan):
>>> df
   cond1  cond2  cond3  value                class
0   True  False   True      1
1   True  False  False      3
2  False   True  False      3              2 and 3
3   True   True   True      6  1 and 2 and 3 and 4
Intermediate result:
>>> df1
       1      2      3      4  <- columns from choices
0  False  False  False  False
1  False  False  False  False
2  False   True   True  False
3   True   True   True   True
^- index from df.index
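If the labels could start or end with the letters a, n or d, or if rows with no match should hold NaN as in the desired output, a variant along these lines may be safer. It is only a sketch: `mask` and `join_matches` are hypothetical names, not part of the answer above, and it swaps the dot product for a row-wise apply, which will likely be slower on large frames.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'cond1': [True, True, False, True],
                   'cond2': [False, False, True, True],
                   'cond3': [True, False, False, True],
                   'value': [1, 3, 3, 6]})

conditions = [df['cond1'] & (df['value'] > 4),
              df['cond2'],
              df['cond2'] & (df['value'] > 2),
              df['cond3'] & df['cond2']]
choices = ['1', '2', '3', '4']

# One boolean column per choice, one row per row of df.
mask = pd.concat(conditions, axis=1, keys=choices)


def join_matches(row, sep=' and '):
    """Hypothetical helper: join the labels whose condition is True."""
    labels = row.index[row.to_numpy()]
    # Rows without any match get NaN instead of an empty string.
    return sep.join(labels) if len(labels) else np.nan


df['class'] = mask.apply(join_matches, axis=1)
print(df)
```

Because the separator is only placed between labels, no trailing ' and ' has to be stripped afterwards.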
Answered By - Corralien