Issue
I have a dataframe like this:
import pandas as pd

data = {'label': ['y', 'x', 'z', 'y', 'z', 'x'],
        'x_score': [0.35, 0.7, 0.05, 0.12, 0.2, 0.9],
        'y_score': [0.6, 0.2, 0.45, 0.58, 0.3, 0.05],
        'z_score': [0.05, 0.1, 0.5, 0.3, 0.5, 0.05]}
df = pd.DataFrame(data)
One of three operations should be applied to each row, depending on the value in the label column, and the outcome stored in a separate column, say result:
- If the label is x then simply x_score will be stored in the result column.
- If the label is z then -1×(z_score) or negative of the z_score will be stored in the result column.
- If the label is y then (x_score - y_score) will be stored in the result column.
The output should look like this:
label x_score y_score z_score result
0 y 0.35 0.60 0.05 -0.25
1 x 0.70 0.20 0.10 0.70
2 z 0.05 0.45 0.50 -0.50
3 y 0.12 0.58 0.30 -0.46
4 z 0.20 0.30 0.50 -0.50
5 x 0.90 0.05 0.05 0.90
Please help me with this.
Solution
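You can use np.select. For each row it takes the value from the first choice whose condition is true; rows that match no condition get the default.
import numpy as np

label = df['label']
# One boolean mask per label, in the same order as the choices below
condlist = [label.eq('x'), label.eq('z'), label.eq('y')]
choicelist = [df['x_score'], -df['z_score'], df['x_score'] - df['y_score']]
# Rows with any other label fall through to NaN
df['result'] = np.select(condlist, choicelist, default=np.nan)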
label x_score y_score z_score result
0 y 0.35 0.60 0.05 -0.25
1 x 0.70 0.20 0.10 0.70
2 z 0.05 0.45 0.50 -0.50
3 y 0.12 0.58 0.30 -0.46
4 z 0.20 0.30 0.50 -0.50
5 x 0.90 0.05 0.05 0.90
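For anyone who wants to run this end to end, here is a self-contained sketch of the same approach. The add_result helper and the extra row with label 'w' are hypothetical, added only to show what default=np.nan does for a label that none of the conditions cover; they are not part of the original answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'label':   ['y', 'x', 'z', 'y', 'z', 'x'],
    'x_score': [0.35, 0.7, 0.05, 0.12, 0.2, 0.9],
    'y_score': [0.6, 0.2, 0.45, 0.58, 0.3, 0.05],
    'z_score': [0.05, 0.1, 0.5, 0.3, 0.5, 0.05],
})


def add_result(frame):
    """Hypothetical helper: return a copy of frame with a 'result' column."""
    label = frame['label']
    # One boolean mask per label; np.select uses the first mask that is True
    condlist = [label.eq('x'), label.eq('z'), label.eq('y')]
    choicelist = [
        frame['x_score'],                     # label == 'x'
        -frame['z_score'],                    # label == 'z'
        frame['x_score'] - frame['y_score'],  # label == 'y'
    ]
    out = frame.copy()  # leave the caller's dataframe untouched
    out['result'] = np.select(condlist, choicelist, default=np.nan)
    return out


result_df = add_result(df)
print(result_df)

# Assumed extra row: a label that matches none of the three conditions
extra = pd.DataFrame(
    {'label': ['w'], 'x_score': [0.4], 'y_score': [0.3], 'z_score': [0.3]}
)
extra_df = add_result(extra)
print(extra_df)  # the 'result' value for label 'w' is NaN
```

Because the three conditions are mutually exclusive here, their order in condlist does not change the result; it would matter only if a row could satisfy more than one condition.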
Answered By - Vladimir Fokow