Issue
I have the following case-style logic in Python:
pd_df['difficulty'] = 'Unknown'
pd_df['difficulty'][(pd_df['Time']<30) & (pd_df['Time']>0)] = 'Easy'
pd_df['difficulty'][(pd_df['Time']>=30) & (pd_df['Time']<=60)] = 'Medium'
pd_df['difficulty'][pd_df['Time']>60] = 'Hard'
But when I run the code, pandas emits a SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame
Solution
Option 1
For performance, use nested np.where calls. Each condition can be written with pd.Series.between, and rows that match no condition receive the default value.
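import numpy as np

# inclusive='neither' / 'both' needs pandas >= 1.3; older versions take inclusive=False / True.
pd_df['difficulty'] = np.where(
    pd_df['Time'].between(0, 30, inclusive='neither'),
    'Easy',
    np.where(
        pd_df['Time'].between(30, 60, inclusive='both'), 'Medium', 'Unknown'
    )
)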
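As a rough, self-contained sketch of the nested approach, the snippet below adds one more nesting level for the 'Hard' case from the question. The sample values in pd_df are made up for illustration, and the string form of inclusive assumes pandas 1.3 or newer.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column name 'Time' comes from the question.
pd_df = pd.DataFrame({"Time": [-5, 10, 30, 45, 60, 90]})
time = pd_df["Time"]

# Each inner np.where acts as the "else" branch of the call around it.
pd_df["difficulty"] = np.where(
    time.between(0, 30, inclusive="neither"),      # 0 < Time < 30
    "Easy",
    np.where(
        time.between(30, 60, inclusive="both"),    # 30 <= Time <= 60
        "Medium",
        np.where(time > 60, "Hard", "Unknown"),    # everything else
    ),
)

print(pd_df)
```

Each extra category adds another level of nesting, which is the main reason to prefer np.select once there are more than two or three cases.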
Option 2
Similarly, using np.select, which gives more room for adding conditions:
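import numpy as np

# inclusive='neither' / 'both' needs pandas >= 1.3; older versions take inclusive=False / True.
pd_df['difficulty'] = np.select(
    [
        pd_df['Time'].between(0, 30, inclusive='neither'),
        pd_df['Time'].between(30, 60, inclusive='both')
    ],
    [
        'Easy',
        'Medium'
    ],
    default='Unknown'
)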
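To illustrate the "room for adding conditions" point, here is a minimal sketch that adds a third rule for 'Hard' by appending one entry to each list. The sample data and the names conditions and choices are assumptions for the example. np.select evaluates the conditions in order and uses the first one that matches for each row.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column name 'Time' comes from the question.
pd_df = pd.DataFrame({"Time": [0, 15, 30, 60, 61, 120]})
time = pd_df["Time"]

# Conditions and choices are paired by position; the first match wins.
conditions = [
    time.between(0, 30, inclusive="neither"),   # 0 < Time < 30
    time.between(30, 60, inclusive="both"),     # 30 <= Time <= 60
    time > 60,                                  # Time > 60
]
choices = ["Easy", "Medium", "Hard"]

# Rows matching none of the conditions (here Time == 0) get the default.
pd_df["difficulty"] = np.select(conditions, choices, default="Unknown")

print(pd_df)
```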
Option 3
Another performant solution involves loc:
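pd_df['difficulty'] = 'Unknown'
# inclusive='neither' / 'both' needs pandas >= 1.3; older versions take inclusive=False / True.
pd_df.loc[pd_df['Time'].between(0, 30, inclusive='neither'), 'difficulty'] = 'Easy'
pd_df.loc[pd_df['Time'].between(30, 60, inclusive='both'), 'difficulty'] = 'Medium'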
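A small end-to-end sketch of the loc approach follows, with an added rule for 'Hard' and a missing value to show what happens to rows that match nothing. The sample data is invented for illustration. Because the row mask and the column label go into a single .loc indexer, the assignment is applied to pd_df directly instead of to an intermediate object, which is what the chained form in the question trips over.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, including a missing Time value.
pd_df = pd.DataFrame({"Time": [12.0, 30.0, 75.0, np.nan]})

# Start from the default, then overwrite the rows matching each rule.
pd_df["difficulty"] = "Unknown"
time = pd_df["Time"]

pd_df.loc[time.between(0, 30, inclusive="neither"), "difficulty"] = "Easy"
pd_df.loc[time.between(30, 60, inclusive="both"), "difficulty"] = "Medium"
pd_df.loc[time > 60, "difficulty"] = "Hard"

# Comparisons against NaN are False, so the last row keeps 'Unknown'.
print(pd_df)
```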
Answered By - cs95