Issue
I am trying to evaluate my model and have set the scoring to neg_root_mean_squared_error.
The results are negative, as expected (see below). Yet I am used to a positive value for the RMSE (the lower, the better), so is it correct to say the RMSE of the model is +0.0725, or am I missing something?
from sklearn.model_selection import KFold, cross_val_score
import numpy as np

# new_model, X_normalized and y_for_normalized are defined earlier
crossvalidation_Decision_Trees = KFold(n_splits=4, random_state=0, shuffle=True)
model2 = new_model.fit(X_normalized, y_for_normalized)
scores_D_Trees = cross_val_score(model2, X_normalized, y_for_normalized,
                                 scoring='neg_root_mean_squared_error',
                                 cv=crossvalidation_Decision_Trees, n_jobs=1)
print("\n\nDecision Trees: RMSE for every fold: " + str(scores_D_Trees))
print('\033[1m' + "Decision Trees" + '\033[0m' + ": Average RMSE for all the folds: "
      + str(np.mean(scores_D_Trees)) + ", STD: " + str(np.std(scores_D_Trees)))
Results:
Decision Trees: RMSE for every fold: [-0.0413202 -0.08435709 -0.08474064 -0.07967769]
Decision Trees: Average RMSE for all the folds: -0.07252390274931717, STD: 0.01812540303759248
Solution
For sklearn model-selection routines, the greater the score, the better. Error metrics such as MSE and RMSE work the other way around (lower is better), so sklearn uses them with a negative sign.
Standard scorers are built with make_scorer() (see sklearn/metrics/_scorer.py):
neg_root_mean_squared_error_scorer = make_scorer(
    mean_squared_error, greater_is_better=False, squared=False
)
...
sign = 1 if greater_is_better else -1
...
return self._sign * self._score_func(...)
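
To see the sign flip in isolation, here is a minimal sketch. The toy data and DummyRegressor are purely illustrative, and it assumes a sklearn version that still accepts squared=False, as in the quoted source (newer versions provide root_mean_squared_error instead):

import numpy as np
from sklearn.dummy import DummyRegressor
from sklearn.metrics import make_scorer, mean_squared_error

X = np.arange(10).reshape(-1, 1)   # toy data, purely illustrative
y = np.arange(10, dtype=float)
model = DummyRegressor(strategy="mean").fit(X, y)

# Built the same way as the built-in neg_root_mean_squared_error scorer
neg_rmse = make_scorer(mean_squared_error, greater_is_better=False, squared=False)

print(neg_rmse(model, X, y))                                   # negated RMSE, e.g. -2.87...
print(mean_squared_error(y, model.predict(X), squared=False))  # same magnitude, positive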
So yes: nothing happens there but multiplying the result of mean_squared_error(squared=False) by -1. You can safely report the RMSE of your model as +0.0725 (the lower, the better).
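
In practice, you just flip the sign of the array returned by cross_val_score to report positive RMSE values:

rmse_scores = -scores_D_Trees   # undo sklearn's sign convention
print("RMSE for every fold:", rmse_scores)
print("Average RMSE:", rmse_scores.mean(), "STD:", rmse_scores.std())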
Answered By - dx2-66