Issue
Exploring some classification models in Scikit learn I noticed that the scores I got for log loss and for ROC AUC were consistently lower while performing cross validation than while fitting and predicting on the whole training set (done to check for overfitting), thing that did not make sense to me.
Specifically, using cross_validate
I set the scorings as ['neg_log_loss', 'roc_auc']
and while performing manual fitting and prediction on the training set I used the metric functions log_loss'
and roc_auc_score
.
To try to figure out what was happening, i wrote a code to perform the cross validation manually in order to be able to call the metric functions manually on the various folds and compare the results with the ones from cross_validate
. As you can see below, I got different results even like this!
from sklearn.model_selection import StratifiedKFold
kf = KFold(n_splits=3, random_state=42, shuffle=True)
log_reg = LogisticRegression(max_iter=1000)
for train_index, test_index in kf.split(dataset, dataset_labels):
X_train, X_test = dataset[train_index], dataset[test_index]
y_train, y_test = dataset_labels_np[train_index], dataset_labels_np[test_index]
log_reg.fit(X_train, y_train)
pr = log_reg.predict(X_test)
ll = log_loss(y_test, pr)
print(ll)
from sklearn.model_selection import cross_val_score
cv_ll = cross_val_score(log_reg, dataset_prepared_stand, dataset_labels, scoring='neg_log_loss',
cv=KFold(n_splits=3, random_state=42, shuffle=True))
print(abs(cv_ll))
Outputs:
4.795481869275026
4.560119170517534
5.589818973403791
[0.409817 0.32309 0.398375]
The output running the same code for ROC AUC are:
0.8609669592272686
0.8678563239907938
0.8367147503682851
[0.925635 0.94032 0.910885]
To be sure to have written the code right, I also tried the code using 'accuracy'
as scoring for cross validation and accuracy_score
as metric function and the results are instead consistent:
0.8611584327086882
0.8679727427597955
0.838160136286201
[0.861158 0.867973 0.83816 ]
Can someone explain me why the results in the case of the log loss and the ROC AUC are different? Thanks!
Solution
Log-loss and auROC both need probability predictions, not the hard class predictions. So change
pr = log_reg.predict(X_test)
to
pr = log_reg.predict_proba(X_test)[:, 1]
(the subscripting is to grab the probabilities for the positive class, and assumes you're doing binary classification).
Answered By - Ben Reiniger
0 comments:
Post a Comment
Note: Only a member of this blog may post a comment.