Issue
I am using cross_validate from sklearn, and it works fine for several models (GaussianNB, RandomForestClassifier, KNeighborsClassifier, GradientBoostingClassifier and XGBClassifier), but with SVC it returns nan. Here's my code and the things I've tried.
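import pandas as pd
from sklearn import model_selection
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from xgboost import XGBClassifier

models = [('GaussianNB', GaussianNB()), ('RandomForest', RandomForestClassifier()), ('KNN', KNeighborsClassifier()),
          ('SVM', SVC()), ('GradientBoosting', GradientBoostingClassifier()), ('XGB', XGBClassifier(eval_metric='mlogloss'))]
scoring = ['accuracy', 'precision_weighted', 'recall_weighted', 'f1_weighted', 'roc_auc_ovr_weighted']
dfs = []  # one results frame per model
for name, model in models:
    kfold = model_selection.KFold(n_splits=5, shuffle=True, random_state=0)
    cv_results = model_selection.cross_validate(model, X_train, y_train.values, cv=kfold, scoring=scoring)
    this_df = pd.DataFrame(cv_results)
    this_df['model'] = name
    dfs.append(this_df)
final = pd.concat(dfs, ignore_index=True)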
The output is shown in the picture below; there are values for all models except SVC.
Scoring a single metric on its own, like this, does output values, but the same model gives nan inside cross_validate:
>>> model_selection.cross_val_score(SVC(), X_train, y_train.values, cv=kfold, scoring='recall_weighted')
array([0.58930041, 0.59506173, 0.59060956, 0.61532125, 0.62685338])
I've also tried converting to a DataFrame, but I get the same result.
Solution
For anyone facing the same issue: you need to set probability=True on the model, i.e. SVC(probability=True).
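Here is a minimal sketch of that fix. It is not the original poster's data: the iris dataset stands in for X_train / y_train (an assumption, just so the snippet runs on its own), and the scoring list is shortened.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import KFold, cross_validate
from sklearn.svm import SVC

# Stand-in data (assumption): iris replaces the X_train / y_train from the question.
X, y = load_iris(return_X_y=True)

scoring = ['accuracy', 'recall_weighted', 'roc_auc_ovr_weighted']
kfold = KFold(n_splits=5, shuffle=True, random_state=0)

# probability=True makes SVC expose predict_proba, which the ROC AUC scorer calls.
model = SVC(probability=True, random_state=0)
cv_results = cross_validate(model, X, y, cv=kfold, scoring=scoring)

# Print the per-fold test scores for each requested metric.
for metric in scoring:
    print(metric, cv_results[f'test_{metric}'])
```

Note that probability=True makes fitting slower, since SVC runs an extra internal cross-validation to calibrate the probabilities.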
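It seems that sklearn carries an error from any one scorer over to all of the scores. Here the failing scorer is roc_auc_ovr_weighted, which needs class probabilities (predict_proba), and SVC only provides them when probability=True.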
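If you want to see the underlying error instead of silent nan values, one option is to pass error_score='raise' to cross_validate. A rough sketch, again using iris as a stand-in for the real data (an assumption); the exact exception type and message may differ between scikit-learn versions, so it is only printed here:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import KFold, cross_validate
from sklearn.svm import SVC

# Stand-in data (assumption): iris replaces the X_train / y_train from the question.
X, y = load_iris(return_X_y=True)
kfold = KFold(n_splits=5, shuffle=True, random_state=0)

error_message = None
try:
    # A plain SVC() has no predict_proba, so the ROC AUC scorer cannot run.
    # error_score='raise' re-raises the failure instead of recording nan.
    cross_validate(SVC(), X, y, cv=kfold,
                   scoring=['accuracy', 'roc_auc_ovr_weighted'],
                   error_score='raise')
except Exception as exc:
    error_message = f'{type(exc).__name__}: {exc}'

print(error_message)
```

The printed message should point at the missing predict_proba, which is what leads to the probability=True fix.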
Answered By - David Hernandez