Sunday, September 4, 2022

[FIXED] ROC_Curve multiclass format is not supported

Tags: python, roc, scikit-learn

Issue

I am trying to generate a ROC curve based on predictions from a classifier using the two best performing features in the data set. I am encountering a ValueError: multiclass format is not supported.

This is the code that raises the error, specifically the third line from the bottom.

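from sklearn.metrics import roc_curve, auc

# Per-class decision scores, shape (n_samples, nclasses)
y_score = best_clf.decision_function(X_test[:,[best_f1, best_f2]])
fpr = dict()
tpr = dict()
roc_auc = dict()
for i in range(nclasses):
    # The next line raises the ValueError: y_test still holds multiclass labels
    fpr[i], tpr[i], _ = roc_curve(y_test, y_score[:, i])
    roc_auc[i] = auc(fpr[i], tpr[i])
    print("AUC:", roc_auc[i])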

The two best feature indices (best_f1 and best_f2) are selected here, based on the mean per-class F1 score:

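import numpy as np
from sklearn.linear_model import SGDClassifier
from sklearn.metrics import confusion_matrix, f1_score, precision_score, recall_score
from sklearn.model_selection import cross_val_predict

best_mean_f1 = 0  # best mean per-class F1 score seen so far
best_f1 = 0       # index of the first feature in the best pair
best_f2 = 0       # index of the second feature in the best pair

for f1 in range(0,13):
    for f2 in range(0, 13):
        # Test the feature combinations here...
        if f1 == f2:
            continue

        features_idx_to_use = [f1, f2]

        clf = SGDClassifier(alpha = 0.001, max_iter = 100, random_state = 42)
        clf.fit(X_train[:,[f1, f2]], y_train)

        # Out-of-fold predictions from 3-fold cross-validation
        y_predicted = cross_val_predict(clf, X_train[:, features_idx_to_use], y_train, cv = 3)

        conf_mat_train = confusion_matrix(y_train, y_predicted)

        # Per-class recall, precision, and F1, in that order
        print("CV Train:",f1,":",f2," - ", recall_score(y_train, y_predicted, average = None))
        print("CV Train:",f1,":",f2," - ", precision_score(y_train, y_predicted, average = None))
        print("CV Train:",f1,":",f2," - ", f1_score(y_train, y_predicted, average = None))

        # Keep the feature pair with the highest mean per-class F1
        current_f1 = np.mean(f1_score(y_train, y_predicted, average = None))
        if current_f1 > best_mean_f1:
            best_f1 = f1
            best_f2 = f2
            best_mean_f1 = current_f1
            best_clf = clf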

I haven't been able to find anything that helps me solve this so far. Any help would be appreciated.

Solution

roc_curve only accepts binary labels. If this is truly multiclass, you're likely looking for sklearn.preprocessing.label_binarize(), which turns y_test into one 0/1 column per class:

fpr[i], tpr[i], _ = roc_curve(label_binarize(y_test, classes=range(nclasses))[:,i], y_score[:,i])
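
For context, here is a minimal end-to-end sketch of that one-vs-rest approach. It rests on several assumptions not found in the question: scikit-learn's wine dataset (13 features, 3 classes) stands in for the asker's data, the feature pair 0 and 6 is a hypothetical placeholder for whatever the search loop selects, and the train/test split parameters are arbitrary. Passing best_clf.classes_ to label_binarize keeps the binarized columns in the same order as the columns of decision_function.

```python
from sklearn.datasets import load_wine
from sklearn.linear_model import SGDClassifier
from sklearn.metrics import auc, roc_curve
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import label_binarize

# Assumption: the wine dataset stands in for the asker's data.
X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

# Hypothetical feature pair; the question picks these with its own search loop.
best_f1, best_f2 = 0, 6
cols = [best_f1, best_f2]

best_clf = SGDClassifier(alpha=0.001, max_iter=100, random_state=42)
best_clf.fit(X_train[:, cols], y_train)

classes = best_clf.classes_
nclasses = len(classes)

# One column of decision scores per class: shape (n_samples, nclasses)
y_score = best_clf.decision_function(X_test[:, cols])

# One 0/1 indicator column per class, in the same column order as y_score
y_test_bin = label_binarize(y_test, classes=classes)

fpr, tpr, roc_auc = {}, {}, {}
for i in range(nclasses):
    # Each call now sees a binary problem: class i versus the rest
    fpr[i], tpr[i], _ = roc_curve(y_test_bin[:, i], y_score[:, i])
    roc_auc[i] = auc(fpr[i], tpr[i])
    print("AUC for class", classes[i], ":", roc_auc[i])
```

Note that this only applies with three or more classes. For a binary target, label_binarize returns a single column and decision_function returns a 1-D array, so roc_curve(y_test, y_score) can be called directly.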

Answered By - dx2-66

This answer was collected from Stack Overflow and tested by the PythonFixing community admins. It is licensed under CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.
