Issue
I am trying to generate a ROC curve based on predictions from a classifier using the two best performing features in the data set. I am encountering a ValueError: multiclass format is not supported.
This is the code that raises the error, specifically the line third from the bottom.
y_score = best_clf.decision_function(X_test[:,[best_f1, best_f2]])
fpr = dict()
tpr = dict()
roc_auc = dict()
for i in range(nclasses):
    fpr[i], tpr[i], _ = roc_curve(y_test, y_score[:, i])
    roc_auc[i] = auc(fpr[i], tpr[i])
    print("AUC:", roc_auc[i])
The best feature indices (best_f1 and best_f2) and the fitted classifier are selected here:
best_mean_f1 = 0
best_f1 = 0
best_f2 = 0
for f1 in range(0, 13):
    for f2 in range(0, 13):
        # Test the feature combinations here...
        if f1 == f2:
            continue
        features_idx_to_use = [f1, f2]
        clf = SGDClassifier(alpha = 0.001, max_iter = 100, random_state = 42)
        clf.fit(X_train[:, [f1, f2]], y_train)
        y_predicted = cross_val_predict(clf, X_train[:, features_idx_to_use], y_train, cv = 3)
        conf_mat_train = confusion_matrix(y_train, y_predicted)
        print("CV Train:", f1, ":", f2, " - ", recall_score(y_train, y_predicted, average = None))
        print("CV Train:", f1, ":", f2, " - ", precision_score(y_train, y_predicted, average = None))
        print("CV Train:", f1, ":", f2, " - ", f1_score(y_train, y_predicted, average = None))
        current_f1 = np.mean(f1_score(y_train, y_predicted, average = None))
        if current_f1 > best_mean_f1:
            best_f1 = f1
            best_f2 = f2
            best_mean_f1 = current_f1
            best_clf = clf
I haven't been able to find anything that helps me solve this so far. Any help would be appreciated.
Solution
If this is truly multiclass, you're likely looking for sklearn.preprocessing.label_binarize():
fpr[i], tpr[i], _ = roc_curve(label_binarize(y_test, classes=range(nclasses))[:,i], y_score[:,i])
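For context, roc_curve expects binary ground-truth labels, so passing the raw multiclass y_test raises the error; binarizing turns each class into its own one-vs-rest problem. The sketch below shows how the whole loop might look end to end. It is only an illustration under assumptions: the data is synthetic (make_classification with 13 features and 3 classes stands in for the real data set), and the feature pair 0 and 1 is a placeholder for the indices chosen by the search above.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier
from sklearn.metrics import auc, roc_curve
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import label_binarize

# Synthetic 3-class data standing in for the real data set (assumption).
X, y = make_classification(
    n_samples=600, n_features=13, n_informative=5, n_redundant=0,
    n_classes=3, n_clusters_per_class=1, random_state=42,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42
)

# Placeholder feature pair; the question selects these by cross-validation.
best_f1, best_f2 = 0, 1
best_clf = SGDClassifier(alpha=0.001, max_iter=100, random_state=42)
best_clf.fit(X_train[:, [best_f1, best_f2]], y_train)

classes = best_clf.classes_
nclasses = len(classes)

# Shape (n_samples, nclasses): one decision score per class.
y_score = best_clf.decision_function(X_test[:, [best_f1, best_f2]])

# Binarize once, outside the loop: column i is 1 where the true
# label equals classes[i] and 0 otherwise.
y_test_bin = label_binarize(y_test, classes=classes)

fpr, tpr, roc_auc = {}, {}, {}
for i in range(nclasses):
    # One-vs-rest ROC curve for class i.
    fpr[i], tpr[i], _ = roc_curve(y_test_bin[:, i], y_score[:, i])
    roc_auc[i] = auc(fpr[i], tpr[i])
    print("AUC for class", classes[i], ":", roc_auc[i])
```

Binarizing once before the loop avoids recomputing the indicator matrix on every iteration. Note that with only two classes, label_binarize returns a single column and decision_function returns a 1-D array, so the per-class indexing above applies only to three or more classes.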
Answered By - dx2-66