Issue
I'm working on a multilabel classification problem using an SVM classifier in Python. After training, I want to test and find the samples the algorithm is least confident about, i.e., the samples that are closest to the decision boundary. I can do this with scikit-learn's decision_function(X), which predicts confidence scores for samples. However, how do I determine which one is closest to the decision boundary? The ones with the lowest values?
My code is below:

from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC

svc = SVC()  # base binary SVM, wrapped once per label
clf = OneVsRestClassifier(svc)
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)
df = clf.decision_function(X_test)  # one confidence score per class
print(df[0])
print(y_pred[0])
I get the following output:
[ 0.77338405 0.65244097 -0.73863779 -0.59712787 -0.78753861
-0.91293626 0.0031544 ]
[1 1 0 0 0 0 1]
In this case, which classes is the algorithm least certain of? The ones with scores -0.59712787 and 0.0031544?
Solution
Yes. The scores closest to zero correspond to the classes nearest the decision boundary, i.e., the least confident predictions. If a score is negative, the sample lies on the negative side of that class's boundary (label predicted as 0); if it is positive, it lies on the positive side (label predicted as 1). The magnitude of the score reflects how far the sample is from the boundary.
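To make this concrete, here is a minimal sketch of ranking test samples by uncertainty. The dataset, sample counts, and variable names (`uncertainty`, `ranked`) are hypothetical; only `OneVsRestClassifier` and `decision_function` come from the question.

```python
import numpy as np
from sklearn.datasets import make_multilabel_classification
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import LinearSVC

# Hypothetical multilabel data mirroring the question's 7-class setup.
X, y = make_multilabel_classification(
    n_samples=300, n_classes=7, n_labels=3, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = OneVsRestClassifier(LinearSVC())
clf.fit(X_train, y_train)

df = clf.decision_function(X_test)  # shape: (n_samples, n_classes)

# Per sample: the class whose score is closest to zero (least confident label).
least_confident_class = np.abs(df).argmin(axis=1)

# Per sample: overall uncertainty = smallest absolute score across classes.
uncertainty = np.abs(df).min(axis=1)

# Test-set indices sorted from least to most confident.
ranked = np.argsort(uncertainty)
print(ranked[:5])  # the 5 samples nearest a decision boundary
```

Taking the absolute value first is the key step: both -0.59712787 and 0.0031544 are "close" to the boundary only in the sense of small magnitude, so 0.0031544 is the least certain of the two.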
Answered By - der Fotik