Issue
In the sklearn LogisticRegression classifier, we can set the multi_class
option to 'ovr', which stands for one-vs-rest, as in the following code snippet:
# logistic regression for multi-class classification using built-in one-vs-rest
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
# define dataset
X, y = make_classification(n_samples=1000, n_features=10, n_informative=5, n_redundant=5, n_classes=3, random_state=1)
# define model
model = LogisticRegression(multi_class='ovr')
# fit model
model.fit(X, y)
Now, this classifier can assign probabilities to different classes for given instances:
# make predictions
yhat = model.predict_proba(X)
The probabilities sum to 1 for each instance:
array([[0.16973178, 0.46755188, 0.36271634],
[0.58228627, 0.0928127 , 0.32490103],
[0.28241256, 0.51175978, 0.20582766],
...,
[0.17922774, 0.71300755, 0.10776471],
[0.05888508, 0.24924809, 0.69186683],
[0.25808835, 0.68599321, 0.05591844]])
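As a quick check (assuming the model and yhat from the snippets above), summing each row confirms this:
# verify that the predicted probabilities sum to 1 for each instance
import numpy as np
print(np.allclose(yhat.sum(axis=1), 1.0))  # prints True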
My question: in the one-vs-rest method, a separate binary classifier is trained for each class, so we would expect each class probability to be independent of the others. How are the probabilities normalized so that they sum to 1?
Solution
As you can see in the scikit-learn source (LinearClassifierMixin._predict_proba_lr), the multiclass case is handled by normalizing the score of each class for the instance x over all classes: the estimated probability that instance x belongs to class k is
P(y = k | x) = sigma(f_k(x)) / sum_{j=1}^{K} sigma(f_j(x)),
with f the decision function, sigma the logistic sigmoid, and K the number of classes.
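As a minimal sketch of this normalization (assuming the fitted model and data X from the question, and using scipy.special.expit as the logistic sigmoid), the output of predict_proba for multi_class='ovr' can be reproduced directly from the decision function:
# reproduce predict_proba for multi_class='ovr':
# take the per-class sigmoid scores and normalize them over all classes
import numpy as np
from scipy.special import expit  # logistic sigmoid

scores = model.decision_function(X)                       # f_k(x), shape (n_samples, K)
per_class = expit(scores)                                 # independent one-vs-rest probabilities
probs = per_class / per_class.sum(axis=1, keepdims=True)  # normalize so each row sums to 1

print(np.allclose(probs, model.predict_proba(X)))         # prints True
Each one-vs-rest classifier produces its own probability via the sigmoid of its decision function; dividing by the row sum is what makes the K values comparable and sum to 1.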
Answered By - amiola