Issue
When using solver='liblinear'
on a multiclass classification problem, logistic regression uses the one-vs-rest strategy. Does that mean there should be n_classes
binary classifiers/estimators under the hood? If so, how can I access them?
I have read the documentation but could not find a way to do this.
Solution
It looks like there is no easy way to access those sub-models directly. However, you can reconstruct them from model.coef_
and model.intercept_
, as follows:
from sklearn.linear_model import LogisticRegression
from sklearn import datasets
import numpy as np

X_train, y_train = datasets.load_iris(return_X_y=True)

model = LogisticRegression(
    penalty="l1",
    multi_class="ovr",
    class_weight="balanced",
    solver="liblinear",
)
model.fit(X_train, y_train)

n_labels = len(np.unique(y_train))
for i in range(n_labels):
    # Rebuild the i-th binary classifier from the fitted coefficients
    sub_model = LogisticRegression(penalty=model.penalty, C=model.C)
    sub_model.coef_ = model.coef_[i].reshape(1, -1)
    sub_model.intercept_ = model.intercept_[i].reshape(-1, 1)
    sub_model.classes_ = np.array([0, 1])
    # Binarize the labels: 1 for class i, 0 for every other class
    y_train_ovr = np.where(y_train == i, 1, 0)
    score = sub_model.score(X_train, y_train_ovr)
    print(f"OVR for label={i}, score={score:.4f}")
Output:
OVR for label=0, score=1.0000
OVR for label=1, score=0.7333
OVR for label=2, score=0.9667
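To sanity-check that the reconstructed sub-models really are the internal one-vs-rest classifiers, you can compare each sub-model's decision function against the corresponding column of the full model's decision function; they should match exactly. A minimal, self-contained sketch (liblinear is one-vs-rest for multiclass by default, so multi_class is omitted here):

```python
import numpy as np
from sklearn import datasets
from sklearn.linear_model import LogisticRegression

X, y = datasets.load_iris(return_X_y=True)

# liblinear uses one-vs-rest on multiclass problems
model = LogisticRegression(penalty="l1", class_weight="balanced",
                           solver="liblinear")
model.fit(X, y)

for i in range(len(model.classes_)):
    # Rebuild the i-th binary sub-model from the fitted parameters
    sub_model = LogisticRegression()
    sub_model.coef_ = model.coef_[i].reshape(1, -1)
    sub_model.intercept_ = model.intercept_[i].reshape(-1, 1)
    sub_model.classes_ = np.array([0, 1])
    # Column i of the multiclass decision function is exactly
    # the sub-model's binary decision function
    assert np.allclose(model.decision_function(X)[:, i],
                       sub_model.decision_function(X))

print("all sub-model decision functions match")
```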
This code creates a new LogisticRegression()
for each label from the original model's coefficients, intercepts, C and penalty. Finally, the y_train labels are binarized (1 for the current class, 0 otherwise) to represent the OVR
task.
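Alternatively, if refitting is an option, you can make the sub-models directly accessible by wrapping the binary estimator in sklearn.multiclass.OneVsRestClassifier, which stores one fitted binary classifier per class in its estimators_ attribute. Note the scores may differ slightly from liblinear's built-in OVR, since each binary problem is fitted independently. A sketch:

```python
import numpy as np
from sklearn import datasets
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

X, y = datasets.load_iris(return_X_y=True)

# OneVsRestClassifier fits one copy of the base estimator per class
# and exposes them in .estimators_
ovr = OneVsRestClassifier(LogisticRegression(solver="liblinear"))
ovr.fit(X, y)

for label, est in zip(ovr.classes_, ovr.estimators_):
    y_bin = np.where(y == label, 1, 0)
    print(f"OVR for label={label}, score={est.score(X, y_bin):.4f}")
```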
Answered By - Antoine Dubuis