Issue
Take this simple code:
import lightgbm as lgb
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# X, y: my dataset's features and target
X_train, X_test, y_train, y_test = train_test_split(X, y)
lgbr = lgb.LGBMRegressor(n_estimators=500, learning_rate=0.15)
model_lgb = lgbr.fit(X_train, y_train)
r = permutation_importance(model_lgb, X_test, y_test, n_repeats=30, random_state=0)
for i in r.importances_mean.argsort()[::-1]:
    print(f"{i} {r.importances_mean[i]:.3f} +/- {r.importances_std[i]:.3f}")
When I run this on my dataset, the top value is about 1.20.
But I thought that the permutation_importance mean for a feature was the amount that the score was changed on average by permuting the feature column, so this can't be more than 1, can it?
What am I missing?
(I get the same issue if I replace lightgbm with xgboost, so I don't think it is specific to the particular regression method.)
Solution
But I thought that the permutation_importance mean for a feature was the amount that the score was changed on average by permuting the feature column[...]
Correct.
so this can't be more than 1, can it?
That depends on whether the score can "worsen" by more than 1. The default for the scoring parameter of permutation_importance is None, which uses the model's score function. For LGBMRegressor (and most regressors), that's the R² score, which has a maximum of 1 but can take arbitrarily large negative values, so indeed the score can worsen by an arbitrarily large amount.
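To make this concrete, here is a minimal sketch (not from the original question; it uses synthetic data and scikit-learn's LinearRegression instead of LightGBM for a deterministic illustration). The single feature fully determines the target, so the baseline R² is essentially 1, but permuting that feature pushes the R² on shuffled data well below 0, and the reported importance exceeds 1:
# Minimal sketch with hypothetical synthetic data: permutation importance can
# exceed 1 because permuting an informative feature can push R² far below 0.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.inspection import permutation_importance

rng = np.random.RandomState(0)
X = rng.normal(size=(200, 1))
y = 10 * X[:, 0]  # the target depends entirely on the single feature

model = LinearRegression().fit(X, y)  # baseline R² is essentially 1.0
r = permutation_importance(model, X, y, n_repeats=30, random_state=0)
print(r.importances_mean)  # roughly 2.0: baseline R² (1) minus permuted R² (about -1)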
Answered By - Ben Reiniger