Issue
I have a LightGBM classifier and want to plot its feature importances. For that, I used the lines below:
import lightgbm as lgb
lgb.plot_importance(model, figsize=(8,6))
but I get this error:
TypeError: booster must be Booster or LGBMModel.
Can someone help, please? I do not understand this error.
Here is how I built the model:
from lightgbm import LGBMClassifier
from sklearn.pipeline import Pipeline

lgbmc_classifier = LGBMClassifier(
    boosting_type='goss',
    colsample_bytree=0.17,
    lambda_l1=69.12,
    lambda_l2=0.0001,
    learning_rate=0.09,
    max_bin=512,
    min_child_samples=3,
    n_estimators=7782,
    num_leaves=10,
    subsample=0.26,
    random_state=356,
    importance_type='gain',
)
model = Pipeline([
    ("preprocessor", preprocessor),
    ("standardizer", standardizer),
    ("classifier", lgbmc_classifier),
])
model.fit(X_train, y_train)
For data processing in the model pipeline, I also use:
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# `transformers` is a list defined earlier (not shown); TimestampTransformer is a custom class
for col in ["Time_Taken_hours", "Date"]:
    timestamp_transformer = TimestampTransformer()
    ohe_transformer = ColumnTransformer(
        [("ohe", OneHotEncoder(sparse=False, handle_unknown="ignore"), [timestamp_transformer.HOUR_COLUMN_INDEX])],
        remainder="passthrough")
    timestamp_preprocessor = Pipeline([
        ("extractor", timestamp_transformer),
        ("onehot_encoder", ohe_transformer)
    ])
    transformers.append((f"timestamp_{col}", timestamp_preprocessor, [col]))

preprocessor = ColumnTransformer(transformers, remainder="passthrough", sparse_threshold=0)
standardizer = StandardScaler()
Solution
According to your error, model is not of the type that lgb.plot_importance expects, namely Booster or LGBMModel.
That makes sense, since you define model as
model = Pipeline([
    ("preprocessor", preprocessor),
    ("standardizer", standardizer),
    ("classifier", lgbmc_classifier),
])
Pipeline is a class, so this call creates a Pipeline object, which is neither a Booster nor an LGBMModel.
But you need an LGBMModel or a Booster. So how do you get one? You named the step classifier, so you can access the estimator with model["classifier"]. This is what you can use:
import lightgbm as lgb
from sklearn.pipeline import Pipeline
from sklearn.datasets import load_iris

X_train, y_train = load_iris(return_X_y=True)

lgbmc_classifier = lgb.LGBMClassifier(
    boosting_type='goss',
    learning_rate=0.01,
    min_child_samples=10,
    n_estimators=100,
    num_leaves=10,
    random_state=1,
    importance_type='gain',
)
lgbmc_classifier.fit(X_train, y_train)

model = Pipeline([
    ("classifier", lgbmc_classifier)
])
lgb.plot_importance(model['classifier'], figsize=(4,2))
This returns a matplotlib Axes showing a horizontal bar chart of the feature importances.
Note that I tweaked your regularization parameters a bit to get meaningful results on the iris dataset. I also removed the other two Pipeline elements, since I don't know how you use them, but that doesn't matter for the result in this case.
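You could also extract the importances yourself and match them with the feature columns, e.g. using plotly. This assumes X_train is a pandas DataFrame, so that X_train.columns exists:
import pandas as pd
import plotly.express as px

# With the iris example, load the data as a DataFrame first:
# X_train, y_train = load_iris(return_X_y=True, as_frame=True)
importances = pd.DataFrame({"feature_names": X_train.columns, "values": lgbmc_classifier.feature_importances_})
fig = px.bar(importances, x="feature_names", y="values", width=400, height=400)
fig.show()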
Or maybe you wanted a pie chart instead:
fig = px.pie(importances, names="feature_names", values="values" , width=400, height=400)
fig.show()
Answered By - tavdp