Tuesday, February 1, 2022

TypeError: Unknown type of parameter:boosting_type, got:dict

Tags: lightgbm, machine-learning, nlp, python, scikit-learn

Issue

I'm trying to train a LightGBM model on a dataset consisting of numerical, categorical and textual data. However, fitting the following pipeline fails during the training phase:

params = {
'num_class':5,
'max_depth':8,
'num_leaves':200,
'learning_rate': 0.05,
'n_estimators':500
}

clf = LGBMClassifier(params)
data_processor = ColumnTransformer([
    ('numerical_processing', numerical_processor, numerical_features),
    ('categorical_processing', categorical_processor, categorical_features),
    ('text_processing_0', text_processor_1, text_features[0]),
    ('text_processing_1', text_processor_1, text_features[1])
                                    ]) 
pipeline = Pipeline([
    ('data_processing', data_processor),
    ('lgbm', clf)
                    ])
pipeline.fit(X_train, y_train)

and the error is:

TypeError: Unknown type of parameter:boosting_type, got:dict

I basically have two textual features, both some form of names, on which I'm mainly performing stemming.

Any pointers would be highly appreciated.

Solution

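The classifier is constructed incorrectly, and that alone causes the error. LGBMClassifier(params) passes the whole dict as the first positional argument, which is boosting_type, hence the message. You can reproduce it without the pipeline: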

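import numpy as np
from lightgbm import LGBMClassifier

params = {
'num_class':5,
'max_depth':8,
'num_leaves':200,
'learning_rate': 0.05,
'n_estimators':500
}

# The dict is passed positionally, so it lands in boosting_type
clf = LGBMClassifier(params)
clf.fit(np.random.uniform(0,1,(50,2)),np.random.randint(0,5,50))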

This gives you the same error:

TypeError: Unknown type of parameter:boosting_type, got:dict

Unpack the dict into keyword arguments instead:

clf = LGBMClassifier(**params)
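
To see why the two calls differ, here is a minimal sketch that needs no LightGBM install. make_classifier is a hypothetical stand-in, not part of any library; it only mirrors the first few parameters of LGBMClassifier (boosting_type, num_leaves, max_depth, learning_rate, n_estimators) and returns the settings it received.

```python
def make_classifier(boosting_type='gbdt', num_leaves=31, max_depth=-1,
                    learning_rate=0.1, n_estimators=100, **kwargs):
    """Hypothetical stand-in mirroring the start of LGBMClassifier's signature."""
    settings = {
        'boosting_type': boosting_type,
        'num_leaves': num_leaves,
        'max_depth': max_depth,
        'learning_rate': learning_rate,
        'n_estimators': n_estimators,
    }
    # Extra keyword arguments (such as num_class) are kept as well
    settings.update(kwargs)
    return settings


params = {
    'num_class': 5,
    'max_depth': 8,
    'num_leaves': 200,
    'learning_rate': 0.05,
    'n_estimators': 500,
}

# Positional: the whole dict is bound to the first parameter, boosting_type
wrong = make_classifier(params)
print(type(wrong['boosting_type']).__name__)  # dict

# Unpacked: each key is matched to the parameter of the same name
right = make_classifier(**params)
print(right['boosting_type'], right['num_leaves'])  # gbdt 200
```

With the positional call, every other setting silently keeps its default, which is why the error mentions only boosting_type.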

With that change, a small example pipeline runs:

import numpy as np
import pandas as pd
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler, OneHotEncoder
from sklearn.compose import ColumnTransformer

numerical_processor = StandardScaler()
categorical_processor = OneHotEncoder()
numerical_features = ['A']
categorical_features = ['B']

data_processor = ColumnTransformer([('numerical_processing', numerical_processor, numerical_features),
('categorical_processing', categorical_processor, categorical_features)])

# size=100 draws 100 values; uniform(100) would return a single scalar
X_train = pd.DataFrame({'A':np.random.uniform(size=100),
'B':np.random.choice(['j','k'],100)})

y_train = np.random.randint(0,5,100)

pipeline = Pipeline([('data_processing', data_processor),('lgbm', clf)])

pipeline.fit(X_train, y_train)
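
The example above leaves out the two text columns from the question. Below is a rough sketch of how they might be wired into the same ColumnTransformer. The question does not show text_processor_1, so TfidfVectorizer is used here as an assumption. The column names name_0 and name_1 and the sample strings are made up for illustration. Note that a text vectorizer expects a 1-D sequence of strings, so its column is selected with a plain string rather than a list, as the question already does with text_features[0].

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.preprocessing import OneHotEncoder, StandardScaler

rng = np.random.default_rng(0)
n = 100

# Hypothetical data: one numerical, one categorical and two text columns
X_demo = pd.DataFrame({
    'A': rng.uniform(size=n),
    'B': rng.choice(['j', 'k'], n),
    'name_0': rng.choice(['alpha beta', 'gamma delta', 'beta gamma'], n),
    'name_1': rng.choice(['red fox', 'blue fox', 'red owl'], n),
})

text_demo_processor = ColumnTransformer([
    # List selectors pass a 2-D frame to the transformer
    ('numerical_processing', StandardScaler(), ['A']),
    ('categorical_processing', OneHotEncoder(), ['B']),
    # String selectors pass a 1-D column, which TfidfVectorizer requires
    ('text_processing_0', TfidfVectorizer(), 'name_0'),
    ('text_processing_1', TfidfVectorizer(), 'name_1'),
])

features = text_demo_processor.fit_transform(X_demo)
print(features.shape[0])  # one row per sample
```

A processor built this way can take the place of data_processor in the pipeline, followed by the ('lgbm', clf) step as before.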

Answered By - StupidWolf

This answer was collected from Stack Overflow and tested by PythonFixing community admins; it is licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
