Wednesday, March 2, 2022

[FIXED] LightGBMError: Do not support special JSON characters in feature name - The same code is working in jupyter but doesn't work in Spyder

Tags: jupyter, lightgbm, python, spyder

Issue

I have the following code:

    most_important = features_importance_chi(importance_score_tresh, 
    df_user.drop(columns = 'CHURN'),churn)
    X = df_user.drop(columns = 'CHURN')
    churn[churn==2] = 1
    y = churn

    # handle undersample problem
    X,y = handle_undersampe(X,y)

    # train the model

    X=X.loc[:,X.columns.isin(most_important)].values
    y=y.values

    parameters = {
    'application': 'binary',
    'objective': 'binary',
    'metric': 'auc',
    'is_unbalance': 'true',
    'boosting': 'gbdt',
    'num_leaves': 31,
    'feature_fraction': 0.5,
    'bagging_fraction': 0.5,
    'bagging_freq': 20,
    'learning_rate': 0.05,
    'verbose': 0
    }

    # split data
    x_train, x_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42, stratify=y)

    train_data = lightgbm.Dataset(x_train, label=y_train)
    test_data = lightgbm.Dataset(x_test, label=y_test)
    model = lightgbm.train(parameters,
                       train_data,
                       valid_sets=[train_data, test_data], 
                       feature_name=most_important,
                       num_boost_round=5000,
                       early_stopping_rounds=100)

and the function that returns the most_important list:

    import pandas as pd
    from sklearn.ensemble import ExtraTreesClassifier

    def features_importance_chi(importance_score_tresh, X, Y):
        # Rank the features with a small extra-trees ensemble
        model = ExtraTreesClassifier(n_estimators=10)
        model.fit(X, Y.values.ravel())
        feature_list = pd.Series(model.feature_importances_,
                                 index=X.columns)
        # Keep only the column names whose importance exceeds the threshold
        feature_list = feature_list[feature_list > importance_score_tresh]
        return feature_list.index.values.tolist()

The funny thing is that in Spyder this code raises the following error:

    LightGBMError: Do not support special JSON characters in feature name.

but in Jupyter it works fine, and I am able to print the list of most important features.

Any idea what could be the reason for this error?

Solution

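This message comes up often with LightGBM models such as LGBMClassifier(). Add the lines below right after you load the data into a pandas DataFrame (df here; df_user in the question), before any feature names are taken from its columns, and the problem goes away. They strip every character that is not a letter, a digit or an underscore from the column names:

    import re
    df = df.rename(columns = lambda x:re.sub('[^A-Za-z0-9_]+', '', x))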
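One thing to watch with the one-liner: once characters are stripped, two different column names can collapse into the same string. The sketch below uses the same character whitelist to first list the offending names and then clean them while keeping the results unique. Both helper functions and the sample column names are hypothetical, written only for illustration; they are not part of LightGBM or pandas.

```python
import re

# Same whitelist as the one-liner above: letters, digits and underscore.
_DISALLOWED = re.compile(r"[^A-Za-z0-9_]+")


def find_offending_names(names):
    """Return the names containing at least one character outside the whitelist."""
    return [str(n) for n in names if _DISALLOWED.search(str(n))]


def sanitize_feature_names(names):
    """Strip disallowed characters and keep the resulting names unique.

    Hypothetical helper for illustration only.
    """
    # A name made up entirely of disallowed characters falls back to "feature".
    cleaned = [_DISALLOWED.sub("", str(n)) or "feature" for n in names]
    used = set()
    unique = []
    for name in cleaned:
        candidate, i = name, 0
        # Add a numeric suffix until the candidate no longer clashes.
        while candidate in used:
            i += 1
            candidate = f"{name}_{i}"
        used.add(candidate)
        unique.append(candidate)
    return unique


columns = ["age", "plan:type", "plan type", "calls[30d]"]  # made-up names
print(find_offending_names(columns))
print(sanitize_feature_names(columns))
```

With a DataFrame, the cleaned list can be assigned back with `df_user.columns = sanitize_feature_names(df_user.columns)`. In the question's code this would need to happen before `features_importance_chi` is called, so that `most_important` already holds the cleaned names that are later passed as `feature_name`.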

Answered By - Wojciech Moszczyński
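The answer does not say why the same code behaves differently in Jupyter and Spyder. One possibility, which is an assumption and not something confirmed in the thread, is that the two IDEs run different Python environments with different LightGBM versions installed. A minimal sketch for checking this is to run the snippet below in both IDEs and compare the output; `describe_environment` is a hypothetical helper name.

```python
import sys
from importlib import metadata


def describe_environment(packages=("lightgbm", "pandas", "numpy")):
    """Collect the interpreter path and the installed versions of a few packages."""
    info = {"python": sys.version.split()[0], "executable": sys.executable}
    for pkg in packages:
        try:
            info[pkg] = metadata.version(pkg)
        except metadata.PackageNotFoundError:
            info[pkg] = None  # package is not installed in this environment
    return info


for key, value in describe_environment().items():
    print(f"{key}: {value}")
```

If the interpreter paths or the LightGBM versions differ, the two IDEs are not running the same installation, which would be worth ruling out before looking further.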

This answer was collected from Stack Overflow and tested by PythonFixing community admins. It is licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
