Friday, May 27, 2022

[FIXED] MLPRegressor is giving far worse predictions than other regressors

Labels: neural-network, python, python-3.x, regression, scikit-learn

Issue

I was testing out all of the sklearn regressors:

[compose.TransformedTargetRegressor(), AdaBoostRegressor(), BaggingRegressor(), ExtraTreesRegressor(), GradientBoostingRegressor(), RandomForestRegressor(), HistGradientBoostingRegressor(), LinearRegression(), Ridge(), RidgeCV(), SGDRegressor(), ARDRegression(), BayesianRidge(), HuberRegressor(), RANSACRegressor(), TheilSenRegressor(), PoissonRegressor(), TweedieRegressor(), PassiveAggressiveRegressor(), KNeighborsRegressor(), MLPRegressor(), svm.LinearSVR(), svm.NuSVR(), svm.SVR(), tree.DecisionTreeRegressor(), tree.ExtraTreeRegressor(), xgb.XGBRegressor(), xgb.XGBRFRegressor()]

on the iris dataset, and I'm confused about why MLPRegressor isn't working. I'm predicting the sepal length from the other three features, and every regressor with default hyperparameters has a test-set MAE of 0.25 to 0.34, except for MLPRegressor, which has an MAE of 1.0! I've tried scaling and hyperparameter tuning, but MLPRegressor is always wildly inaccurate.
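For reference, a comparison like the one described above could be sketched as follows. This is only a minimal illustration, not my original script: it uses four of the listed regressors rather than all of them, and the fixed random_state=0 and the results dictionary are assumptions added here so the run is repeatable.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.neural_network import MLPRegressor

iris = load_iris()
X = iris.data[:, 1:]  # sepal width, petal length, petal width
y = iris.data[:, 0]   # sepal length (the target)

# random_state=0 is an arbitrary choice to make the split repeatable.
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# A small subset of the regressors listed above, all with default settings.
regressors = [
    LinearRegression(),
    KNeighborsRegressor(),
    RandomForestRegressor(random_state=0),
    MLPRegressor(random_state=0),
]

results = {}
for reg in regressors:
    reg.fit(X_train, y_train)
    results[type(reg).__name__] = mean_absolute_error(y_test, reg.predict(X_test))

for name, mae in results.items():
    print(f"{name}: MAE = {mae:.3f}")
```

The exact numbers depend on the split and the seed, so they will not necessarily match the ranges quoted above.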

EDIT: After comparing eschibli's code to mine, I figured out that the problem was my scaler. I was using this code:

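from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
scaler.fit(X)  # note: fitted on the full X, before any train/test split
X = scaler.transform(X)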

Why is scaling the iris dataset making the MAE much worse?
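One way to investigate this is to compare scaled and unscaled inputs side by side while also checking how many iterations the network actually ran. The sketch below is an assumption-laden experiment, not a diagnosis: the evaluate helper, random_state=0, and the larger max_iter=2000 are all hypothetical choices made for illustration (200 is the library default). It fits the scaler inside a pipeline so that it only sees the training split.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

iris = load_iris()
X, y = iris.data[:, 1:], iris.data[:, 0]
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)


def evaluate(scale, max_iter):
    """Return (test MAE, iterations run) for one MLPRegressor configuration."""
    mlp = MLPRegressor(max_iter=max_iter, random_state=0)
    # With scale=True the scaler is fitted on the training data only.
    model = make_pipeline(StandardScaler(), mlp) if scale else mlp
    model.fit(X_train, y_train)
    mae = mean_absolute_error(y_test, model.predict(X_test))
    return mae, mlp.n_iter_


comparison = {}
for scale in (False, True):
    for max_iter in (200, 2000):  # 200 is the default; 2000 is arbitrary
        comparison[(scale, max_iter)] = evaluate(scale, max_iter)

for (scale, max_iter), (mae, n_iter) in comparison.items():
    print(f"scaled={scale!s:5} max_iter={max_iter:4d} -> MAE={mae:.3f} after {n_iter} iterations")
```

If a run stops exactly at max_iter, scikit-learn emits a ConvergenceWarning, which would suggest the optimizer had not finished training rather than that scaling itself hurts the model.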

Solution

While the default hyperparameters are wildly inappropriate for such a problem, I was able to obtain an MAE of 0.32 on the first run with them (varying the random seed produced values from 0.29 to 0.55 over five tries). I expect that choosing a (much!) smaller hidden layer, scaling the data, and/or tweaking the regularization parameter would produce much better and more consistent results.

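from sklearn.datasets import load_iris
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

iris = load_iris()
X = iris.data[:, 1:]  # sepal length is the first feature
y = iris.data[:, 0]

# No random_state is set, so the split and the result change on every run
X_train, X_test, y_train, y_test = train_test_split(X, y)
model = MLPRegressor()
model.fit(X_train, y_train)
y_pred = model.predict(X_test)
print(mean_absolute_error(y_test, y_pred))
# > 0.3256431821425728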
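The adjustments suggested above (a much smaller hidden layer, scaled inputs, and a different regularization strength) could be tried along these lines. This is only a sketch under assumed values: hidden_layer_sizes=(8,), alpha=0.01, and max_iter=5000 are arbitrary starting points rather than tuned settings, and the split is held fixed so that only the network's seed varies.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

iris = load_iris()
X, y = iris.data[:, 1:], iris.data[:, 0]

# Fixed split (arbitrary seed) so differences come from the network's seed only.
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

maes = []
for seed in range(5):
    model = make_pipeline(
        StandardScaler(),  # fitted on the training split only
        MLPRegressor(
            hidden_layer_sizes=(8,),  # assumed: one small hidden layer
            alpha=0.01,               # assumed: L2 penalty (default is 0.0001)
            max_iter=5000,            # assumed: generous iteration budget
            random_state=seed,
        ),
    )
    model.fit(X_train, y_train)
    maes.append(mean_absolute_error(y_test, model.predict(X_test)))

print("MAE per seed:", [round(m, 3) for m in maes])
print(f"min = {min(maes):.3f}, max = {max(maes):.3f}")
```

The spread between the minimum and maximum gives a rough sense of how consistent the results are across seeds; a proper search (for example with GridSearchCV) would be the next step if these values do not help.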

Perhaps you could share the rest of your code?

Answered By - eschibli

This answer, collected from Stack Overflow and tested by PythonFixing community admins, is licensed under CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.
