Tuesday, November 2, 2021

[FIXED] Is it possible to use both MLPRegressor() and make_pipeline() within GridSearchCV()?

Labels: python, scikit-learn

Issue

I'm trying to use make_pipeline() from scikit-learn along with GridSearchCV(). The pipeline is simple and includes only two steps, a StandardScaler() and an MLPRegressor(). The GridSearchCV() is also simple, with the slight wrinkle that I'm using TimeSeriesSplit() for cross-validation.

The error I'm getting is as follows:

ValueError: Invalid parameter MLPRegressor for estimator Pipeline(steps=[('standardscaler', StandardScaler()),('mlpregressor', MLPRegressor())]). Check the list of available parameters with estimator.get_params().keys().

Can someone help me understand how to fix this so that I can use make_pipeline() with both GridSearchCV() and MLPRegressor()?

In [1]: from sklearn.neural_network import MLPRegressor
   ...: from sklearn.preprocessing import StandardScaler
   ...: from sklearn.model_selection import TimeSeriesSplit, GridSearchCV
   ...: from sklearn.pipeline import make_pipeline
   ...: import numpy as np

In [2]: tscv = TimeSeriesSplit(n_splits = 5)

In [3]: pipe = make_pipeline(StandardScaler(), MLPRegressor())

In [4]: param_grid = {'MLPRegressor__hidden_layer_sizes': [(16,16,), (64,64,), (
   ...: 128,128,)], 'MLPRegressor__activation': ['identity', 'logistic', 'tanh',
   ...:  'relu'],'MLPRegressor__solver': ['adam','sgd']}

In [5]: grid = GridSearchCV(pipe, param_grid = param_grid, cv = tscv)

In [6]: features = np.random.random([1000,10])

In [7]: target = np.random.normal(0,10,1000)

In [8]: grid.fit(features, target)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-8-7233f9f2005e> in <module>
----> 1 grid.fit(features, target)

~/opt/miniconda3/envs/practice/lib/python3.9/site-packages/sklearn/utils/validation.py in inner_f(*args, **kwargs)
     61             extra_args = len(args) - len(all_args)
     62             if extra_args <= 0:
---> 63                 return f(*args, **kwargs)
     64
     65             # extra_args > 0

~/opt/miniconda3/envs/practice/lib/python3.9/site-packages/sklearn/model_selection/_search.py in fit(self, X, y, groups, **fit_params)
    839                 return results
    840
--> 841             self._run_search(evaluate_candidates)
    842
    843             # multimetric is determined here because in the case of a callable

~/opt/miniconda3/envs/practice/lib/python3.9/site-packages/sklearn/model_selection/_search.py in _run_search(self, evaluate_candidates)
   1294     def _run_search(self, evaluate_candidates):
   1295         """Search all candidates in param_grid"""
-> 1296         evaluate_candidates(ParameterGrid(self.param_grid))
   1297
   1298

~/opt/miniconda3/envs/practice/lib/python3.9/site-packages/sklearn/model_selection/_search.py in evaluate_candidates(candidate_params, cv, more_results)
    793                               n_splits, n_candidates, n_candidates * n_splits))
    794
--> 795                 out = parallel(delayed(_fit_and_score)(clone(base_estimator),
    796                                                        X, y,
    797                                                        train=train, test=test,

~/opt/miniconda3/envs/practice/lib/python3.9/site-packages/joblib/parallel.py in __call__(self, iterable)
   1039             # remaining jobs.
   1040             self._iterating = False
-> 1041             if self.dispatch_one_batch(iterator):
   1042                 self._iterating = self._original_iterator is not None
   1043

~/opt/miniconda3/envs/practice/lib/python3.9/site-packages/joblib/parallel.py in dispatch_one_batch(self, iterator)
    857                 return False
    858             else:
--> 859                 self._dispatch(tasks)
    860                 return True
    861

~/opt/miniconda3/envs/practice/lib/python3.9/site-packages/joblib/parallel.py in _dispatch(self, batch)
    775         with self._lock:
    776             job_idx = len(self._jobs)
--> 777             job = self._backend.apply_async(batch, callback=cb)
    778             # A job can complete so quickly than its callback is
    779             # called before we get here, causing self._jobs to

~/opt/miniconda3/envs/practice/lib/python3.9/site-packages/joblib/_parallel_backends.py in apply_async(self, func, callback)
    206     def apply_async(self, func, callback=None):
    207         """Schedule a func to be run"""
--> 208         result = ImmediateResult(func)
    209         if callback:
    210             callback(result)

~/opt/miniconda3/envs/practice/lib/python3.9/site-packages/joblib/_parallel_backends.py in __init__(self, batch)
    570         # Don't delay the application, to avoid keeping the input
    571         # arguments in memory
--> 572         self.results = batch()
    573
    574     def get(self):

~/opt/miniconda3/envs/practice/lib/python3.9/site-packages/joblib/parallel.py in __call__(self)
    260         # change the default number of processes to -1
    261         with parallel_backend(self._backend, n_jobs=self._n_jobs):
--> 262             return [func(*args, **kwargs)
    263                     for func, args, kwargs in self.items]
    264

~/opt/miniconda3/envs/practice/lib/python3.9/site-packages/joblib/parallel.py in <listcomp>(.0)
    260         # change the default number of processes to -1
    261         with parallel_backend(self._backend, n_jobs=self._n_jobs):
--> 262             return [func(*args, **kwargs)
    263                     for func, args, kwargs in self.items]
    264

~/opt/miniconda3/envs/practice/lib/python3.9/site-packages/sklearn/utils/fixes.py in __call__(self, *args, **kwargs)
    220     def __call__(self, *args, **kwargs):
    221         with config_context(**self.config):
--> 222             return self.function(*args, **kwargs)

~/opt/miniconda3/envs/practice/lib/python3.9/site-packages/sklearn/model_selection/_validation.py in _fit_and_score(estimator, X, y, scorer, train, test, verbose, parameters, fit_params, return_train_score, return_parameters, return_n_test_samples, return_times, return_estimator, split_progress, candidate_progress, error_score)
    584             cloned_parameters[k] = clone(v, safe=False)
    585
--> 586         estimator = estimator.set_params(**cloned_parameters)
    587
    588     start_time = time.time()

~/opt/miniconda3/envs/practice/lib/python3.9/site-packages/sklearn/pipeline.py in set_params(self, **kwargs)
    148         self
    149         """
--> 150         self._set_params('steps', **kwargs)
    151         return self
    152

~/opt/miniconda3/envs/practice/lib/python3.9/site-packages/sklearn/utils/metaestimators.py in _set_params(self, attr, **params)
     52                 self._replace_estimator(attr, name, params.pop(name))
     53         # 3. Step parameters and other initialisation arguments
---> 54         super().set_params(**params)
     55         return self
     56

~/opt/miniconda3/envs/practice/lib/python3.9/site-packages/sklearn/base.py in set_params(self, **params)
    228             key, delim, sub_key = key.partition('__')
    229             if key not in valid_params:
--> 230                 raise ValueError('Invalid parameter %s for estimator %s. '
    231                                  'Check the list of available parameters '
    232                                  'with `estimator.get_params().keys()`.' %

ValueError: Invalid parameter MLPRegressor for estimator Pipeline(steps=[('standardscaler', StandardScaler()),
                ('mlpregressor', MLPRegressor())]). Check the list of available parameters with `estimator.get_params().keys()`.

Solution

Yes. Build the pipeline first, then treat the pipeline as your model and pass it to GridSearchCV.

The problem is in the param_grid keys: the step prefix is mislabeled. make_pipeline() names each step automatically after the lowercased class name, so the MLPRegressor step is named mlpregressor, not MLPRegressor. A parameter of a pipeline step is addressed as <step name>__<parameter name>.

The Fix:

Replace MLPRegressor__ with mlpregressor__ in every key of param_grid.
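As a minimal sketch of the naming rule, the snippet below prints the step names that make_pipeline() generates. It also shows an alternative that is not part of the original answer: building the same pipeline with Pipeline and explicit step names, so the prefix is one you chose yourself. The names 'scaler' and 'mlp' are illustrative assumptions.

```python
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import Pipeline, make_pipeline
from sklearn.preprocessing import StandardScaler

# make_pipeline() derives each step name from the lowercased class name.
auto_pipe = make_pipeline(StandardScaler(), MLPRegressor())
auto_names = [name for name, _ in auto_pipe.steps]
print(auto_names)  # ['standardscaler', 'mlpregressor']

# Alternative: pick the step names yourself ('scaler' and 'mlp' are hypothetical).
named_pipe = Pipeline([("scaler", StandardScaler()), ("mlp", MLPRegressor())])

# Step parameters are addressed as <step name>__<parameter name>.
named_pipe.set_params(mlp__solver="sgd")
print(named_pipe.named_steps["mlp"].solver)  # sgd
```

With explicit names, the param_grid keys would start with mlp__ instead of mlpregressor__.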

You may run and check it in this colab notebook.

# INCORRECT
param_grid = {
    'MLPRegressor__hidden_layer_sizes': [(16, 16,), (64, 64,), (128, 128,)], 
    'MLPRegressor__activation': ['identity', 'logistic', 'tanh', 'relu'],
    'MLPRegressor__solver': ['adam', 'sgd'],
}

# CORRECTED
param_grid = {
    'mlpregressor__hidden_layer_sizes': [(16, 16,), (64, 64,), (128, 128,)], 
    'mlpregressor__activation': ['identity', 'logistic', 'tanh', 'relu'],
    'mlpregressor__solver': ['adam', 'sgd'],
}
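For reference, here is a hedged end-to-end sketch of the corrected setup. To keep the run short it uses a smaller dataset, fewer splits, a reduced grid, and a low max_iter; those reductions, the random seeds, and the suppressed convergence warnings are assumptions of this sketch, not part of the original question.

```python
import warnings

import numpy as np
from sklearn.exceptions import ConvergenceWarning
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Small random data, as in the question but with fewer rows (assumption).
rng = np.random.default_rng(0)
features = rng.random((200, 10))
target = rng.normal(0, 10, 200)

# max_iter is kept low only so the sketch finishes quickly.
pipe = make_pipeline(StandardScaler(), MLPRegressor(max_iter=50, random_state=0))

# Keys use the lowercase step name generated by make_pipeline().
param_grid = {
    "mlpregressor__hidden_layer_sizes": [(16, 16)],
    "mlpregressor__activation": ["tanh", "relu"],
    "mlpregressor__solver": ["adam"],
}

grid = GridSearchCV(pipe, param_grid=param_grid, cv=TimeSeriesSplit(n_splits=3))

# The low max_iter triggers ConvergenceWarning; silence it for this demo only.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", category=ConvergenceWarning)
    grid.fit(features, target)

print(sorted(grid.best_params_))
```

With the lowercase prefix, fit() runs without the ValueError, and best_params_ reports the same mlpregressor__ keys that were passed in.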

Note

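The key to understanding what went wrong is to read the last two lines of the traceback. They list the pipeline's actual step names, 'standardscaler' and 'mlpregressor', and point to estimator.get_params().keys() for the full list of valid parameter names.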

ValueError: Invalid parameter MLPRegressor for estimator Pipeline(steps=[('standardscaler', StandardScaler()),
                ('mlpregressor', MLPRegressor())]). Check the list of available parameters with `estimator.get_params().keys()`.
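The check that the error message suggests can be sketched as follows. The helper invalid_grid_keys is a hypothetical name introduced here for illustration; it only compares the param_grid keys against estimator.get_params().keys() before any fitting happens.

```python
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def invalid_grid_keys(estimator, param_grid):
    """Return the param_grid keys that the estimator does not expose."""
    valid = estimator.get_params().keys()
    return sorted(key for key in param_grid if key not in valid)


pipe = make_pipeline(StandardScaler(), MLPRegressor())

bad_grid = {"MLPRegressor__solver": ["adam", "sgd"]}
good_grid = {"mlpregressor__solver": ["adam", "sgd"]}

print(invalid_grid_keys(pipe, bad_grid))   # ['MLPRegressor__solver']
print(invalid_grid_keys(pipe, good_grid))  # []

# List the tunable parameters of the MLP step by their full pipeline names.
mlp_keys = sorted(k for k in pipe.get_params() if k.startswith("mlpregressor__"))
print(mlp_keys)
```

Running a check like this before grid.fit() surfaces a mistyped prefix immediately instead of deep inside the search.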

Answered By - CypherX

This answer was collected from Stack Overflow and tested by PythonFixing community admins. It is licensed under CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.
