Monday, August 15, 2022

[FIXED] GridSearchCV auto fill of params in hyperparameter tuning

Tags: hyperparameters, machine-learning, scikit-learn

Issue

Is there a way to do hyperparameter tuning with GridSearchCV without defining each parameter of a classifier/regressor, like an automatic hyperparameter tuning command? In the documentation I found ParameterGrid, but I did not fully understand what it is for.

Solution

In scikit-learn, you need to define both:

which hyperparameters you want to tune
which values or distributions you want to test for each hyperparameter

These are defined with a dictionary such as param_grid = {'C': [1, 10], 'kernel': ['linear', 'rbf']}, where the keys are the hyperparameters to tune and each value is a list of values to test. When you pass this dictionary to GridSearchCV, it automatically builds a grid of all possible hyperparameter combinations, using ParameterGrid. For example:

from sklearn.model_selection import ParameterGrid
param_grid = {'C': [1, 10], 'kernel': ['linear', 'rbf']}
# ParameterGrid expands the dictionary into every combination of values,
# so this comparison evaluates to True
list(ParameterGrid(param_grid)) == (
   [{'C': 1, 'kernel': 'linear'}, {'C': 1, 'kernel': 'rbf'},
    {'C': 10, 'kernel': 'linear'}, {'C': 10, 'kernel': 'rbf'}])

This is the list of combinations of hyperparameters that are tested in the grid search. See also this example about how to use GridSearchCV, or the Automatic parameter searches section of the excellent Getting-started guide.
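To make the link between the dictionary and the search concrete, here is a minimal sketch of passing the same param_grid to GridSearchCV. The iris dataset, the SVC estimator and cv=3 are assumptions chosen only to keep the example small; they are not part of the original answer.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Assumption: a small built-in dataset, used only for illustration
X, y = load_iris(return_X_y=True)

# Same grid as above: 2 values of C x 2 kernels
param_grid = {'C': [1, 10], 'kernel': ['linear', 'rbf']}

# GridSearchCV cross-validates the estimator on every combination in the grid
search = GridSearchCV(SVC(), param_grid, cv=3)
search.fit(X, y)

# One candidate per combination produced by ParameterGrid
n_candidates = len(search.cv_results_['params'])
print(n_candidates)  # 4

# Best combination found and its mean cross-validated score
print(search.best_params_)
print(search.best_score_)
```

The four candidates are exactly the four dictionaries listed by ParameterGrid; the search still only explores the values you put in the dictionary.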

If you don't want to define yourself which hyperparameters to tune, or which values to test, you need an external definition of reasonable hyperparameters and values that would work for any dataset. For example, you can take a look at "auto-ML" packages:

auto-sklearn: An automated machine learning toolkit and a drop-in replacement for a scikit-learn estimator.
autoviml: Automatically builds multiple machine learning models with a single line of code. Designed as a faster way to use scikit-learn models without having to preprocess data.
TPOT: An automated machine learning toolkit that optimizes a series of scikit-learn operators to design a machine learning pipeline, including data and feature preprocessors as well as the estimators. Works as a drop-in replacement for a scikit-learn estimator.
Featuretools: A framework to perform automated feature engineering. It can be used for transforming temporal and relational datasets into feature matrices for machine learning.
Neuraxle: A library for building neat pipelines, providing the right abstractions to ease research, development, and deployment of machine learning applications. Compatible with deep learning frameworks and the scikit-learn API, it can stream minibatches, use data checkpoints, build funky pipelines, and serialize models with custom per-step savers.
EvalML: An AutoML library which builds, optimizes, and evaluates machine learning pipelines using domain-specific objective functions. It incorporates multiple modeling libraries under one API, and the objects that EvalML creates use an sklearn-compatible API.
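As a rough illustration of what an "external definition" of hyperparameters means, the sketch below keeps a lookup table of predefined grids and selects one by estimator type. DEFAULT_GRIDS and auto_grid_search are hypothetical names, and the grid values are arbitrary assumptions, not recommendations from scikit-learn or from any of the packages above. Only get_params() and GridSearchCV are real scikit-learn API here.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# get_params() lists every parameter an estimator accepts,
# but it does not say which values are worth trying
tunable = sorted(SVC().get_params())

# HYPOTHETICAL defaults: illustrative values only, not tuned for any dataset
DEFAULT_GRIDS = {
    SVC: {'C': [0.1, 1, 10], 'kernel': ['linear', 'rbf']},
    RandomForestClassifier: {'n_estimators': [50, 100], 'max_depth': [None, 5]},
}

def auto_grid_search(estimator, X, y, cv=3):
    """Hypothetical helper: run GridSearchCV with a predefined grid."""
    try:
        param_grid = DEFAULT_GRIDS[type(estimator)]
    except KeyError:
        raise ValueError(
            f"No default grid defined for {type(estimator).__name__}")
    # fit() returns the fitted GridSearchCV object itself
    return GridSearchCV(estimator, param_grid, cv=cv).fit(X, y)

# Assumption: a small built-in dataset, used only for illustration
X, y = load_iris(return_X_y=True)
search = auto_grid_search(SVC(), X, y)
print(search.best_params_)
```

The auto-ML packages listed above go much further than a fixed table like this (model selection, preprocessing, smarter search strategies), so treat this only as a way to see where the "automatic" values have to come from.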

Answered By - TomDLT

This answer, collected from Stack Overflow and tested by the PythonFixing community admins, is licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
