Saturday, January 29, 2022

[FIXED] TypeError: estimator should be an estimator implementing 'fit' method

Labels: pandas, python, scikit-learn

Issue

I am solving a problem from Stepik:

One tree is good, but where is the guarantee that it is the best one, or at least close to it? One way to find a more or less optimal set of tree parameters is to iterate over a set of trees with different parameters and choose the most suitable one. For this purpose there is the GridSearchCV class, which iterates over every combination of the parameters specified for the model, trains the model on the data and performs cross-validation. After that, the model with the best parameters is stored in the .best_estimator_ attribute.

Now the task is to iterate over all the trees on the iris data with the following parameters:

- maximum depth: from 1 to 10 levels
- minimum number of samples required for a split: from 2 to 10
- minimum number of samples per leaf: from 1 to 10

and store the best tree in the variable best_tree. Name the GridSearchCV variable search.

Here is my solution:

import pandas as pd
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.datasets import load_iris


iris = load_iris()
X = iris.data
y = iris.target

parameters = {'max_depth': range(1, 10), 'min_samples_split': range(2, 10), 'min_samples_leaf': range(1, 10)}
search = GridSearchCV(iris, parameters)

search.fit(X, y)

best_tree = search.estimator

Why am I getting this error?

Traceback (most recent call last):
  File "jailed_code", line 22, in <module>
    search.fit(X, y)
  File "/home/stepic/instances/master-plugins/sandbox/python3/lib/python3.6/site-packages/sklearn/model_selection/_search.py", line 595, in fit
    self.estimator, scoring=self.scoring)
  File "/home/stepic/instances/master-plugins/sandbox/python3/lib/python3.6/site-packages/sklearn/metrics/scorer.py", line 342, in _check_multimetric_scoring
    scorers = {"score": check_scoring(estimator, scoring=scoring)}
  File "/home/stepic/instances/master-plugins/sandbox/python3/lib/python3.6/site-packages/sklearn/metrics/scorer.py", line 274, in check_scoring
    "'fit' method, %r was passed" % estimator)
TypeError: estimator should be an estimator implementing 'fit' method, {'data': array([[5.1, 3.5, 1.4, 0.2],
       [4.9, 3. , 1.4, 0.2],
       [4.7, 3.2, 1.3, 0.2],
       [4.6, 3.1, 1.5, 0.2],
       [5. , 3.6, 1.4, 0.2],
       [5.4, 3.9, 1.7, 0.4],
       [4.6, 3.4, 1.4, 0.3],
       [5. , 3.4, 1.5, 0.2],
       ...

Solution

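You passed the dataset (iris) instead of an estimator as the first argument of GridSearchCV. That argument must be a model object with a fit method, such as DecisionTreeClassifier(). Note also that after fitting, the best tree is available as search.best_estimator_; search.estimator is only the unfitted model that was passed in. If you haven't already, take a look at the documentation: https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.GridSearchCV.html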
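A quick way to see the difference is to check both objects for a fit method. This is only a minimal sketch; the variable names dataset_has_fit and estimator_has_fit are made up for illustration and are not part of the original question or answer.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()

# load_iris() returns a dict-like Bunch that holds the data; it is not a model.
dataset_has_fit = hasattr(iris, "fit")

# An estimator such as DecisionTreeClassifier does provide a fit method.
estimator_has_fit = hasattr(DecisionTreeClassifier(), "fit")

print(type(iris).__name__, dataset_has_fit)
print(type(DecisionTreeClassifier()).__name__, estimator_has_fit)
```

GridSearchCV performs a similar check on its first argument, which is why passing iris raises the TypeError shown in the traceback.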

The following corrected script should work:

from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.datasets import load_iris

iris = load_iris()
X = iris.data
y = iris.target

# range() excludes its upper bound, so use 11 to include 10
parameters = {'max_depth': range(1, 11),
              'min_samples_split': range(2, 11),
              'min_samples_leaf': range(1, 11)}

# The first argument must be an estimator, not the dataset
search = GridSearchCV(estimator=DecisionTreeClassifier(),
                      param_grid=parameters)

search.fit(X, y)

# The best tree found by the search, refitted on all the data
best_tree = search.best_estimator_
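Once the search has been fitted, the winning model and its settings can be read from the fitted object. The sketch below is hedged: it uses a deliberately small grid (small_grid, an assumption made here so that it runs quickly, not the grid from the task) and fixes random_state only to make repeated runs comparable.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Deliberately small grid (illustrative assumption) so the search finishes fast.
small_grid = {"max_depth": range(1, 4), "min_samples_leaf": range(1, 3)}

# random_state is fixed only so that repeated runs give comparable results.
search = GridSearchCV(DecisionTreeClassifier(random_state=0), small_grid)
search.fit(X, y)

best_tree = search.best_estimator_   # tree refitted with the best parameters
best_params = search.best_params_    # dict of the winning parameter values
best_score = search.best_score_      # mean cross-validated score of the winner

print(best_params, round(best_score, 3))
```

With the default refit=True, best_estimator_ is already trained on the full data, so best_tree can be used for prediction directly.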

Answered By - Braden Anderson

This answer was collected from Stack Overflow and tested by the PythonFixing community admins; it is licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
