Wednesday, June 29, 2022

[FIXED] AttributeError: 'RandomForestClassifier' object has no attribute 'estimators_' when adding estimator to DataFrame

June 29, 2022 pandas, python, scikit-learn

Issue

I am trying to store several estimators in a pandas DataFrame, and I keep running into this error:

AttributeError: 'RandomForestClassifier' object has no attribute 'estimators_'

Initially, I thought this was because pandas was trying to copy the estimator to several rows. However, I was able to reproduce the error with the following code:

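import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Raises the AttributeError shown above
pd.DataFrame({
    "foo" : "bar",
    "model" : RandomForestClassifier()
})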

I also tried saving the estimator class in a dictionary and instantiating it in the DataFrame as seen below:

d = {"rf" : RandomForestClassifier}
pd.DataFrame({
    "foo" : "bar",
    "model" : d["rf"](random_state=100)
})

Yet I still get the same error. My thinking is that if there is a way to do this for a single entry, I will be able to scale it up. Does anyone have any ideas?

Edit to include stack trace:

  File "Local\Temp\ipykernel_27224\3809885946.py", line 7, in <cell line: 5>
    pd.DataFrame({
  File "Local\Programs\Python\Python310\lib\site-packages\pandas\core\frame.py", line 636, in __init__
    mgr = dict_to_mgr(data, index, columns, dtype=dtype, copy=copy, typ=manager)
  File "Local\Programs\Python\Python310\lib\site-packages\pandas\core\internals\construction.py", line 502, in dict_to_mgr
    return arrays_to_mgr(arrays, columns, index, dtype=dtype, typ=typ, consolidate=copy)
  File "Programs\Python\Python310\lib\site-packages\pandas\core\internals\construction.py", line 120, in arrays_to_mgr
    index = _extract_index(arrays)
  File "Local\Programs\Python\Python310\lib\site-packages\pandas\core\internals\construction.py", line 659, in _extract_index
    raw_lengths.append(len(val))
  File "Local\Programs\Python\Python310\lib\site-packages\sklearn\ensemble\_base.py", line 195, in __len__
    return len(self.estimators_)
AttributeError: 'RandomForestClassifier' object has no attribute 'estimators_'

Solution

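The problem is that pandas tries to expand each value of your dictionary into the values for multiple rows. To do that it checks the len of each value, and RandomForestClassifier defines a __len__ method that returns the number of fitted estimators (i.e. len(self.estimators_)). An unfitted forest has no estimators_ attribute yet, so the len call raises the AttributeError seen at the bottom of the stack trace.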
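The behaviour of len can be checked without pandas at all. The following is a minimal sketch; the toy data X and y and the choice of n_estimators=5 are assumptions made only for illustration.

```python
from sklearn.ensemble import RandomForestClassifier

unfitted = RandomForestClassifier(n_estimators=5, random_state=0)

# Before fit(), the estimators_ attribute does not exist, so len() fails
# with the same AttributeError that pandas runs into.
try:
    len(unfitted)
    len_error = None
except AttributeError as exc:
    len_error = exc

# Toy data, made up for this sketch only.
X = [[0, 0], [1, 1], [0, 1], [1, 0]]
y = [0, 1, 0, 1]

# After fit(), len() reports the number of fitted trees.
fitted = RandomForestClassifier(n_estimators=5, random_state=0).fit(X, y)
n_trees = len(fitted)
```

Here len_error holds the AttributeError for the unfitted forest, and n_trees equals the n_estimators value passed to the fitted one.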

In your one-row case, you can just wrap everything as singleton lists:

pd.DataFrame({
    "foo": ["bar"],
    "model": [RandomForestClassifier()],
})
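The same idea should scale to several rows: pass each column as a list with one element per row, so pandas takes the len of the list instead of the len of an estimator. This is a sketch under assumptions: the estimator_classes dictionary and the "name" column are hypothetical names, and it only covers unfitted estimators like the ones in the question.

```python
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier

# Hypothetical registry of estimator classes, in the spirit of the
# dictionary used in the question.
estimator_classes = {
    "rf": RandomForestClassifier,
    "gb": GradientBoostingClassifier,
}

# Each column is a list with one entry per row, so pandas measures the
# length of the list rather than calling len() on an estimator.
df = pd.DataFrame({
    "name": list(estimator_classes),
    "model": [cls(random_state=100) for cls in estimator_classes.values()],
})
```

Each row of df then holds one unfitted estimator in the "model" column, which can be retrieved with something like df.loc[0, "model"].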

Answered By - Ben Reiniger

This answer was collected from Stack Overflow and tested by PythonFixing community admins. It is licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
