Issue
I am trying to store several estimators in a pandas DataFrame, and I keep running into this error:
AttributeError: 'RandomForestClassifier' object has no attribute 'estimators_'
Initially, I thought this happened because pandas was trying to copy the estimator to several rows. However, I was able to reproduce the error with the following code:
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

pd.DataFrame({
    "foo": "bar",
    "model": RandomForestClassifier()
})
I also tried storing the estimator class in a dictionary and instantiating it inside the DataFrame, as seen below:
d = {"rf": RandomForestClassifier}
pd.DataFrame({
    "foo": "bar",
    "model": d["rf"](random_state=100)
})
Yet I still get the same error. My thinking is that if there is a way to do this for a single entry, I will be able to scale it up. Does anyone have any ideas?
Edit to include stack trace:
File "Local\Temp\ipykernel_27224\3809885946.py", line 7, in <cell line: 5>
pd.DataFrame({
File "Local\Programs\Python\Python310\lib\site-packages\pandas\core\frame.py", line 636, in __init__
mgr = dict_to_mgr(data, index, columns, dtype=dtype, copy=copy, typ=manager)
File "Local\Programs\Python\Python310\lib\site-packages\pandas\core\internals\construction.py", line 502, in dict_to_mgr
return arrays_to_mgr(arrays, columns, index, dtype=dtype, typ=typ, consolidate=copy)
File "Programs\Python\Python310\lib\site-packages\pandas\core\internals\construction.py", line 120, in arrays_to_mgr
index = _extract_index(arrays)
File "Local\Programs\Python\Python310\lib\site-packages\pandas\core\internals\construction.py", line 659, in _extract_index
raw_lengths.append(len(val))
File "Local\Programs\Python\Python310\lib\site-packages\sklearn\ensemble\_base.py", line 195, in __len__
return len(self.estimators_)
AttributeError: 'RandomForestClassifier' object has no attribute 'estimators_'
Solution
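The problem is that pandas tries to expand each value of your dictionary into values for multiple rows. To work out how many rows there are, it checks the len of each list-like value, and RandomForestClassifier defines a __len__ method that returns the number of fitted estimators (i.e. len(self.estimators_)). An unfitted model has no estimators_ attribute yet, so that call raises the AttributeError shown at the bottom of your stack trace.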
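As a quick check of that explanation, the sketch below calls len on an unfitted estimator directly, outside of any DataFrame. The variable names are only illustrative, and the behaviour is assumed to match the scikit-learn and pandas versions in the stack trace.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

model = RandomForestClassifier()

# __len__ on the ensemble returns len(self.estimators_), and estimators_
# is only created by fit(), so len() fails on an unfitted model.
try:
    len(model)
    len_error = None
except AttributeError as exc:
    len_error = str(exc)

print(len_error)

# The estimator is iterable, so pandas classifies it as list-like and
# asks for its length when working out the number of rows.
looks_list_like = pd.api.types.is_list_like(model)
print(looks_list_like)
```

If the message printed here matches the one in the stack trace, the error comes from the estimator itself rather than from anything specific to the DataFrame constructor.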
In your one-row case, you can just wrap everything as singleton lists:
pd.DataFrame({
    "foo": ["bar"],
    "model": [RandomForestClassifier()],
})
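The same idea should carry over to several rows: give pandas one list per column, with one entry per row, so that it measures the length of the list rather than of any single estimator. The following is a minimal sketch under that assumption. The registry dictionary, the column names, and the choice of LogisticRegression as a second model are hypothetical and not part of the original question.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

# Hypothetical registry of estimator classes, similar to the dict in the question.
registry = {"rf": RandomForestClassifier, "logreg": LogisticRegression}

# Each column is a plain list with one entry per row, so pandas takes
# the row count from the lists and stores the estimators as objects.
df = pd.DataFrame({
    "name": list(registry),
    "model": [cls(random_state=100) for cls in registry.values()],
})

print(df["model"].dtype)
print(type(df.loc[0, "model"]).__name__)
```

The estimators here are unfitted, as in the question. Storing already fitted ensembles this way has not been checked in this sketch.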
Answered By - Ben Reiniger