Thursday, June 16, 2022

[FIXED] NotFittedError - Titanic Project Kaggle

Tags: jupyter, jupyter-notebook, kaggle, python

Issue

I am working through different machine learning projects on Kaggle to improve my skills. Here is the model that I am using:

from sklearn.ensemble import RandomForestClassifier

y = train_data["Survived"]

features = ["Pclass", "Sex", "SibSp", "Parch"]
X = pd.get_dummies(train_data[features])
X_test = pd.get_dummies(test_data[features])

model = RandomForestClassifier(n_estimators = 100, max_depth = 5, random_state = 1)
model.fit = (X, y)
predictions = model.predict(X_test)

output = pd.DataFrame({'PassengerId': test_data.PassengerId, 'Survived': predictions})
output.to_csv('submission.csv', index = False)
print('Your submission was successfully saved!')

Here is the error I get:

---------------------------------------------------------------------------
NotFittedError                            Traceback (most recent call last)
/tmp/ipykernel_33/1528591149.py in <module>
      9 forest_clf = RandomForestClassifier(n_estimators = 100, max_depth = 5, random_state = 1)
     10 forest_clf.fit = (X, y)
---> 11 predictions = forest_clf.predict(X_test)
     12 
     13 output = pd.DataFrame({'PassengerId': test_data.PassengerId, 'Survived': predictions})

/opt/conda/lib/python3.7/site-packages/sklearn/ensemble/_forest.py in predict(self, X)
    806             The predicted classes.
    807         """
--> 808         proba = self.predict_proba(X)
    809 
    810         if self.n_outputs_ == 1:

/opt/conda/lib/python3.7/site-packages/sklearn/ensemble/_forest.py in predict_proba(self, X)
    846             classes corresponds to that in the attribute :term:`classes_`.
    847         """
--> 848         check_is_fitted(self)
    849         # Check data
    850         X = self._validate_X_predict(X)

/opt/conda/lib/python3.7/site-packages/sklearn/utils/validation.py in check_is_fitted(estimator, attributes, msg, all_or_any)
   1220 
   1221     if not fitted:
-> 1222         raise NotFittedError(msg % {"name": type(estimator).__name__})
   1223 
   1224 

NotFittedError: This RandomForestClassifier instance is not fitted yet. Call 'fit' with appropriate arguments before using this estimator.

I think this might be a case of the estimator being cloned, but I am not sure which line causes the problem. This is the Titanic project from Kaggle; I copied the tutorial code while trying to learn. Any help is appreciated.

Solution

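As @Blackgaurd pointed out, just change model.fit = (X, y) to model.fit(X, y).

Your current code does not call fit. It assigns the tuple (X, y) to the attribute model.fit, which overwrites the fit method on your Random Forest Classifier instance. The model is therefore never trained, and predict raises NotFittedError.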
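To see the difference between the assignment and the call in isolation, here is a minimal sketch. It uses a tiny made-up dataset (X_demo and y_demo are illustrative names, not the Titanic data) and a smaller forest so that it runs quickly. It assumes scikit-learn is installed.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.exceptions import NotFittedError

# Tiny made-up dataset, only for illustration (not the Titanic data).
X_demo = [[0, 1], [1, 0], [0, 0], [1, 1]]
y_demo = [0, 1, 0, 1]

# Buggy version: "=" stores a tuple under the name "fit" on this
# instance instead of calling the method, so no training happens.
buggy = RandomForestClassifier(n_estimators=10, random_state=1)
buggy.fit = (X_demo, y_demo)
fit_is_tuple = isinstance(buggy.fit, tuple)  # the method is now shadowed

try:
    buggy.predict(X_demo)
    raised = False
except NotFittedError:
    raised = True

# Correct version: calling fit() actually trains the forest.
fixed = RandomForestClassifier(n_estimators=10, random_state=1)
fixed.fit(X_demo, y_demo)
predictions = fixed.predict(X_demo)

print(fit_is_tuple, raised, len(predictions))
```

Python accepts the assignment silently because setting an attribute on an instance is legal, which is why the error only appears later, at the predict call.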

Your full code with the correction:

from sklearn.ensemble import RandomForestClassifier

y = train_data["Survived"]

features = ["Pclass", "Sex", "SibSp", "Parch"]
X = pd.get_dummies(train_data[features])
X_test = pd.get_dummies(test_data[features])

model = RandomForestClassifier(n_estimators = 100, max_depth = 5, random_state = 1)
model.fit(X, y) # <- line of code fixed
predictions = model.predict(X_test)

output = pd.DataFrame({'PassengerId': test_data.PassengerId, 'Survived': predictions})
output.to_csv('submission.csv', index = False)
print('Your submission was successfully saved!')

Answered By - petezurich

This answer was collected from Stack Overflow and tested by PythonFixing community admins. It is licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
