Issue
I'm confused about how to use cross_val_predict with a test data set.
I created a simple Random Forest model and used cross_val_predict to make predictions:
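import pandas as pd
from sklearn.ensemble import RandomForestClassifier
# sklearn.cross_validation was removed in later releases; model_selection replaces it.
from sklearn.model_selection import cross_val_predict, KFold
lr = RandomForestClassifier(random_state=1, class_weight="balanced", n_estimators=25, max_depth=6)
# The old KFold defaulted to 3 folds; current versions accept random_state only with shuffle=True.
kf = KFold(n_splits=3, shuffle=True, random_state=1)
predictions = cross_val_predict(lr, train_df[features_columns], train_df["target"], cv=kf)
predictions = pd.Series(predictions)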
I'm confused about the next step. How do I use what was learned above to make predictions on the test data set?
Solution
As @DmitryPolonskiy commented, the model has to be trained (with the fit method) before it can be used to predict.
# Train the model (a.k.a. `fit` training data to it).
lr.fit(train_df[features_columns], train_df["target"])
# Use the model to make predictions based on testing data.
y_pred = lr.predict(test_df[features_columns])
# Compare the predicted y values to actual y values.
accuracy = (y_pred == test_df["target"]).mean()
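cross_val_predict is a cross-validation helper: for each training row it returns a prediction made by a copy of the model that was fitted on the other folds, which lets you estimate the accuracy of your model. It does not leave lr itself fitted, which is why the explicit fit call above is needed before predicting on the test data. Take a look at sklearn's cross-validation page.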
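To see both steps side by side, here is a minimal sketch on synthetic data. It is not the asker's data: the make_classification dataset and the names X_train, X_test, oof_pred and model are assumptions for illustration, standing in for train_df, test_df and lr.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import KFold, cross_val_predict, train_test_split

# Synthetic stand-in for train_df / test_df (assumption, for illustration only).
X, y = make_classification(n_samples=300, n_features=8, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=1
)

model = RandomForestClassifier(
    random_state=1, class_weight="balanced", n_estimators=25, max_depth=6
)
kf = KFold(n_splits=3, shuffle=True, random_state=1)

# Step 1: out-of-fold predictions on the training data. Each row is predicted
# by a clone fitted on the other folds, so `model` itself stays unfitted.
oof_pred = cross_val_predict(model, X_train, y_train, cv=kf)
cv_accuracy = (oof_pred == y_train).mean()

# Step 2: fit once on all of the training data, then predict the test set.
model.fit(X_train, y_train)
test_pred = model.predict(X_test)
test_accuracy = (test_pred == y_test).mean()

print(f"cross-validated accuracy: {cv_accuracy:.3f}")
print(f"test accuracy: {test_accuracy:.3f}")
```

The cross-validated accuracy is only an estimate of how this configuration generalizes. The predictions for the test set come from the single model fitted in step 2.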
Answered By - jakub