Issue
I trained a simple RandomForestRegressor in sklearn:
from sklearn.ensemble import RandomForestRegressor
ran_for = RandomForestRegressor(n_estimators=300, min_samples_split=2,
                                random_state=RND, n_jobs=20, max_depth=8, verbose=2)
ran_for.fit(X_c, y_c)
Then I saved the model via joblib:
from joblib import dump
dump(ran_for, '/content/random_forest_regressor.joblib')
After that I restarted my kernel and loaded the previously saved model:
from joblib import load
my_model = load('/content/random_forest_regressor.joblib')
I tested the saved model on a sample from the same dataset:
pred = my_model.predict(X_test)
And it looks like my saved model is working completely wrong. Here are the unique prediction values and a histogram:
import pandas as pd
import matplotlib.pyplot as plt
print(pd.Series(pred).unique())
plt.figure(figsize = (10, 10))
pd.Series(pred).hist()
plt.show()
[892.52446705 599.29566532 539.45592338 903.74387156 601.12144516]
Am I doing something wrong?
I am running this in Google Colab.
Edit: As suggested in the comments, here are the model's predictions before saving:
pred = ran_for.predict(X_test)
print(pred[:20])
plt.figure(figsize = (10, 10))
pd.Series(pred).hist(bins = 1000).set_xlim([0, 5000])
plt.show()
Here you can see that the model predicts values properly before it is saved.
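One way to check whether the joblib round trip itself changes the predictions is to dump and reload the model within the same session and compare the two prediction arrays. The following is a minimal sketch, not the original setup: the synthetic data from make_regression and the smaller forest are assumptions standing in for X_c, y_c and X_test, which are not shown in the question.

```python
import os
import tempfile

import numpy as np
from joblib import dump, load
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-in data (assumption: replaces X_c, y_c and X_test from the question).
X, y = make_regression(n_samples=500, n_features=10, noise=0.1, random_state=0)
X_train, y_train, X_test = X[:400], y[:400], X[400:]

# A smaller forest than in the question, to keep the check fast.
model = RandomForestRegressor(n_estimators=50, max_depth=8, random_state=0)
model.fit(X_train, y_train)
pred_before = model.predict(X_test)

# Round-trip the fitted model through joblib in a temporary directory.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "random_forest_regressor.joblib")
    dump(model, path)
    reloaded = load(path)

pred_after = reloaded.predict(X_test)

# If serialization works as expected, both arrays should match.
round_trip_ok = np.allclose(pred_before, pred_after)
print("Predictions identical after reload:", round_trip_ok)
```

If this prints True but predictions still differ after a kernel restart, the difference is more likely in the environment or in how X_test is rebuilt than in the saved file.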
Solution
It turns out this was a Google Colab issue. I tried the same steps on my local machine and everything works fine.
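The answer does not say what differed between Colab and the local machine. When the same code behaves differently in two environments, printing the library versions in both places and comparing them can help narrow things down. This is only a diagnostic sketch; environment_report is a hypothetical helper name, not part of any library.

```python
import platform

import joblib
import numpy
import sklearn


def environment_report():
    """Collect the versions most relevant to saving and loading a sklearn model."""
    return {
        "python": platform.python_version(),
        "scikit-learn": sklearn.__version__,
        "joblib": joblib.__version__,
        "numpy": numpy.__version__,
    }


# Run this in both environments (e.g. Colab and local) and compare the output.
report = environment_report()
for name, version in report.items():
    print(f"{name}: {version}")
```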
Answered By - FTSlow