Issue
After looking at the precision_recall_curve, I want to use a threshold of 0.4. How do I apply it to my random forest model (binary classification), so that any probability < 0.4 is labeled 0 and any probability >= 0.4 is labeled 1?
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
random_forest = RandomForestClassifier(n_estimators=100, oob_score=True, random_state=12)
random_forest.fit(X_train, y_train)
predicted = random_forest.predict(X_test)
accuracy = accuracy_score(y_test, predicted)
Documentation: Precision-Recall
Solution
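Assuming you are doing binary classification, it's quite easy. predict() picks the class with the highest averaged probability, which for two classes amounts to a cutoff at 0.5. To use a different cutoff, take the positive-class probabilities from predict_proba() and compare them to your threshold yourself: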
threshold = 0.4
predicted_proba = random_forest.predict_proba(X_test)
# Column 1 holds the probability of random_forest.classes_[1] (the positive class)
predicted = (predicted_proba[:, 1] >= threshold).astype(int)
accuracy = accuracy_score(y_test, predicted)
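For reference, here is a minimal self-contained sketch of the same idea that can be run as-is. It makes a few assumptions that are not in the question: the data is synthetic (generated with make_classification as a stand-in for your own X_train/X_test), and predict_with_threshold is a hypothetical helper name, not part of scikit-learn. It also reports precision and recall at the chosen threshold, since the threshold came from the precision-recall curve.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import precision_score, recall_score
from sklearn.model_selection import train_test_split


def predict_with_threshold(model, X, threshold=0.5):
    """Hypothetical helper: predict the positive class when its probability >= threshold."""
    # Column 1 of predict_proba corresponds to model.classes_[1].
    positive_proba = model.predict_proba(X)[:, 1]
    # Map the boolean mask (0/1) back onto the model's own class labels.
    return model.classes_[(positive_proba >= threshold).astype(int)]


# Synthetic stand-in data (assumption); replace with your own train/test split.
X, y = make_classification(n_samples=500, n_features=10, random_state=12)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=12)

random_forest = RandomForestClassifier(n_estimators=100, random_state=12)
random_forest.fit(X_train, y_train)

predicted_05 = predict_with_threshold(random_forest, X_test, threshold=0.5)
predicted_04 = predict_with_threshold(random_forest, X_test, threshold=0.4)

# Lowering the threshold can only turn negatives into positives,
# so recall cannot go down, while precision may drop.
print("positives at 0.5:", int(predicted_05.sum()))
print("positives at 0.4:", int(predicted_04.sum()))
print("precision at 0.4:", precision_score(y_test, predicted_04))
print("recall at 0.4:", recall_score(y_test, predicted_04))
```

Keep in mind that a threshold picked by looking at the test-set curve is tuned to that test set; choosing it on a separate validation split gives a more honest estimate.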
Answered By - Stev