Issue
from sklearn import ensemble
model = ensemble.RandomForestClassifier(n_estimators=10)
model.fit(x,y)
predictions = model.predict(new)
I know predict() uses predict_proba() to get the predictions, by computing the mean of the predicted class probabilities of the trees in the forest.
I want to get the result of predict_proba() for the class predicted by the predict() method.
What I'm doing is: first call predict() like in the above code, and for the probability I'm extracting the max probability from the trees like so:
all_probabilities = model.predict_proba(new)
class_probabilities = np.array([])
for tree in all_probabilities:
    class_probabilities = np.append(class_probabilities, tree.max())
Is this correct? If not, how can I extract the probability for the predicted class?
Solution
The predict_proba() method returns a two-dimensional array, containing the estimated probabilities for each instance and each class:
import numpy as np
from sklearn.ensemble import RandomForestClassifier
X = np.array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9],
[10, 11, 12]])
y = np.array([0, 0, 1, 1])
model = RandomForestClassifier()
model.fit(X, y)
model.predict_proba(X)
array([[0.91, 0.09],
[0.91, 0.09],
[0.25, 0.75],
[0.05, 0.95]])
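Each row of this array is an average over the individual trees, as the question describes. A minimal sketch to check that, assuming the fitted forest exposes its trees through the estimators_ attribute; the n_estimators and random_state values are additions here, chosen only to keep the run small and reproducible:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

X = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9],
              [10, 11, 12]])
y = np.array([0, 0, 1, 1])

# random_state is fixed only so repeated runs give the same forest
model = RandomForestClassifier(n_estimators=10, random_state=0)
model.fit(X, y)

# Stack the per-tree probability estimates: shape (n_trees, n_samples, n_classes)
tree_probas = np.stack([tree.predict_proba(X) for tree in model.estimators_])

# Averaging over the tree axis should reproduce the forest's own estimate
mean_proba = tree_probas.mean(axis=0)
print(np.allclose(mean_proba, model.predict_proba(X)))
```

Note that each row of the result corresponds to an instance, not to a tree.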
As you note, for each instance the predicted class is the class with the maximum probability. So one simple way to get the estimated probabilities for the predicted classes is to use np.max():
np.max(model.predict_proba(X), axis=1)
array([0.91, 0.91, 0.75, 0.95])
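If you also want to see which class each probability belongs to, the column order of predict_proba() follows model.classes_. A minimal sketch along those lines, reusing the toy data from above; the variable names (best, predicted, predicted_proba) and the fixed random_state are illustrative choices, not part of the original answer:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

X = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9],
              [10, 11, 12]])
y = np.array([0, 0, 1, 1])

# random_state is fixed only so repeated runs give the same forest
model = RandomForestClassifier(n_estimators=10, random_state=0)
model.fit(X, y)

proba = model.predict_proba(X)

# Column index of the highest probability in each row
best = np.argmax(proba, axis=1)

# Map column indices back to class labels; columns follow model.classes_
predicted = model.classes_[best]

# Probability of the predicted class for each instance
predicted_proba = proba[np.arange(len(proba)), best]

for label, p in zip(predicted, predicted_proba):
    print(label, p)
```

This calls predict_proba() once and derives both the labels and their probabilities from the same array, so there is no need to call predict() separately.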
Answered By - Arne