Issue
I have a numpy array of probabilities, where each row gives the probability that the output is a 0 or a 1. I'm trying to split these rows into two separate arrays based on a certainty level of 75%. If either probability is above 75%, the row goes into the 'certain' array; if neither crosses that threshold, it goes into the 'uncertain' array.
For some reason, when I run this code it does not differentiate between the two and adds every instance to the 'certain' array.
Code:
probs = rfc.predict_proba(X_validate)
certain = []
uncertain = []
for i in probs[0:10]:
    zero_val = i[0]
    one_val = i[1]
    if zero_val or one_val > 0.75:
        certain.append(i)
    else:
        uncertain.append(i)
print(len(certain))
print(certain)
print(len(uncertain))
print(uncertain)
Here is the output:
10
[array([0., 1.]), array([1., 0.]), array([0.95, 0.05]), array([0.77, 0.23]), array([0.74, 0.26]), array([0.38, 0.62]), array([0.11, 0.89]), array([1., 0.]), array([0.94, 0.06]), array([0.19, 0.81])]
0
[]
What is causing every instance to be added to the 'certain' array regardless? Thanks!
Solution
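Comparison operators bind more tightly than or, so Python parses zero_val or one_val > 0.75 as zero_val or (one_val > 0.75). In this context that is more or less equivalent to zero_val != 0 or one_val > 0.75, so zero_val is essentially treated as a boolean flag: any nonzero value makes the whole condition true. You need to write zero_val > 0.75 or one_val > 0.75.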
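As a rough sketch of the corrected check, the snippet below hard-codes the ten probability rows shown in the question's output as a stand-in for rfc.predict_proba(X_validate)[0:10] (the classifier and validation data are not available here, so that substitution is an assumption). The name THRESHOLD and the vectorized variant using a boolean mask are illustrative additions, not part of the original answer.

```python
import numpy as np

# Stand-in for rfc.predict_proba(X_validate)[0:10]; values copied from the
# question's printed output (assumption: the real array has this shape).
probs = np.array([
    [0.00, 1.00],
    [1.00, 0.00],
    [0.95, 0.05],
    [0.77, 0.23],
    [0.74, 0.26],
    [0.38, 0.62],
    [0.11, 0.89],
    [1.00, 0.00],
    [0.94, 0.06],
    [0.19, 0.81],
])

THRESHOLD = 0.75  # hypothetical constant name for the 75% certainty level

# Why the original condition misfires: `or` returns its first truthy operand,
# so a nonzero zero_val short-circuits before the comparison is evaluated.
buggy_result = 0.74 or 0.26 > THRESHOLD   # evaluates to 0.74, which is truthy

# Corrected loop: compare each value against the threshold explicitly.
certain = []
uncertain = []
for row in probs:
    zero_val, one_val = row
    if zero_val > THRESHOLD or one_val > THRESHOLD:
        certain.append(row)
    else:
        uncertain.append(row)

# Optional vectorized variant: a row is "certain" when its largest
# probability exceeds the threshold.
mask = probs.max(axis=1) > THRESHOLD
certain_arr = probs[mask]
uncertain_arr = probs[~mask]

print(len(certain), len(uncertain))
```

With these sample rows, only [0.74, 0.26] and [0.38, 0.62] fail the check, so they are the ones that land in 'uncertain'. The mask version gives the same split without a Python-level loop, which may matter if the full validation set is large.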
Answered By - Florian Weimer