Issue
What is the interpretation of the "value" field of each node of a decision tree created with sklearn? I thought the numbers in "value" were supposed to add up to "samples," but as you can see from the image, mine do not. (That is a picture of just one node, but they are all like that.) I know this must have something to do with the class weights I applied, because when I build a decision tree without weighting, the values do add up to the samples. Since 10% of my data is a '1' and 90% is a '0' for the target variable, I assigned class weights of {0: 0.10, 1: 0.90} to compensate for the imbalance in the data. Is it supposed to be the other way around?
Please help me understand how to interpret each node in the decision tree. Thanks!
Solution
The "value" of a node is just the sum of the samples that reach it, each multiplied by its respective weight. In your case, the 254.5 comes from the class weighted at 0.1: there are 2545 samples of that class, and 2545 * 0.1 = 254.5. Similarly, 20 * 0.9 = 18, so there are 20 samples of the class weighted at 0.9. Together that gives 2545 + 20 = 2565 samples, which matches your "samples" count.
In the default case, all sample weights are 1, so "value" sums to the number of samples.
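That arithmetic can be sketched directly (the counts 2545 and 20 and the weights 0.1 and 0.9 come from the node in the question):

```python
import numpy as np

# Per-class sample counts at the node, and the class weights applied to them
counts = np.array([2545, 20])     # samples of class 0 and class 1
weights = np.array([0.1, 0.9])    # class_weight={0: 0.1, 1: 0.9}

# What sklearn reports as "value": weighted counts, one entry per class
value = counts * weights
print(value)          # [254.5  18. ]

# "samples" is the unweighted total
print(counts.sum())   # 2565
```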
I would recommend using integer weights such as {0:1, 1:9} instead, as you should avoid using floats unless necessary.
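A quick check, using the same hypothetical node counts as above, that the integer weights {0:1, 1:9} preserve the same within-node class proportions as {0:0.1, 1:0.9}, since only the ratio between the weights matters:

```python
import numpy as np

counts = np.array([2545, 20])                 # per-class samples at a node
float_value = counts * np.array([0.1, 0.9])   # "value" with {0: 0.1, 1: 0.9}
int_value = counts * np.array([1, 9])         # "value" with {0: 1, 1: 9}

# The class proportions within the node are identical either way,
# so the node's majority class (and thus its prediction) is unchanged.
print(float_value / float_value.sum())
print(int_value / int_value.sum())
```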
Answered By - Kraigolas