Issue
I am trying to do binary classification, and the one class (0) is approximately 1 third of the other class (1). when I run the raw data through a normal feed forward neural network, the accuracy is about 0.78. However, when I implement class_weights, the accuracy drops to about 0.49. The roc curve also seems to do better without the class_weights. Why does this happen, and how can i fix it?
II have already tried changing the model, and implementing regularization, and dropouts, etc. But nothing seems to change the overall accuracy
this is how i get my weights:
class_weights = class_weight.compute_class_weight('balanced', np.unique(y_train), y_train)
class_weight_dict = dict(enumerate(class_weights))
Here is the results without the weights:
Here is with the weights:
I would expect the results to be better with the class_weights but the opposite seems to be true. Even the roc does not seem to do any better with the weights.
Solution
Due to the class imbalance a very weak baseline of always selecting the majority class will get accuracy of approximately 75%.
The validation curve of the network that was trained without class weights appears to show that it is picking a solution close to always selecting the majority class. This can be seen from the network not improving much over the validation accuracy it gets in the 1st epoch.
I would recommend looking into the confusion matrix, precision and recall metrics to get more information about which model is better.
Answered By - Svt
0 comments:
Post a Comment
Note: Only a member of this blog may post a comment.