Issue
I'm trying to get the ROC curve for my neural network. My network uses PyTorch and I'm using sklearn to get the ROC curve. My model outputs a binary prediction (right or wrong) along with the probability of that output.
output = model(batch_X)
_, pred = torch.max(output, dim=1)
I give the model both samples of input data (am I doing this part right, or should it be only one sample of the input data, not both?). I take the probability (the _) and the labels of what both inputs should be, and feed them to sklearn like so:
nn_fpr, nn_tpr, nn_thresholds = roc_curve( "labels go here" , "probability go here" )
Next I plot it with:
plt.plot(nn_fpr,nn_tpr,marker='.')
plt.ylabel('True Positive Rate')
plt.xlabel('False Positive Rate' )
plt.show()
The model comes out very accurate (0.0167% wrong out of 108,000), but I get a concave graph, and I have been told it's normally not supposed to be concave (pictures attached).
I have been using neural nets for a while, but I have never been asked to plot the ROC curve. So my question is: am I doing this right? Also, should it be both labels or just one? All the neural network examples I have seen use Keras, which, if I remember right, has a probability function. So I don't know whether PyTorch outputs the probability in the way sklearn wants it. The other examples I can find aren't for neural networks, and they have a probability function built in.
Solution
The function roc_curve expects an array of true labels, y_true, and an array of probabilities for the positive class, y_score (which usually means class 1). Therefore what you need is not
_, pred = torch.max(output, dim=1)
but simply (if your model outputs probabilities, which is not the default in PyTorch)
probabilities = output[:, 1]
or (if your model outputs logits, which is the common case in PyTorch)
import torch.nn.functional as F
probabilities = F.softmax(output, dim=1)[:, 1]
After that, assuming the array of true labels is called labels and has shape (N,), you call roc_curve as:
y_score = probabilities.detach().cpu().numpy()  # .cpu() is needed if the tensor lives on a GPU
nn_fpr, nn_tpr, nn_thresholds = roc_curve(labels, y_score)
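That way you'll get correct results, which wasn't the case with torch.max: the first value it returns is the score of whichever class was predicted, not the score of the positive class.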
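To see why the torch.max scores distort the curve, here is a minimal standard-library sketch (no PyTorch or sklearn needed to run it). The logits and labels are made up for illustration, and auc is a hypothetical helper that computes the area under the ROC curve as the fraction of (positive, negative) pairs ranked correctly; it is not sklearn's implementation.

```python
import math

def softmax(row):
    """Numerically stable softmax over one row of logits."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def auc(labels, scores):
    """Area under the ROC curve via pairwise ranking (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Made-up logits for a two-class model: column 0 = class 0, column 1 = class 1.
logits = [[3.0, -1.0], [2.0, 0.0], [1.5, -0.5],
          [-1.0, 2.0], [0.0, 3.0], [-0.5, 1.0]]
labels = [0, 0, 0, 1, 1, 1]

probs = [softmax(row) for row in logits]

# Per-row maximum: analogous to the `_` returned by torch.max(output, dim=1).
max_scores = [max(p) for p in probs]
# Probability of the positive class: analogous to softmax(output, dim=1)[:, 1].
pos_scores = [p[1] for p in probs]

auc_max = auc(labels, max_scores)
auc_pos = auc(labels, pos_scores)
print("AUC with max scores:     ", auc_max)
print("AUC with positive scores:", auc_pos)
```

In this toy data the model separates the classes perfectly, so the positive-class scores rank every positive above every negative. The max-based scores only say how confident the model was, so confident negatives end up ranked alongside confident positives and the curve sags.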
As a recommendation: for binary classification I would suggest using a model with a sigmoid at the end and a single output (the probability of the positive class), like:
model = nn.Sequential(nn.Linear(input_dim, hidden_dim),
                      nn.ReLU(),
                      nn.Linear(hidden_dim, 1),
                      nn.Sigmoid())
That way you'll train the model with nn.BCELoss, which expects probabilities (unlike nn.CrossEntropyLoss, which expects logits). The code to get the ROC curve also gets simpler:
probabilities = model(batch_X)
y_score = probabilities.squeeze(-1).detach().cpu().numpy()  # .cpu() is needed if the tensor lives on a GPU
fpr, tpr, threshold = roc_curve(labels, y_score)
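The single-output sigmoid model and the two-output softmax model produce the same kind of score: for two classes, the softmax probability of class 1 equals the sigmoid of the difference between the two logits. The standard-library sketch below checks this on a few made-up logit pairs; sigmoid and softmax_pos are hypothetical helper names, not PyTorch functions.

```python
import math

def sigmoid(z):
    """Logistic function: maps a single logit to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def softmax_pos(z0, z1):
    """Probability of class 1 under a two-class softmax over (z0, z1)."""
    m = max(z0, z1)  # subtract the maximum for numerical stability
    e0, e1 = math.exp(z0 - m), math.exp(z1 - m)
    return e1 / (e0 + e1)

# Made-up (class 0 logit, class 1 logit) pairs for illustration.
logit_pairs = [(3.0, -1.0), (0.0, 0.0), (-0.5, 1.0), (-2.0, 4.0)]

# Largest disagreement between the two formulations across all pairs.
max_gap = max(abs(softmax_pos(z0, z1) - sigmoid(z1 - z0))
              for z0, z1 in logit_pairs)
print("largest difference:", max_gap)
```

The two formulations agree up to floating-point rounding, so switching to one sigmoid output does not change what roc_curve sees; it only removes the need to pick a column.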
Take a look at the gist where a ROC curve is created for a neural network classifier.
Answered By - draw