Issue
I used the pretrained ResNet-50 model from torchvision on the MNIST dataset:
from torch import nn
from torchvision.models import ResNet50_Weights, resnet50

class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.model = resnet50(weights=ResNet50_Weights.DEFAULT)
        self.model.conv1 = nn.Conv2d(1, 64, kernel_size=7, stride=2, padding=3, bias=False)
        num_ftrs = self.model.fc.in_features
        self.model.fc = nn.Linear(num_ftrs, 10)

    def forward(self, x):
        return self.model(x)
It performs very well: after training for 10 epochs it reached 99.895% accuracy on the 50,000 training images.
model.eval()
with torch.no_grad():
    correct = 0
    total = 0
    for images, labels in train_loader:
        outputs = model(images)
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()
    print('Accuracy of the network on the {} train images: {} %'.format(50000, 100 * correct / total))
[out]: Accuracy of the network on the 50000 train images: 99.895 %
I used pygame to draw my own digits as input for the model. I started with a very basic program that places circles while the left mouse button is held, then saves the generated image as a PNG.
if event.type == pg.MOUSEMOTION:
    if (drawing):
        mouse_position = pg.mouse.get_pos()
        pg.draw.circle(screen, color, mouse_position, w)
elif event.type == pg.MOUSEBUTTONUP:
    mouse_position = (0, 0)
    drawing = False
    last_pos = None
elif event.type == pg.MOUSEBUTTONDOWN:
    drawing = True
I convert the image to grayscale, scale it down to 28x28, and turn it into a tensor using PIL and torchvision's PILToTensor().
image = Image.open("image.png").convert("L").resize((28,28),Image.Resampling.LANCZOS)
transform = Compose([
    PILToTensor(),
    Lambda(lambda image: image.view(-1, 1, 28, 28))
])
img_tensor = transform(image).to(torch.float)
Then I feed this image to the network. I get no errors; the model just predicts really badly. For example, when I gave it this image of a 2, this code output:
with torch.no_grad():
    outputs = model(img_tensor)
    print(outputs)
    _, predicted = torch.max(outputs.data, 1)
    print(predicted)
[out]: tensor([[ 20.6237,   0.4952, -15.5033,   8.5165,   1.0938,   2.8278,   2.0153,
           3.2825,  -6.2655,  -0.6992]])
       tensor([0])
The output is a list of scores, one for each class 0, 1, 2, 3, ..., so as you can see the score for "2" is actually negative. Does anyone know why this could be and how I could solve it?
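A note on reading that output: the printed values are raw scores (logits), not probabilities, so a negative entry is not an error in itself. It only means that class scored low relative to the others. A minimal sketch, assuming the tensor printed above, of turning the scores into probabilities with torch.softmax:

```python
import torch

# Logits copied from the output shown in the question.
outputs = torch.tensor([[20.6237, 0.4952, -15.5033, 8.5165, 1.0938,
                         2.8278, 2.0153, 3.2825, -6.2655, -0.6992]])

# Softmax over the class dimension maps the logits to values in [0, 1]
# that sum to 1 for each row.
probs = torch.softmax(outputs, dim=1)

# The predicted class is the same whether taken from logits or probabilities.
predicted = probs.argmax(dim=1)

print(probs)
print(predicted)
```

The ranking of the classes does not change under softmax, so this only helps with interpreting the output. It does not change the prediction.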
Thank you very much
Solution
I have solved this. The problem was that when I converted the image to a tensor it had values from 0-255 instead of 0-1 (PILToTensor() does not rescale pixel values), which is why the model was behaving so unpredictably.
Answered By - Jan Hrubec