Sunday, August 7, 2022

[FIXED] MNIST model performing very badly on custom data


Issue

I used the prebuilt, pretrained resnet50 model from PyTorch (torchvision) and trained it on the MNIST dataset:

from torch import nn
from torchvision.models import ResNet50_Weights, resnet50

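class Model(nn.Module):
  def __init__(self):
    super().__init__()

    # ResNet-50 with pretrained weights
    self.model = resnet50(weights=ResNet50_Weights.DEFAULT)

    # MNIST images have one channel, so replace the 3-channel input layer
    # (this new layer starts from random weights)
    self.model.conv1 = nn.Conv2d(1, 64, kernel_size=7, stride=2, padding=3, bias=False)
    
    # Replace the classifier head with a 10-class output
    num_ftrs = self.model.fc.in_features
    self.model.fc = nn.Linear(num_ftrs, 10)

  def forward(self, x):
    return self.model(x)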

It performs very well: after training for 10 epochs it reached 99.895% accuracy on the 50,000 training images.

import torch

model.eval()  # switch to inference mode

with torch.no_grad():
    correct = 0
    total = 0
    for images, labels in train_loader:
        outputs = model(images)
        # the index of the highest score is the predicted class
        _, predicted = torch.max(outputs, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()
    
    print('Accuracy of the network on the {} train images: {} %'.format(50000, 100 * correct / total))

[out]: Accuracy of the network on the 50000 train images: 99.895 %

I used pygame to draw my own digits to feed to the model. I started with a very basic program that places circles while the left mouse button is held, then saves the resulting image as a PNG.

    if event.type == pg.MOUSEMOTION:
        if drawing:
            mouse_position = pg.mouse.get_pos()
            pg.draw.circle(screen, color, mouse_position, w)
    elif event.type == pg.MOUSEBUTTONUP:
        mouse_position = (0, 0)
        drawing = False
        last_pos = None
    elif event.type == pg.MOUSEBUTTONDOWN:
        drawing = True

I convert the image to grayscale and scale it down to 28x28 with PIL, then turn it into a tensor using torchvision's PILToTensor().

import torch
from PIL import Image
from torchvision.transforms import Compose, Lambda, PILToTensor

image = Image.open("image.png").convert("L").resize((28,28),Image.Resampling.LANCZOS)

transform = Compose([
    PILToTensor(),
    Lambda(lambda image: image.view(-1, 1, 28, 28))
])

img_tensor = transform(image).to(torch.float)

Then I feed this image to the network. I get no errors; the model just predicts really badly. For example, when I gave it an image of a 2, this code output:

with torch.no_grad():
    outputs = model(img_tensor)
    print(outputs)
    _, predicted = torch.max(outputs.data, 1)
    print(predicted)

[out]: tensor([[ 20.6237,   0.4952, -15.5033,   8.5165,   1.0938,   2.8278,   2.0153,
           3.2825,  -6.2655,  -0.6992]])
tensor([0])

The output is a list of scores, one for each class 0, 1, 2, 3, ..., so as you can see the score for "2" is actually negative. Does anyone know why this could be and how I could solve it?
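A note on reading these numbers: the model ends in a plain linear layer, so its outputs are raw, unnormalized scores (logits), and negative values are normal. As a minimal sketch, using the values printed above, torch.softmax converts them into probabilities that sum to 1:

```python
import torch

# Logits copied from the output printed above (one score per class 0-9).
logits = torch.tensor([[20.6237, 0.4952, -15.5033, 8.5165, 1.0938,
                        2.8278, 2.0153, 3.2825, -6.2655, -0.6992]])

# Softmax turns the raw scores into probabilities along the class dimension.
probs = torch.softmax(logits, dim=1)
predicted = probs.argmax(dim=1)

print(predicted)    # index of the most likely class
print(probs[0, 2])  # probability assigned to class "2"
```

This only changes how the scores are displayed; the predicted class is the same as with torch.max on the raw outputs.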

Thank you very much

Solution

I have solved this. The problem was that when I converted the image to a tensor, it had values from 0-255 instead of 0-1 (PILToTensor() does not rescale pixel values the way ToTensor() does), which is why the model was behaving so unpredictably.
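A minimal sketch of that fix, assuming the model was trained on inputs in the 0-1 range with no further normalization. The random tensor below is a synthetic stand-in for the drawn digit, not the real image; with the real pipeline, the same division by 255 would be applied to the output of PILToTensor().

```python
import torch

# Synthetic stand-in for the drawn digit: a uint8 tensor shaped like the
# output of PILToTensor() on a 28x28 grayscale image (values 0-255).
raw = torch.randint(0, 256, (1, 28, 28), dtype=torch.uint8)

# Convert to float, scale to [0, 1], and add a batch dimension,
# giving the (1, 1, 28, 28) shape the model expects.
img_tensor = raw.to(torch.float32).div(255.0).unsqueeze(0)

print(img_tensor.shape, img_tensor.min().item(), img_tensor.max().item())
```

If the training transforms also included a Normalize step, the same mean and standard deviation would need to be applied here as well.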

Answered By - Jan Hrubec

This answer was collected from Stack Overflow and tested by PythonFixing community admins, and is licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
