Saturday, May 28, 2022

[FIXED] Failed to use transforms.ToTensor and transforms.Normalize to normalize the MNIST dataset

Issue

I used the following code to normalize the MNIST dataset. When I print the first sample, it does not appear to be normalized: the max element is 255, not 1.

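import torchvision
from torchvision import transforms

# data_dir is the directory where MNIST is stored or downloaded to
train_transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.1307,), (0.3081,))])

train_set = torchvision.datasets.MNIST(
    root=data_dir, train=True, download=True, transform=train_transform)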

When I check the range of the dataset input images:

print("min:%f max:%f" %(train_set.data.min(), train_set.data.max()))
Output: min:0.000000 max:255.000000

I was expecting the range [0, 1] instead, and I don't know why I get [0, 255]. Is something wrong with my code?

Solution

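You see a range of [0, 255] because you are reading the dataset's underlying storage through the data attribute. That attribute holds the raw uint8 images. The transform pipeline is only applied when a sample is fetched by indexing the dataset (for example train_set[0]), so it never touches train_set.data.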
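A minimal sketch of why this happens, using a hypothetical TinyImageDataset class and a to_unit_range helper (both invented here for illustration; this is not torchvision's actual code). It assumes only that the dataset stores raw pixels in data and applies the transform lazily inside __getitem__:

```python
import torch
from torch.utils.data import Dataset


class TinyImageDataset(Dataset):
    """Hypothetical stand-in for an image dataset (not torchvision's code)."""

    def __init__(self, data, transform=None):
        self.data = data            # raw uint8 pixels, stored untouched
        self.transform = transform  # applied lazily, one sample at a time

    def __len__(self):
        return len(self.data)

    def __getitem__(self, index):
        sample = self.data[index]
        if self.transform is not None:
            sample = self.transform(sample)
        return sample


def to_unit_range(img):
    # Simplified stand-in for ToTensor's scaling of uint8 images to [0, 1]
    return img.float() / 255.0


raw = torch.tensor([[0, 128, 255], [64, 32, 16]], dtype=torch.uint8)
tiny_set = TinyImageDataset(raw, transform=to_unit_range)

raw_max = tiny_set.data.max().item()    # bypasses the transform
indexed_max = tiny_set[0].max().item()  # goes through __getitem__
print(raw_max, indexed_max)  # prints: 255 1.0
```

Reading tiny_set.data returns the stored pixels unchanged, while tiny_set[0] returns the transformed sample. The same distinction applies to train_set.data versus train_set[0].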

>>> import torch, torchvision
>>> import torchvision.transforms as T
>>> train_transform = T.Compose([T.ToTensor()])
>>> train_set = torchvision.datasets.MNIST(
...     root='.', train=True, download=True, transform=train_transform)

Accessing the data attribute directly returns the raw pixel values:

>>> f'min:{train_set.data.min().item()} max:{train_set.data.max().item()}'
'min:0 max:255'

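You have to access the dataset through its proper interface (indexing or iteration) for the transform pipeline to take effect. To verify this, you can stack all of the dataset's inputs into a single tensor and look at its range. Since the pipeline in this example contains only ToTensor, the expected range is [0, 1]: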

>>> x, y = zip(*train_set)
>>> x_ = torch.stack(x)
>>> f'min:{x_.min().item()} max:{x_.max().item()}'
'min:0.0 max:1.0'
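Note that the pipeline in the question also contains Normalize((0.1307,), (0.3081,)), so even correctly accessed samples will not fall in [0, 1]. Normalize computes (x - mean) / std per channel, which shifts and rescales the [0, 1] output of ToTensor. The following sketch reproduces that arithmetic by hand on the two extreme pixel values, as an illustration rather than a call into torchvision itself; the names MEAN, STD, scaled, and normalized are chosen here for the example:

```python
import torch

MEAN, STD = 0.1307, 0.3081  # values from the question's Normalize call

# Stand-in for raw MNIST pixels: the darkest and brightest possible values
raw = torch.tensor([0, 255], dtype=torch.uint8)

scaled = raw.float() / 255.0        # ToTensor's scaling: [0, 255] -> [0, 1]
normalized = (scaled - MEAN) / STD  # Normalize: subtract mean, divide by std

lo, hi = normalized.min().item(), normalized.max().item()
print(f"min:{lo:.4f} max:{hi:.4f}")  # prints: min:-0.4242 max:2.8215
```

With the question's full pipeline, a range of roughly [-0.42, 2.82] on samples fetched through the dataset is therefore the expected result, not a sign that normalization failed.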

Answered By - Ivan

This answer, collected from Stack Overflow and tested by PythonFixing community admins, is licensed under CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.
