Issue
I'm trying to reproduce the results of this code: https://github.com/vjayd/Image-Alignment-using-CNN. The first problem I've run into: as far as I know, MNIST images are grayscale, not color, so why does the author convert them to grayscale with the rgb2gray function?
import glob
from skimage import io
from skimage.color import rgb2gray
from skimage.transform import resize

for img_train in glob.glob(trdata):
    n = io.imread(img_train)              # read image from disk
    n = rgb2gray(n)                       # convert to grayscale
    n = resize(n, (28, 28))               # resize to 28x28
    train_x.append(n.reshape(1, 28, 28))  # add a channel dimension
And what does (1, 1, 28, 28) mean in this line?

test_x = test_x.reshape(1, 1, 28, 28)
Solution
A PyTorch model generally expects the first dimension of the input to be the batch size. The shape of a single image here is (1, 28, 28), i.e. (channels, height, width). Even if you want to feed only one image to the model, you still have to provide a batch dimension, which is of course 1 for a single image. Therefore he prepends the batch dimension by reshaping the image to (1, 1, 28, 28), i.e. (batch, channels, height, width).
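A minimal sketch of that reshape, using a NumPy array of zeros as a stand-in for a real MNIST image (the array name and contents here are illustrative, not from the repository):

```python
import numpy as np

# Stand-in for one grayscale 28x28 image: (channels, height, width)
img = np.zeros((1, 28, 28), dtype=np.float32)

# PyTorch models typically expect (batch, channels, height, width),
# so prepend a batch dimension of size 1 for a single image.
batch = img.reshape(1, 1, 28, 28)
print(batch.shape)  # (1, 1, 28, 28)
```

On an actual PyTorch tensor, `tensor.unsqueeze(0)` achieves the same thing without hard-coding the other dimensions.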
Answered By - Theodor Peifer