Issue
I read a bit about autoencoders and read that the encoder transforms data from $\mathbb{R}^{N \times D} \rightarrow \mathbb{R}^{N \times d}$ space. I wrote a CNN autoencoder with, for example, one layer like this:
self.conv1 = nn.Conv2d(1, 1, 5)
but when I apply x = F.relu(self.conv1(x)) I of course get an image with smaller dimensions: if the input has shape torch.Size([1, 400, 1024]) at the beginning, after the operation I have torch.Size([1, 396, 1020]).
It is clear that convolution reduces the spatial dimensions of the image (that part is all clear :) ), but I am interested in whether there is a way to keep the number of features and fill the missing ones with zeros, and, if that would be a step in the right direction, the question arises as to which features should be filled with zeros.
P.S. I use only one channel because I want the final result to be a 2D matrix, and my images are grayscale anyway.
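
For reference, here is a minimal sketch (assuming a single-channel grayscale input with the shape from the question; recent PyTorch versions accept such unbatched (C, H, W) input) that reproduces the shrinkage:

import torch
import torch.nn as nn
import torch.nn.functional as F

conv1 = nn.Conv2d(1, 1, 5)  # 5x5 kernel, padding=0 by default

x = torch.randn(1, 400, 1024)  # (channels, height, width) for one grayscale image
y = F.relu(conv1(x))

print(y.shape)  # torch.Size([1, 396, 1020]): each spatial dim loses kernel_size - 1 = 4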
Solution
I think you need to add the padding='same' option:
self.conv1 = nn.Conv2d(1, 1, 5, padding='same')
This is the usual way of ensuring that output dimensions don't shrink when applying convolutional layers: it pads the original image (by default with zeros) before applying the convolution, rather than padding the result.
(If this were TensorFlow I would be certain, but it looks like PyTorch works the same way: https://pytorch.org/docs/stable/generated/torch.nn.Conv2d.html)
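
A minimal sketch verifying this, using the same single-channel input shape as in the question (note that PyTorch's padding='same' supports only stride 1; older PyTorch versions without it can use the equivalent explicit padding=2 for a 5x5 kernel):

import torch
import torch.nn as nn
import torch.nn.functional as F

conv1 = nn.Conv2d(1, 1, 5, padding='same')  # zero-pads the input so H and W are preserved
# equivalent for a 5x5 kernel at stride 1: nn.Conv2d(1, 1, 5, padding=2)

x = torch.randn(1, 400, 1024)
y = F.relu(conv1(x))

print(y.shape)  # torch.Size([1, 400, 1024]): spatial dimensions unchanged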
Answered By - David Harris