Issue
Normally if I understood well PyTorch implementation of the Conv2D layer, the padding parameter will expand the shape of the convolved image with zeros to all four sides of the input. So, if we have an image of shape (6,6) and set padding = 2
and strides = 2
and kernel = (5,5)
, the output will be an image of shape (1,1). Then, padding = 2
will pad with zeroes (2 up, 2 down, 2 left and 2 right) resulting in a convolved image of shape (5,5)
However when running the following script :
import torch
from torch import nn
x = torch.ones(1,1,6,6)
y = nn.Conv2d(in_channels= 1, out_channels=1,
kernel_size= 5, stride = 2,
padding = 2,)(x)
I got the following outputs:
y.shape
==> torch.Size([1, 1, 3, 3]) ("So shape of convolved image = (3,3) instead of (5,5)")
y[0][0]
==> tensor([[0.1892, 0.1718, 0.2627, 0.2627, 0.4423, 0.2906],
[0.4578, 0.6136, 0.7614, 0.7614, 0.9293, 0.6835],
[0.2679, 0.5373, 0.6183, 0.6183, 0.7267, 0.5638],
[0.2679, 0.5373, 0.6183, 0.6183, 0.7267, 0.5638],
[0.2589, 0.5793, 0.5466, 0.5466, 0.4823, 0.4467],
[0.0760, 0.2057, 0.1017, 0.1017, 0.0660, 0.0411]],
grad_fn=<SelectBackward>)
Normally it should be filled with zeroes. I'm confused. Can anyone help please?
Solution
The input is padded, not the output. In your case, the conv2d layer will apply a two-pixel padding on all sides just before computing the convolution operation.
For illustration purposes,
>>> weight = torch.rand(1, 1, 5, 5)
Here we apply a convolution with
padding=2
:>>> x = torch.ones(1,1,6,6) >>> F.conv2d(x, weight, stride=2, padding=2) tensor([[[[ 5.9152, 8.8923, 6.0984], [ 8.9397, 14.7627, 10.8613], [ 7.2708, 12.0152, 9.0840]]]])
And we don't use any padding but instead apply it ourselves on the input:
>>> x_padded = F.pad(x, (2,)*4) tensor([[[[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.], [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.], [0., 0., 1., 1., 1., 1., 1., 1., 0., 0.], [0., 0., 1., 1., 1., 1., 1., 1., 0., 0.], [0., 0., 1., 1., 1., 1., 1., 1., 0., 0.], [0., 0., 1., 1., 1., 1., 1., 1., 0., 0.], [0., 0., 1., 1., 1., 1., 1., 1., 0., 0.], [0., 0., 1., 1., 1., 1., 1., 1., 0., 0.], [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.], [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]]]]) >>> F.conv2d(x_padded, weight, stride=2) tensor([[[[ 5.9152, 8.8923, 6.0984], [ 8.9397, 14.7627, 10.8613], [ 7.2708, 12.0152, 9.0840]]]])
Answered By - Ivan
0 comments:
Post a Comment
Note: Only a member of this blog may post a comment.