Issue
I'm trying to implement an autoencoder CNN. However, I have the following problem:
The last convolutional layer of my encoder is defined as follows:
Conv2d(128, 256, 3, padding=1, stride=2)
The input of this layer has shape (1, 128, 24, 24). Thus, the output has shape (1, 256, 12, 12).
After this layer, I have ReLU activation and BatchNorm. Neither of these changes the shape of the output.
Then I have a first ConvTranspose2d layer defined as:
ConvTranspose2d(256, 128, 3, padding=1, stride=2)
But the output of this layer has shape (1, 128, 23, 23).
As far as I know, if we use the same kernel size, stride, and padding in ConvTranspose2d as in the preceding Conv2d layer, then the output of this two-layer block must have the same shape as its input.
So, my question is: what is wrong with my understanding? And how can I fix this issue?
Solution
I would first like to note that the nn.ConvTranspose2d layer is not the inverse of nn.Conv2d, as explained in its documentation page:

"it is not an actual deconvolution operation as it does not compute a true inverse of convolution"
You wrote:

"As far as I know, if we use the same kernel size, stride, and padding in ConvTranspose2d as in the preceding Conv2d layer, then the output of this two-layer block must have the same shape as its input."
This is not always true! It depends on the input spatial dimensions.
In terms of spatial dimensions, the 2D convolution will output:

out = [(x + 2p - d(k - 1) - 1)/s + 1]

where [y] denotes the integer (floor) part of y, while the 2D transpose convolution will output:

out = (x - 1)s - 2p + d(k - 1) + op + 1

where x = input_dimension, out = output_dimension, k = kernel_size, s = stride, d = dilation, p = padding, and op = output_padding.
If you look at the composed convT ∘ conv operator (i.e. convT(conv(x))), then you have:

out = (out_conv - 1)s - 2p + d(k - 1) + op + 1
    = ([(x + 2p - d(k - 1) - 1)/s + 1] - 1)s - 2p + d(k - 1) + op + 1

This equals x only if [(x + 2p - d(k - 1) - 1)/s + 1] = (x + 2p - d(k - 1) - 1)/s + 1, i.e. only if s evenly divides x + 2p - d(k - 1) - 1. With your parameters (k = 3, s = 2, p = 1, d = 1) that quantity is x - 1, so the division is exact precisely when x is odd. In that case:

out = ((x + 2p - d(k - 1) - 1)/s + 1 - 1)s - 2p + d(k - 1) + op + 1
    = x + op

and out = x when op = 0.
Otherwise, if x is even, then:

out = x - 1 + op

and setting op = 1 gives out = x.
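Applied to your encoder/decoder (your spatial size 24 is even), a minimal sketch of the fix, assuming the layer definitions from your question, is simply to add output_padding=1 to the transpose convolution:

```python
import torch
import torch.nn as nn

conv = nn.Conv2d(128, 256, 3, padding=1, stride=2)
# output_padding=1 compensates for the even input size (24)
convT = nn.ConvTranspose2d(256, 128, 3, padding=1, stride=2, output_padding=1)

x = torch.rand(1, 128, 24, 24)
print(convT(conv(x)).shape)  # torch.Size([1, 128, 24, 24])
```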
Here is an example:

>>> conv = nn.Conv2d(1, 1, 3, stride=2, padding=1)
>>> convT = nn.ConvTranspose2d(1, 1, 3, stride=2, padding=1)
>>> convT(conv(torch.rand(1, 1, 25, 25))).shape  # x odd
torch.Size([1, 1, 25, 25])  # <- out = x
>>> convT = nn.ConvTranspose2d(1, 1, 3, stride=2, padding=1, output_padding=1)
>>> convT(conv(torch.rand(1, 1, 24, 24))).shape  # x even
torch.Size([1, 1, 24, 24])  # <- out = x - 1 + op
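As an aside, if you prefer not to reason about parity at all, nn.ConvTranspose2d's forward call also accepts an output_size argument, which selects the right amount of implicit output padding for you. A sketch using the same 1-channel layers as the example above:

```python
import torch
import torch.nn as nn

conv = nn.Conv2d(1, 1, 3, stride=2, padding=1)
convT = nn.ConvTranspose2d(1, 1, 3, stride=2, padding=1)

for n in (24, 25):  # works for both even and odd inputs
    x = torch.rand(1, 1, n, n)
    y = convT(conv(x), output_size=x.size())
    print(y.shape)  # matches the input shape in both cases
```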
Answered By - Ivan