Issue
I have a 4D tensor of shape (2, 1024, 4, 6). I want to use a transposed convolution to upsample the spatial dimensions of this tensor by a factor of two and to reduce the number of channels from 1024 to 512, so that the result is a 4D tensor of shape (2, 512, 8, 12). How can I do that? Also, is a transposed convolution a good idea for reducing the number of channels? For example, I used the following script, but it is not working:
nn.ConvTranspose3d(in_channels=1024, out_channels=512, kernel_size=(1,2,2), stride=(1,3,2), padding=(0,1,1))
Solution
It seems you should be using ConvTranspose2d instead of ConvTranspose3d, since your input tensor is 4D, shaped NCHW.
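There are different ways of getting this result, but one straightforward approach is to use a kernel size of 2 with a matching stride. With no padding, each input position then maps to its own non-overlapping 2x2 block in the output, which doubles the height and the width: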
>>> import torch
>>> from torch import nn
>>> conv = nn.ConvTranspose2d(1024, 512, kernel_size=2, stride=2)
Here is an inference example:
>>> conv(torch.rand(2, 1024, 4, 6)).shape
torch.Size([2, 512, 8, 12])
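To see why these settings double the spatial size, here is a minimal sketch of the output-size formula given in the PyTorch documentation for ConvTranspose2d, applied to one spatial dimension at a time. The helper name conv_transpose_out_size is hypothetical and not part of PyTorch. The second configuration (kernel size 4, stride 2, padding 1) is not from the answer above; it is only an assumption-labeled illustration that other settings reach the same shape.

```python
def conv_transpose_out_size(size_in, kernel_size, stride=1, padding=0,
                            output_padding=0, dilation=1):
    """Output length along one spatial dimension of a transposed convolution.

    Hypothetical helper (not a PyTorch function) that follows the shape
    formula from the ConvTranspose2d documentation.
    """
    return ((size_in - 1) * stride - 2 * padding
            + dilation * (kernel_size - 1) + output_padding + 1)


# Settings from the answer: kernel_size=2, stride=2, no padding.
h_out = conv_transpose_out_size(4, kernel_size=2, stride=2)
w_out = conv_transpose_out_size(6, kernel_size=2, stride=2)
print(h_out, w_out)  # 8 12

# Illustrative alternative (an assumption, not from the answer):
# kernel_size=4, stride=2, padding=1 also doubles height and width.
h_alt = conv_transpose_out_size(4, kernel_size=4, stride=2, padding=1)
w_alt = conv_transpose_out_size(6, kernel_size=4, stride=2, padding=1)
print(h_alt, w_alt)  # 8 12
```

The channel count does not appear in this formula: it is set only by the out_channels argument, independently of the kernel size and stride.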
Answered By - Ivan