Issue
Here is my problem: when I count the parameters in my first block, I see 36,928 parameters (which is what I expect). But when I use this block to construct a model in another nn.Module class, there are 1,792 extra parameters and I can't figure out where they come from.
I put some code below to illustrate.
class Conv2dBlock(torch.nn.Module):
    def __init__(self, in_filters, out_filters, kernel_size=3):
        super(Conv2dBlock, self).__init__()
        self.conv2d_seq = torch.nn.Sequential()
        for k in range(2):
            self.conv2d_seq.append(torch.nn.Conv2d(in_channels=in_filters, out_channels=out_filters, kernel_size=kernel_size, padding='same'))
            self.conv2d_seq.append(torch.nn.ReLU())
            in_filters = out_filters

    def forward(self, input):
        out = self.conv2d_seq(input)
        return out
And then, I use this block in another nn.Module:
class EncoderBlock(torch.nn.Module):
    def __init__(self):
        super(EncoderBlock, self).__init__()
        self.conv2d = Conv2dBlock(3, 64)
        self.maxpool = torch.nn.MaxPool2d(kernel_size=2)

    def forward(self, input):
        x = self.conv2d(input)
        p = self.maxpool(x)
        out = torch.nn.functional.dropout(p, 0.3)
        return x, out
And finally:
class UNet_model(torch.nn.Module):
    def __init__(self):
        super(UNet_model, self).__init__()
        self.encoder_block1 = EncoderBlock()

    def forward(self, input):
        p1 = self.encoder_block1(input)
        # I removed useless code
        return p1
model = UNet_model()
summary(model, (3,128,128))
This last class constructs a model with 38,720 parameters, instead of 36,928. It seems there is an extra convolutional layer ((3, 64, (3,3)) = 1,792 params) applied twice to the input... I don't understand.
Can somebody take a look ?
Thanks !
Solution
First of all, older versions of torch.nn.Sequential don't support the append method (it was only added in PyTorch 1.13); if yours raises an AttributeError, change it to add_module, like this:
for k in range(2):
    self.conv2d_seq.add_module(f"conv_{k}", torch.nn.Conv2d(in_channels=in_filters, out_channels=out_filters, kernel_size=kernel_size, padding='same'))
    self.conv2d_seq.add_module(f"relu_{k}", torch.nn.ReLU())
    in_filters = out_filters
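You can also confirm the block's parameter count without torchinfo by summing the element counts of its parameters directly. A minimal sketch (assumes PyTorch is installed; uses the add_module version of the block):

```python
import torch

class Conv2dBlock(torch.nn.Module):
    def __init__(self, in_filters, out_filters, kernel_size=3):
        super(Conv2dBlock, self).__init__()
        self.conv2d_seq = torch.nn.Sequential()
        for k in range(2):
            self.conv2d_seq.add_module(
                f"conv_{k}",
                torch.nn.Conv2d(in_channels=in_filters, out_channels=out_filters,
                                kernel_size=kernel_size, padding='same'))
            self.conv2d_seq.add_module(f"relu_{k}", torch.nn.ReLU())
            in_filters = out_filters

    def forward(self, input):
        return self.conv2d_seq(input)

block = Conv2dBlock(3, 64)
# Sum the number of elements in every weight and bias tensor
total = sum(p.numel() for p in block.parameters())
print(total)  # 38720
```

Counting this way on the block alone already gives 38,720, not 36,928.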
Second, if you run a torchinfo summary on the initial block alone, you will see:
==========================================================================================
Layer (type:depth-idx) Output Shape Param #
==========================================================================================
Conv2dBlock [1, 64, 64, 64] --
├─Sequential: 1-1 [1, 64, 64, 64] --
│ └─Conv2d: 2-1 [1, 64, 64, 64] 1,792
│ └─ReLU: 2-2 [1, 64, 64, 64] --
│ └─Conv2d: 2-3 [1, 64, 64, 64] 36,928
│ └─ReLU: 2-4 [1, 64, 64, 64] --
==========================================================================================
Total params: 38,720
Trainable params: 38,720
Non-trainable params: 0
Total mult-adds (M): 158.60
==========================================================================================
Input size (MB): 0.05
Forward/backward pass size (MB): 4.19
Params size (MB): 0.15
Estimated Total Size (MB): 4.40
==========================================================================================
So you can see that the block by itself already contains two conv layers (1,792 + 36,928 = 38,720 params), because your for loop creates two of them: for k in range(2). The 1,792 parameters are not extra at all: they belong to the first conv layer (3 input channels to 64 filters), which you simply didn't count in your expected total.
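The per-layer arithmetic can be checked by hand: a Conv2d layer has out_channels * in_channels * k * k weights plus one bias per output channel. A small helper to illustrate (conv2d_params is a hypothetical name, not a PyTorch function):

```python
def conv2d_params(in_ch, out_ch, k):
    """Weights (out_ch * in_ch * k * k) plus one bias per output channel."""
    return out_ch * in_ch * k * k + out_ch

first = conv2d_params(3, 64, 3)    # 3 -> 64 channels, 3x3 kernel
second = conv2d_params(64, 64, 3)  # 64 -> 64 channels, 3x3 kernel
print(first, second, first + second)  # 1792 36928 38720
```

The first layer accounts for exactly the 1,792 "mystery" parameters.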
Answered By - Ilya Lasy