Issue
In the model I'm building I'm trying to improve performance by replacing the Flatten layer with global max pooling.
To check that shapes are in order I ran a single random sample through the net:
import torch
import torch.nn as nn
import torch.nn.functional as F

test = torch.rand((1, 3, 224, 224))  # [N, C, H, W]
foo = nn.Sequential(
    nn.Conv2d(3, 32, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.BatchNorm2d(32),
    nn.Conv2d(32, 32, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.BatchNorm2d(32),
    nn.MaxPool2d(2)
)
foo2 = nn.Sequential(
    nn.Conv2d(32, 64, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.BatchNorm2d(64),
    nn.Conv2d(64, 64, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.BatchNorm2d(64),
    nn.MaxPool2d(2)
)
foo3 = nn.Sequential(
    nn.Conv2d(64, 128, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.BatchNorm2d(128),
    nn.Conv2d(128, 128, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.BatchNorm2d(128),
    nn.MaxPool2d(2)
)
l1 = nn.Sequential(
    nn.Dropout(0.5),
    nn.Linear(128, 1024),
    nn.ReLU(),
    nn.Dropout(0.2),
    nn.Linear(1024, 10)
)
r1 = foo(test)
print(r1.shape) # torch.Size([1, 32, 112, 112])
r2 = foo2(r1)
print(r2.shape) # torch.Size([1, 64, 56, 56])
r3 = foo3(r2)
print(r3.shape) # torch.Size([1, 128, 28, 28])
# applying global max pooling and reshaping the layer to [N, C]
flat = F.adaptive_max_pool2d(r3, (1, 1))
ff = flat.reshape(flat.size(0), -1)
print(ff.shape) # torch.Size([1, 128])
res = l1(ff)
print(res.shape) # torch.Size([1, 10])
Here all seems to work as expected.
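The check above relies on global max pooling reducing every channel to a single value, so the linear head sees C features whatever the spatial size is. A minimal sketch of that behaviour follows; the helper name global_max_pool and the two random feature maps are illustrative assumptions, not part of the model.

```python
import torch
import torch.nn.functional as F


def global_max_pool(features: torch.Tensor) -> torch.Tensor:
    """Collapse [N, C, H, W] to [N, C] by taking the max over H and W."""
    pooled = F.adaptive_max_pool2d(features, (1, 1))  # [N, C, 1, 1]
    return pooled.flatten(start_dim=1)                # [N, C]


# Hypothetical feature maps with the same channel count but different sizes
big = torch.rand(1, 128, 28, 28)
small = torch.rand(1, 128, 4, 4)

print(global_max_pool(big).shape)    # torch.Size([1, 128])
print(global_max_pool(small).shape)  # torch.Size([1, 128])
```

Both calls give 128 features per sample, which is why nn.Linear(128, 1024) fits as long as the pooled tensor is the one being flattened.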
My model class has these same layers with the forward method like so:
def forward(self, batch: torch.Tensor) -> torch.Tensor:
    r1 = self.conv1(batch)
    r2 = self.conv2(r1)
    r3 = self.conv3(r2)
    tmp = F.adaptive_max_pool2d(r3, (1, 1))
    flat = r3.view(tmp.size(0), -1)
    out = self.linear(flat)
    return out
Unfortunately, when I try to run the actual images through (Fashion MNIST dataset) I get the error: mat1 and mat2 shapes cannot be multiplied (128x2048 and 128x1024)
My batch size is 128 but I don't understand where 2048 might be coming from. None of my layers should output anything of that shape.
The full error message is as follows:
RuntimeError Traceback (most recent call last)
/root/fashion_mnist.ipynb Cell 7 in <cell line: 1>()
----> 1 runner.train_model(epochs=80, batch_size=128, criterion=loss_fn, optimizer=optim)
/root/fashion_mnist.ipynb Cell 7 in RunModel.train_model(self, epochs, batch_size, criterion, optimizer, device)
113 t_ep = datetime.now()
115 # run train routine
--> 116 train_loss, train_acc = self._run_train(train_loader, criterion, optimizer)
117 self.train_losses[ep] = train_loss
118 self.train_accuracies[ep] = train_acc
/root/fashion_mnist.ipynb Cell 7 in RunModel._run_train(self, train_data, criterion, optimizer)
141 inputs, targets = inputs.cuda(), targets.cuda()
142 optimizer.zero_grad()
--> 144 outputs: torch.Tensor = self.model(inputs)
145 loss: torch.Tensor = criterion(outputs, targets)
147 loss.backward()
File /opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py:1186, in Module._call_impl(self, *input, **kwargs)
1182 # If we don't have any hooks, we want to skip the rest of the logic in
1183 # this function, and just call forward.
1184 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
1185 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1186 return forward_call(*input, **kwargs)
...
File /opt/conda/lib/python3.8/site-packages/torch/nn/modules/linear.py:114, in Linear.forward(self, input)
113 def forward(self, input: Tensor) -> Tensor:
--> 114 return F.linear(input, self.weight, self.bias)
RuntimeError: mat1 and mat2 shapes cannot be multiplied (128x2048 and 128x1024)
Any ideas what's happening here?
The notebook is available here: https://colab.research.google.com/drive/1QGpSpUCbuDz-dktmLCv_YpG6LZjYZ1TM?usp=sharing
Solution
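The 2048 comes from the forward method flattening r3 (shape [128, 128, 4, 4], i.e. 128 x 4 x 4 = 2048 features per sample) instead of the pooled tensor tmp. Use Flatten() in the layers instead of view(), and pass the pooled tensor to it. So your linear layer should look like this: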
self.linear = nn.Sequential(
    nn.Flatten(),
    nn.Dropout(0.5),
    nn.Linear(128, 1024),
    nn.ReLU(),
    nn.Dropout(0.2),
    nn.Linear(1024, 10)
)
And your forward function should look like this:
def forward(self, batch: torch.Tensor) -> torch.Tensor:
    r1 = self.conv1(batch)
    r2 = self.conv2(r1)
    r3 = self.conv3(r2)
    tmp = F.adaptive_max_pool2d(r3, (1, 1))
    out = self.linear(tmp)
    return out
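The shape mismatch can be reproduced in isolation. The sketch below uses a random tensor as a stand-in for the output of the last conv block (an assumption: a batch of 128 with 128 channels and a 4x4 feature map, as in the summary output) and compares flattening the unpooled tensor with flattening the pooled one.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Stand-in for the output of the last conv block on a batch of 128 images
r3 = torch.rand(128, 128, 4, 4)
tmp = F.adaptive_max_pool2d(r3, (1, 1))  # [128, 128, 1, 1]

buggy = r3.view(tmp.size(0), -1)   # flattens the unpooled tensor: 128 * 4 * 4 features
fixed = tmp.view(tmp.size(0), -1)  # flattens the pooled tensor: 128 features
via_flatten = nn.Flatten()(tmp)    # same result as `fixed`

print(buggy.shape)        # torch.Size([128, 2048])
print(fixed.shape)        # torch.Size([128, 128])
print(via_flatten.shape)  # torch.Size([128, 128])

linear = nn.Linear(128, 1024)
try:
    linear(buggy)
except RuntimeError as err:
    print("unpooled input fails:", err)

print(linear(fixed).shape)  # torch.Size([128, 1024])
```

So either tmp.view(tmp.size(0), -1) or nn.Flatten() applied to tmp should work; what matters is that the pooled tensor reaches the linear layers.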
I have tested it on Colab and it works fine.
Here is a summary output:
----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
================================================================
            Conv2d-1           [-1, 32, 32, 32]             896
              ReLU-2           [-1, 32, 32, 32]               0
       BatchNorm2d-3           [-1, 32, 32, 32]              64
            Conv2d-4           [-1, 32, 32, 32]           9,248
              ReLU-5           [-1, 32, 32, 32]               0
       BatchNorm2d-6           [-1, 32, 32, 32]              64
         MaxPool2d-7           [-1, 32, 16, 16]               0
            Conv2d-8           [-1, 64, 16, 16]          18,496
              ReLU-9           [-1, 64, 16, 16]               0
      BatchNorm2d-10           [-1, 64, 16, 16]             128
           Conv2d-11           [-1, 64, 16, 16]          36,928
             ReLU-12           [-1, 64, 16, 16]               0
      BatchNorm2d-13           [-1, 64, 16, 16]             128
        MaxPool2d-14             [-1, 64, 8, 8]               0
           Conv2d-15            [-1, 128, 8, 8]          73,856
             ReLU-16            [-1, 128, 8, 8]               0
      BatchNorm2d-17            [-1, 128, 8, 8]             256
           Conv2d-18            [-1, 128, 8, 8]         147,584
             ReLU-19            [-1, 128, 8, 8]               0
      BatchNorm2d-20            [-1, 128, 8, 8]             256
        MaxPool2d-21            [-1, 128, 4, 4]               0
          Flatten-22                  [-1, 128]               0
          Dropout-23                  [-1, 128]               0
           Linear-24                 [-1, 1024]         132,096
             ReLU-25                 [-1, 1024]               0
          Dropout-26                 [-1, 1024]               0
           Linear-27                   [-1, 10]          10,250
================================================================
Total params: 430,250
Trainable params: 430,250
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.01
Forward/backward pass size (MB): 2.76
Params size (MB): 1.64
Estimated Total Size (MB): 4.41
----------------------------------------------------------------
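The parameter counts in the summary can be cross-checked by hand. The sketch below is plain arithmetic, assuming 3x3 convolutions with bias, BatchNorm2d with a learnable weight and bias per channel, and 3 input channels (which the 896 parameters of Conv2d-1 imply); the helper names are illustrative.

```python
def conv2d_params(c_in: int, c_out: int, k: int = 3) -> int:
    """Weights (c_in * c_out * k * k) plus one bias per output channel."""
    return c_in * c_out * k * k + c_out


def batchnorm2d_params(channels: int) -> int:
    """One learnable scale and one learnable shift per channel."""
    return 2 * channels


def linear_params(n_in: int, n_out: int) -> int:
    """Weight matrix plus one bias per output feature."""
    return n_in * n_out + n_out


total = 0
# Each conv block: conv(c_in -> c_out), BN, conv(c_out -> c_out), BN
for c_in, c_out in [(3, 32), (32, 64), (64, 128)]:
    total += conv2d_params(c_in, c_out) + batchnorm2d_params(c_out)
    total += conv2d_params(c_out, c_out) + batchnorm2d_params(c_out)

# Head: Linear(128, 1024) followed by Linear(1024, 10)
total += linear_params(128, 1024) + linear_params(1024, 10)

print(total)  # 430250
```

The 132,096 parameters of Linear-24 correspond to 128 input features, which confirms that the head receives the pooled [N, 128] tensor rather than the 2048-feature one.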
Trainer Output:
Epoch 1/80 completed in 0:00:32.994402. Train_loss: 1.0680, train accuracy: 0.6225 Test loss: 1.0435, test accuracy: 0.6271
Epoch 2/80 completed in 0:00:32.939861. Train_loss: 0.9726, train accuracy: 0.6578 Test loss: 0.9616, test accuracy: 0.6662
Epoch 3/80 completed in 0:00:32.811203. Train_loss: 0.9015, train accuracy: 0.6851 Test loss: 0.9015, test accuracy: 0.6883
Epoch 4/80 completed in 0:00:32.836747. Train_loss: 0.8361, train accuracy: 0.7119 Test loss: 0.8336, test accuracy: 0.7173
Answered By - rafathasan