Issue
I'm trying to create a CNN that obtains at least 80% accuracy on CIFAR10 data in 20 epochs.
import torch
import torch.nn as nn
import torch.nn.functional as F

class HWCNN(nn.Module):
    def __init__(self, num_channels, num_classes):
        super(HWCNN, self).__init__()
        self.conv1 = nn.Conv2d(num_channels, 32, 3, padding=1)
        self.conv2 = nn.Conv2d(32, 64, 3, stride=1, padding=1)
        self.pool1 = nn.MaxPool2d(2)
        self.conv3 = nn.Conv2d(64, 128, 3, stride=1, padding=1)
        self.conv4 = nn.Conv2d(128, 128, 3, stride=1, padding=1)
        self.pool2 = nn.MaxPool2d(2)
        self.conv5 = nn.Conv2d(128, 256, 3, stride=1, padding=1)
        self.conv6 = nn.Conv2d(256, 256, 3, stride=1, padding=1)
        self.pool3 = nn.MaxPool2d(2)
        nn.Flatten()
        self.fc1 = nn.Linear(256*4*4, 512)
        self.fc2 = nn.Linear(256, 512)
        self.fc3 = nn.Linear(256, 10)

    def forward(self, X):
        x = F.relu(self.conv1(X))
        x = F.relu(self.conv2(x))
        x = self.pool1(x)
        x = F.relu(self.conv3(x))
        x = F.relu(self.conv4(x))
        x = self.pool2(x)
        x = F.relu(self.conv5(x))
        x = F.relu(self.conv6(x))
        x = self.pool3(x)
        x = x.reshape(-1, 256*4*4)
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

import torch.optim as optim

cuda = torch.device('cuda')
model = HWCNN(3, 10)
model.to(cuda)
optimizer = optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

epochs = 20
train_losses = []
valid_losses = []
best_valid_acc = 0

for epoch in range(0, epochs):
    print('Epoch number ', epoch + 1)
    train_loss = train(model, loss_fn, optimizer)
    train_losses.append(train_loss)
    train_accuracy = accuracy(model, train_loader)
    valid_loss = validate(model, loss_fn, optimizer)
    valid_losses.append(valid_loss)
    valid_accuracy = accuracy(model, valid_loader)
    if best_valid_acc < valid_accuracy:
        best_valid_acc = valid_accuracy
    training_stats(train_loss, train_accuracy, valid_loss, valid_accuracy)

print('Best validation accuracy', best_valid_acc)

If I run this, I get RuntimeError: mat1 and mat2 shapes cannot be multiplied (256x512 and 256x512). But the matrices have the same size, 256x512. Why is this happening? I tried modifying the reshape arguments as well, but I can't get this to work. Any ideas? Thank you in advance!
Solution
Declaring FC layers works like nn.Linear(in_features, out_features); see the official documentation. In other words, you're telling fc1 to start with 256*4*4 (= 4096) input features and produce 512 output features. Then fc2 tries to start with 256 input features and produce 512 output features. This is a mismatch, since you're telling fc2 to expect 256 inputs but you actually passed it 512 inputs.
If you fix only this mistake, you'll immediately see a new error: fc2 currently returns 512 output features, but then fc3 expects only 256 inputs.
You can fix both problems at once by replacing your line self.fc2 = nn.Linear(256, 512) with self.fc2 = nn.Linear(512, 256).
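The chain of feature counts described above can be checked without PyTorch at all. The following is only a minimal sketch: the helper name find_mismatches and the tuple layout are made up for illustration, and each tuple simply mirrors the (in_features, out_features) arguments of the corresponding nn.Linear call.

```python
def find_mismatches(layers):
    """Return (producer, consumer, produced, expected) for every broken link.

    Each layer is (name, in_features, out_features), mirroring nn.Linear.
    """
    problems = []
    for (prev_name, _, prev_out), (next_name, next_in, _) in zip(layers, layers[1:]):
        if prev_out != next_in:
            problems.append((prev_name, next_name, prev_out, next_in))
    return problems


# FC layers as declared in the question.
original = [("fc1", 256 * 4 * 4, 512), ("fc2", 256, 512), ("fc3", 256, 10)]
# Same layers with fc2 changed to nn.Linear(512, 256).
fixed = [("fc1", 256 * 4 * 4, 512), ("fc2", 512, 256), ("fc3", 256, 10)]

print(find_mismatches(original))  # two broken links: fc1 -> fc2 and fc2 -> fc3
print(find_mismatches(fixed))     # no broken links
```

Swapping the two arguments of fc2 repairs both links at once, which is why a single-line change is enough for the layer declarations.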
Edit: You should also remove the line x = self.fc2(x). As is, your code applies fc2 twice in a row. I'm assuming you just made a typo, since two consecutive FC layers are no better than one if you don't include an activation function like ReLU in between. This is also the cause of the new error mentioned in your comment, since the first fc2 call outputs 256 features and then the second fc2 call expects 512 inputs.
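Putting both fixes together, the model might look like the sketch below. It is one possible reading of the corrections above, not the asker's final code: it assumes 32x32 CIFAR10 inputs, runs on the CPU so it can be tried without a GPU, and takes the small liberty of using num_classes for the last layer instead of the hard-coded 10.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class HWCNN(nn.Module):
    def __init__(self, num_channels, num_classes):
        super().__init__()
        self.conv1 = nn.Conv2d(num_channels, 32, 3, padding=1)
        self.conv2 = nn.Conv2d(32, 64, 3, padding=1)
        self.pool1 = nn.MaxPool2d(2)
        self.conv3 = nn.Conv2d(64, 128, 3, padding=1)
        self.conv4 = nn.Conv2d(128, 128, 3, padding=1)
        self.pool2 = nn.MaxPool2d(2)
        self.conv5 = nn.Conv2d(128, 256, 3, padding=1)
        self.conv6 = nn.Conv2d(256, 256, 3, padding=1)
        self.pool3 = nn.MaxPool2d(2)
        # Each layer's in_features equals the previous layer's out_features.
        self.fc1 = nn.Linear(256 * 4 * 4, 512)
        self.fc2 = nn.Linear(512, 256)          # was nn.Linear(256, 512)
        self.fc3 = nn.Linear(256, num_classes)  # assumption: use num_classes

    def forward(self, x):
        x = self.pool1(F.relu(self.conv2(F.relu(self.conv1(x)))))
        x = self.pool2(F.relu(self.conv4(F.relu(self.conv3(x)))))
        x = self.pool3(F.relu(self.conv6(F.relu(self.conv5(x)))))
        x = x.reshape(-1, 256 * 4 * 4)  # 32x32 halved three times gives 4x4
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))         # fc2 is applied once only
        return self.fc3(x)


model = HWCNN(3, 10)
out = model(torch.zeros(2, 3, 32, 32))  # dummy batch of two images
print(out.shape)
```

The output has one row per image and one column per class, which is the shape nn.CrossEntropyLoss expects for its logits.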
Bonus debugging tip: Try writing
print(model(torch.zeros((1, 3, 32, 32), device=cuda)))
at any point after you've defined model and moved it to the GPU. The input is a dummy batch holding one all-zero image of CIFAR10 size (3 channels, 32x32 pixels), created on the same device as the model. That will produce an immediate error if you have dimension mismatches like the ones in this question. Then you can comment out the last several lines in forward() and see if the error goes away. This way you can quickly narrow down which layer is causing the dimension mismatch. Finding the bug becomes much easier once you know for sure exactly which line is causing the problem.
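As an alternative to commenting out lines by hand, forward hooks can record the output shape of every layer that ran before the failure. This is a different technique from the one described above, shown here only as a sketch: the toy model is hypothetical (it is not the asker's network) and contains a deliberate mismatch, and the names seen and make_hook are made up for illustration.

```python
import torch
import torch.nn as nn

# Hypothetical toy model with a deliberate bug: the last Linear expects
# 256 inputs but receives 512.
toy = nn.Sequential(
    nn.Flatten(),
    nn.Linear(3 * 32 * 32, 512),
    nn.ReLU(),
    nn.Linear(256, 10),  # bug: should be nn.Linear(512, 10)
)

seen = []  # (layer name, output shape) for each layer that completed


def make_hook(name):
    def hook(module, inputs, output):
        seen.append((name, tuple(output.shape)))
    return hook


for name, layer in toy.named_children():
    layer.register_forward_hook(make_hook(name))

dummy = torch.zeros(1, 3, 32, 32)  # one CIFAR10-sized image, batch dimension first
error = None
try:
    toy(dummy)
except RuntimeError as exc:
    error = str(exc)

for name, shape in seen:
    print(name, shape)
print("error:", error)
```

The hook of the failing layer never fires, so the last recorded entry is the layer just before the mismatch, and its output shape is what the broken layer was actually given.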
Incidentally, the "shapes cannot be multiplied" error is referring to matrix multiplication. If matrix A has dimensions m x n and matrix B has dimensions r x s, then the product A*B is only defined if n = r. That isn't satisfied for the dimensions in the error message: both matrices are 256 x 512, so n = 512 but r = 256. Having the same shape is not what matters; the inner dimensions have to agree.
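The rule can be spelled out in a few lines of plain Python. This is only an illustrative sketch: the helper names are made up, and the batch size of 256 is inferred from the first shape in the error message rather than stated in the question. The one PyTorch fact it relies on is that nn.Linear stores its weight with shape (out_features, in_features) and multiplies the input by the transpose, so the second matrix in the product has shape in_features x out_features.

```python
def can_multiply(a_shape, b_shape):
    """A (m x n) times B (r x s) is defined only when n == r."""
    _, n = a_shape
    r, _ = b_shape
    return n == r


def linear_mat2_shape(in_features, out_features):
    """Shape of the transposed weight that nn.Linear multiplies its input by."""
    return (in_features, out_features)


batch = 256                 # assumption: inferred from "256x512" in the error
fc1_output = (batch, 512)   # what fc1 hands to fc2

# fc2 as declared in the question: nn.Linear(256, 512)
print(can_multiply(fc1_output, linear_mat2_shape(256, 512)))  # False

# fc2 after the fix: nn.Linear(512, 256)
print(can_multiply(fc1_output, linear_mat2_shape(512, 256)))  # True
```

In the failing case both shapes happen to be 256 x 512, which is exactly the coincidence that makes the error message look contradictory.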
Answered By - David Clyde