Issue
I have the following class for a neural network:
import torch
from torch import nn

class NN_test(nn.Module):
    def __init__(self):
        super().__init__()
        self.hidden1 = nn.Linear(32, 32)
        self.act1 = nn.ReLU()
        self.hidden2 = nn.Linear(32, 8)
        self.act2 = nn.ReLU()
        self.output = nn.Linear(8, 1)
        self.act_output = nn.Sigmoid()

    def forward(self, x):
        x = self.act1(self.hidden1(x))
        x = self.act2(self.hidden2(x))
        x = self.act_output(self.output(x))
        return x

model = NN_test()
model = model.to(torch.float64)
I also have an optimizer, a loss function, and a training loop in which I split the data into batches of 32 and pass them to the neural network.
# Create loss function
loss_fn = nn.L1Loss()

# Create optimizer
optimizer = torch.optim.SGD(params=model.parameters(),  # optimize newly created model's parameters
                            lr=0.01)

torch.manual_seed(42)
BATCH_SIZE = 32

# Set the number of epochs
epochs = 1000

# Put data on the available device
# Without this, error will happen (not all model/data on device)
X_train = x_scaled.to(device)
X_test = x_scaled_test.to(device)
y_train = y_scaled.to(device)
y_test = y_scaled_test.to(device)

for epoch in range(epochs):
    for batch in range(0, len(X_train), BATCH_SIZE):
        ### Training
        model.train()  # train mode is on by default after construction

        # 1. Forward pass
        y_pred = model(X_train[batch:batch+BATCH_SIZE])

        # 2. Calculate loss
        loss = loss_fn(y_pred, y_train[batch:batch+BATCH_SIZE])

        # 3. Zero grad optimizer
        optimizer.zero_grad()

        # 4. Loss backward
        loss.backward()

        # 5. Step the optimizer
        optimizer.step()

        ### Testing
        model.eval()  # put the model in evaluation mode for testing (inference)

        # 1. Forward pass
        with torch.inference_mode():
            test_pred = model(X_test[batch:batch+BATCH_SIZE])

            # 2. Calculate the loss
            test_loss = loss_fn(test_pred, y_test[batch:batch+BATCH_SIZE])

    if epoch % 100 == 0:
        print(f"Epoch: {epoch} | Train loss: {loss} | Test loss: {test_loss}")
This code gives the following error:
RuntimeError: mat1 and mat2 shapes cannot be multiplied (32x1 and 32x32)
Yet I'm pretty sure that a 32x1 matrix can be multiplied by a 32x32 matrix, so what's going on here?
Solution
There isn't enough information in the question to be certain, but here is my best guess.
It looks like your input is a batch of items of shape (32, 1), which is sent to a (32, 32) linear layer. You need the input size of the linear layer to match the number of features in the input, which in this case is 1. Your layer should be nn.Linear(1, 32).
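As a minimal sketch of that change, assuming each sample really has a single feature (the question does not show the data, so this is a guess): the class name NNFixed, the in_features parameter, and the random x_batch standing in for X_train[batch:batch+BATCH_SIZE] are all made up for illustration.

```python
import torch
from torch import nn


class NNFixed(nn.Module):  # hypothetical name; mirrors NN_test from the question
    def __init__(self, in_features: int = 1):  # assumption: one feature per sample
        super().__init__()
        # Only the first layer changes: its input size now matches the data.
        self.hidden1 = nn.Linear(in_features, 32)
        self.act1 = nn.ReLU()
        self.hidden2 = nn.Linear(32, 8)
        self.act2 = nn.ReLU()
        self.output = nn.Linear(8, 1)
        self.act_output = nn.Sigmoid()

    def forward(self, x):
        x = self.act1(self.hidden1(x))
        x = self.act2(self.hidden2(x))
        return self.act_output(self.output(x))


model = NNFixed().to(torch.float64)

# Stand-in for one training batch: 32 samples, 1 feature each.
x_batch = torch.rand(32, 1, dtype=torch.float64)
y_pred = model(x_batch)
print(y_pred.shape)  # torch.Size([32, 1])
```

If the data actually has 32 features per sample, the original nn.Linear(32, 32) is fine and the fix is instead to make sure each batch has shape (batch_size, 32) before it reaches the model.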
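The matmul happens between the last dimension of the input and the input dimension of the linear layer (in_features, the first argument to nn.Linear). The first dimension of the input is the batch size, which plays no part in that check. This is also why a 32x1 matrix cannot be multiplied by a 32x32 matrix: the inner dimensions, here 1 and 32, have to be equal.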
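The shape rule can be checked directly on a single layer. This is a rough illustration with random tensors; the variable names (layer, x_bad, x_ok, fixed) are made up for the example and do not come from the question.

```python
import torch
from torch import nn

layer = nn.Linear(32, 32)  # expects 32 features per sample

# 32 samples with 1 feature each: the last dimension (1) does not match in_features (32).
x_bad = torch.rand(32, 1)
try:
    layer(x_bad)
    failed = False
except RuntimeError as err:
    failed = True
    print(err)

# 5 samples with 32 features each: the batch size is free, only the last dimension matters.
x_ok = torch.rand(5, 32)
out = layer(x_ok)
print(out.shape)  # torch.Size([5, 32])

# A layer sized for 1 feature accepts the (32, 1) batch.
fixed = nn.Linear(1, 32)
out_fixed = fixed(x_bad)
print(out_fixed.shape)  # torch.Size([32, 32])

# nn.Linear stores its weight as (out_features, in_features).
print(fixed.weight.shape)  # torch.Size([32, 1])
```

Note that the batch of 5 goes through the same layer without complaint, which shows that the leading dimension is not part of the check.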
Answered By - Karl