Issue
I have an RTX 3070, and somehow using autocast slows down my code.
torch.version.cuda prints 11.1, torch.backends.cudnn.version() prints 8005 and my PyTorch version is 1.9.0. I’m using Ubuntu 20.04 with Kernel 5.11.0-25-generic.
This is the code I’ve been using:
torch.cuda.synchronize()
start = torch.cuda.Event(enable_timing=True)
end = torch.cuda.Event(enable_timing=True)
start.record()

for epoch in range(10):
    running_loss = 0.0
    for i, data in enumerate(trainloader, 0):
        inputs, labels = data
        optimizer.zero_grad()

        # forward pass and loss computed in mixed precision
        with torch.cuda.amp.autocast():
            outputs = net(inputs)
            loss = criterion(outputs, labels)

        # scaled backward pass and optimizer step outside the autocast block
        scaler.scale(loss).backward()
        scaler.step(optimizer)
        scaler.update()

end.record()
torch.cuda.synchronize()
print(start.elapsed_time(end))  # elapsed time in milliseconds
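For reference, the loop above assumes the usual AMP setup has already been created before training: a model, optimizer, and a GradScaler, with the data on the GPU. A minimal sketch of that setup, using placeholder names that match the loop (the loss and optimizer here are assumptions, not the asker's actual choices):

import torch
import torch.nn as nn
import torch.optim as optim

device = torch.device("cuda")
net = net.to(device)                      # the asker's model, defined elsewhere
criterion = nn.CrossEntropyLoss()         # assumed classification loss
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)  # assumed optimizer
scaler = torch.cuda.amp.GradScaler()      # required for scaler.scale(loss) in the loop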
Without torch.cuda.amp.autocast(), 1 epoch takes 22 seconds, whereas with autocast() 1 epoch takes 30 seconds.
Solution
It turns out my model was not big enough to benefit from mixed precision. When I increased the in/out channels of the convolutional layers, it finally worked as expected.
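For context: with very small convolutional layers there is too little arithmetic per kernel for the FP16 speedup to outweigh autocast's casting overhead, and channel counts that are multiples of 8 tend to map well onto the GPU's Tensor Cores. A minimal sketch of the kind of change described, with illustrative layer sizes rather than the asker's actual model:

import torch
import torch.nn as nn

# Illustrative only: channel counts are multiples of 8 so the FP16
# convolutions run under autocast can use the RTX 3070's Tensor Cores.
class WiderNet(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=3, padding=1),    # e.g. widened from a handful of channels
            nn.ReLU(inplace=True),
            nn.Conv2d(64, 128, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(128, num_classes)

    def forward(self, x):
        x = self.features(x)
        return self.classifier(torch.flatten(x, 1))

With a wider model like this (net = WiderNet().cuda()), the same training loop as above can be reused unchanged.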
Answered By - dubistweltmeister