Issue
In PyTorch, many methods of a tensor exist in two versions - one with an underscore suffix and one without. If I try them out, they seem to do the same thing:
In [1]: import torch
In [2]: a = torch.tensor([2, 4, 6])
In [3]: a.add(10)
Out[3]: tensor([12, 14, 16])
In [4]: a.add_(10)
Out[4]: tensor([12, 14, 16])
What is the difference between
- torch.add and torch.add_
- torch.sub and torch.sub_
- ... and so on?
Solution
You have already answered your own question: the underscore suffix indicates an in-place operation in PyTorch.
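To make the difference concrete, here is a small sketch of my own (not part of the original question): the version without the underscore returns a new tensor and leaves the original untouched, while the underscore version modifies the original tensor itself and returns it, which is why both calls in your session print the same values:
import torch
a = torch.tensor([2, 4, 6])
b = a.add(10)   # out-of-place: returns a new tensor
print(a)        # tensor([2, 4, 6])   (a is unchanged)
print(b)        # tensor([12, 14, 16])
a.add_(10)      # in-place: modifies a itself (and also returns it)
print(a)        # tensor([12, 14, 16]) (a has changed)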
However, I want to briefly point out why in-place operations can be problematic. First of all, the PyTorch docs recommend avoiding in-place operations in most cases; unless you are working under heavy memory pressure, it is usually more efficient not to use them:
https://pytorch.org/docs/stable/notes/autograd.html#in-place-operations-with-autograd
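As a small illustration of the memory point (a minimal sketch, not taken from the docs): the out-of-place call allocates new storage for its result, while the in-place call writes into the existing storage, which is exactly what saves memory:
import torch
a = torch.zeros(1000)
ptr = a.data_ptr()          # address of a's underlying storage
b = a.add(1)                # out-of-place: the result gets its own storage
print(b.data_ptr() == ptr)  # False (new allocation)
a.add_(1)                   # in-place: reuses a's existing storage
print(a.data_ptr() == ptr)  # True (no new allocation)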
Secondly, there can be problems calculating the gradients when using in-place operations:
Every tensor keeps a version counter, that is incremented every time it is marked dirty in any operation. When a Function saves any tensors for backward, a version counter of their containing Tensor is saved as well. Once you access self.saved_tensors it is checked, and if it is greater than the saved value an error is raised. This ensures that if you're using in-place functions and not seeing any errors, you can be sure that the computed gradients are correct. (Same source as above.)
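To see this mechanism in action, here is a minimal sketch (the _version attribute is an internal detail of PyTorch, so treat it as illustrative only). The idea is to modify a tensor in place after autograd has already saved it for the backward pass:
import torch
x = torch.tensor([2.0, 4.0, 6.0], requires_grad=True)
y = x.exp()        # exp() saves its output y for the backward pass
print(y._version)  # 0
y.add_(1)          # the in-place op marks y dirty and bumps its version counter
print(y._version)  # 1
y.sum().backward() # RuntimeError: one of the variables needed for gradient
                   # computation has been modified by an inplace operation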
Here is a short and slightly modified example taken from the answer you've posted.
First, the in-place version:
import torch
a = torch.tensor([2, 4, 6], requires_grad=True, dtype=torch.float)
adding_tensor = torch.rand(3)
b = a.add_(adding_tensor)
c = torch.sum(b)
c.backward()
print(c.grad_fn)
This leads to the following error:
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-27-c38b252ffe5f> in <module>
2 a = torch.tensor([2, 4, 6], requires_grad=True, dtype=torch.float)
3 adding_tensor = torch.rand(3)
----> 4 b = a.add_(adding_tensor)
5 c = torch.sum(b)
6 c.backward()
RuntimeError: a leaf Variable that requires grad has been used in an in-place operation.
Second, the non-in-place version:
import torch
a = torch.tensor([2, 4, 6], requires_grad=True, dtype=torch.float)
adding_tensor = torch.rand(3)
b = a.add(adding_tensor)  # out-of-place: b is a new tensor, a stays untouched
c = torch.sum(b)
c.backward()              # gradients flow back to a without any error
print(c.grad_fn)
This works just fine; the output is:
<SumBackward0 object at 0x7f06b27a1da0>
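For completeness: at this point the gradients have actually reached a, and since c is a plain sum, every element of a contributes with weight 1:
print(a.grad)  # tensor([1., 1., 1.])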
So, as a takeaway, I just wanted to point out that in-place operations should be used with care in PyTorch.
Answered By - MBT