Issue
So if I run this code in Pytorch:
x = torch.ones(2,2, requires_grad=True)
x.add_(1)
I will get the error:
RuntimeError: a leaf Variable that requires grad is being used in an in-place operation.
I understand that Pytorch does not allow inplace operations on leaf variables and I also know that there are ways to get around this restrictions. What I don't understand is the philosophy behind this rule. Why is it wrong to change a leaf variable with inplace operations?
Solution
As I understand it, any time you do a non-traditional operation on a tensor that was initialized with requires_grad=True
, Pytorch throws an error to make sure it was intentional. For example, you normally would only update a weight tensor using optimizer.step()
.
For another example, I ran into this issue when trying to update the values in a backprop-able tensor during network initialization.
self.weight_layer = nn.Parameter(data=torch.zeros(seq_length), requires_grad=True)
self.weight_layer[true_ids == 1] = -1.2
RuntimeError: a leaf Variable that requires grad is being used in an in-place operation.
The problem is that, because requires_grad=True
, the network doesn't know that I'm still initializing the values. If this is what you are trying to do, wrapping the update in a torch.no_grad
block is one solution:
with torch.no_grad()
self.weight_layer = nn.Parameter(data=torch.zeros(seq_length), requires_grad=True)
self.weight_layer[true_ids == 1] = -1.2
Otherwise, you could just set requires_grad=True
after you finish initializing the Tensor:
self.weight_layer = nn.Parameter(data=torch.zeros(seq_length))
self.weight_layer[true_ids == 1] = -1.2
self.weight_layer.requires_grad = True
Answered By - Jacob Stern
0 comments:
Post a Comment
Note: Only a member of this blog may post a comment.