Issue
In this pytorch code:
import torch
a = torch.tensor([2.], requires_grad=True)
y = torch.zeros((10))
gt = torch.zeros((10))
y[0] = a
y[1] = y[0] * 2
y.retain_grad()  # y is a non-leaf tensor; keep its .grad after backward()
loss = torch.sum((y-gt) ** 2)
loss.backward()
print(y.grad)
I want y[0]'s gradient to consist of 2 parts:
- the gradient of the loss with respect to y[0] itself.
- y[0] is used to calculate y[1], so it should also receive a share of y[1]'s gradient.
But when I run this code, y[0]'s gradient only contains part 1.
So how can I make y[0]'s gradient include both parts?
Edit: the output is:
tensor([4., 8., 0., 0., 0., 0., 0., 0., 0., 0.])
but I expect:
tensor([20., 8., 0., 0., 0., 0., 0., 0., 0., 0.])
Solution
y[0] and y[1] are two different elements of y, therefore they have different grad entries. The only thing that "binds" them is the underlying relation to a. If you inspect the grad of a, you'll see:
print(a.grad)
tensor([20.])
That is, the two parts of the gradient are combined in a.grad: 4 from the direct use in y[0], plus 8 * 2 = 16 flowing back through y[1], for a total of 20.
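If you really need a single tensor whose grad accumulates both contributions, one workaround (a minimal sketch, not part of the original answer; the intermediate name y0 and the a * 1 node are illustrative) is to route both uses of the value through one explicit intermediate node and retain its grad:

import torch

a = torch.tensor([2.], requires_grad=True)
gt = torch.zeros(10)

# Illustrative intermediate node: both uses below flow back through y0,
# so their gradient contributions accumulate in y0.grad.
y0 = a * 1
y0.retain_grad()

y = torch.zeros(10)
y[0] = y0
y[1] = y0 * 2

loss = torch.sum((y - gt) ** 2)
loss.backward()

print(y0.grad)  # tensor([20.]) -- 4 (direct) + 8 * 2 (through y[1])
print(a.grad)   # tensor([20.]) -- same total, since y0 depends only on a

The per-element gradients of y are unchanged (4 and 8, if you also call y.retain_grad()); the combined value lives on the node that both elements share.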
Answered By - Shai