Issue
We can get the loss of the last layer with loss = loss_fn(y_pred, y_true), which results in a loss: Tensor. We then call loss.backward() to do back-propagation, and after optimizer.step() we can see the updated model.parameters().
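For reference, here is a minimal sketch of that single-model step (the toy model, optimizer, loss function, and tensor shapes are assumptions made here, not part of the question):

import torch
from torch import nn

model = nn.Linear(10, 1)                              # toy model (assumption)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.MSELoss()

x, y_true = torch.rand(4, 10), torch.rand(4, 1)
y_pred = model(x)                                     # forward pass
loss = loss_fn(y_pred, y_true)                        # loss of the last layer
optimizer.zero_grad()
loss.backward()                                       # back-propagation
optimizer.step()                                      # model.parameters() are now updated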
Take the example below:

y = Model1(x)        # with optimizer1
z = Model2(y)        # with optimizer2
loss = loss_fn(z, z_true)
loss.backward()
optimizer2.step()    # update Model2 parameters
# in order to update Model1 parameters I think we should do
y.backward(gradient=the_output_gradient_from_Model2)
optimizer1.step()
How can I get the intermediate back-propagation result, i.e. the gradient with respect to an intermediate output, which would then be passed to y_pred.backward(gradient=grad)?
Update: The solution is to set requires_grad=True on the intermediate tensor and read its x.grad attribute after the backward pass. Thanks for the answers.
PS: The scenario is that I am doing federated learning, where the model is split into two parts. The first part takes the input and forwards it to the second part. The second part calculates the loss and back-propagates the gradient to the first part, so that the first part can take that gradient and do its own back-propagation. A sketch of this flow is shown below.
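A minimal sketch of that split-model flow (the two model parts, optimizers, loss function, and tensor shapes are placeholders invented here, not taken from the question):

import torch
from torch import nn

part1 = nn.Linear(10, 5)                            # first model part (assumption)
part2 = nn.Linear(5, 2)                             # second model part (assumption)
opt1 = torch.optim.SGD(part1.parameters(), lr=0.1)
opt2 = torch.optim.SGD(part2.parameters(), lr=0.1)
loss_fn = nn.MSELoss()

x, z_true = torch.rand(3, 10), torch.rand(3, 2)

# first part: forward only, send y to the second part
y = part1(x)

# second part: receives y as a leaf tensor that requires grad
y_received = y.detach().requires_grad_(True)
z = part2(y_received)
loss = loss_fn(z, z_true)
opt2.zero_grad()
loss.backward()                                     # fills part2 grads and y_received.grad
opt2.step()

# first part: continue back-propagation from y using the received gradient
opt1.zero_grad()
y.backward(gradient=y_received.grad)
opt1.step()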
Solution
I will assume you're referring to intermediate gradients when you say "loss of a specific layer".
You can access the gradient of the loss with respect to each layer's parameters through the grad
attribute on the parameters of your model which require gradient computation.
Here is a simplistic setup:
>>> import torch
>>> import torch.nn as nn
>>> f = nn.Sequential(
...     nn.Linear(10, 5),
...     nn.Linear(5, 2),
...     nn.Linear(2, 2, bias=False),
...     nn.Sigmoid())
>>> x = torch.rand(3, 10).requires_grad_(True)
>>> f(x).mean().backward()
Navigate through all the parameters per layer:
>>> for n, c in f.named_children():
... for p in c.parameters():
... print(f'<{n}>:{p.grad}')
<0>:tensor([[-0.0054, -0.0034, -0.0028, -0.0058, -0.0073, -0.0066, -0.0037, -0.0044,
-0.0035, -0.0051],
[ 0.0037, 0.0023, 0.0019, 0.0040, 0.0050, 0.0045, 0.0025, 0.0030,
0.0024, 0.0035],
[-0.0016, -0.0010, -0.0008, -0.0017, -0.0022, -0.0020, -0.0011, -0.0013,
-0.0010, -0.0015],
[ 0.0095, 0.0060, 0.0049, 0.0102, 0.0129, 0.0116, 0.0066, 0.0077,
0.0063, 0.0091],
[ 0.0005, 0.0003, 0.0002, 0.0005, 0.0006, 0.0006, 0.0003, 0.0004,
0.0003, 0.0004]])
<0>:tensor([-0.0090, 0.0062, -0.0027, 0.0160, 0.0008])
<1>:tensor([[-0.0035, 0.0035, -0.0026, -0.0106, -0.0002],
[-0.0020, 0.0020, -0.0015, -0.0061, -0.0001]])
<1>:tensor([-0.0289, -0.0166])
<2>:tensor([[0.0355, 0.0420],
[0.0354, 0.0418]])
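If what you need is the gradient flowing into an intermediate output rather than into the parameters, one standard PyTorch option (not shown above) is to call retain_grad() on that non-leaf tensor before the backward pass. Continuing the session above, with two hypothetical model parts g and h invented for illustration:

>>> g = nn.Linear(10, 5)        # hypothetical first part
>>> h = nn.Linear(5, 2)         # hypothetical second part
>>> x = torch.rand(3, 10)
>>> y = g(x)
>>> y.retain_grad()             # keep the grad of this non-leaf tensor
>>> h(y).mean().backward()
>>> y.grad.shape                # gradient of the loss w.r.t. y
torch.Size([3, 5])

This y.grad is exactly the tensor you would pass as y.backward(gradient=...) on the other side of a split model.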
Answered By - Ivan