Issue
Having three neural networks connected as in the code below, how can we take two gradients with respect to the first network's variables? The first gradient works, but the second one returns a None tensor, as if the losses were not connected to the variables. How can I solve this problem?
with tf.GradientTape() as tape1:
    with tf.GradientTape() as tape2:
        output1 = NN_model1(input1, training=True)
        output2 = NN_model2(output1, training=True)
        output3 = NN_model3([input1, output1, output2], training=True)
        loss1 = -tf.math.reduce_mean(output3)
        loss2 = -tf.math.reduce_mean(output2)
grad1 = tape2.gradient(loss1, NN_model1.trainable_variables)
grad2 = tape1.gradient(loss2, grad1)
optimizer.apply_gradients(zip(grad2, NN_model1.trainable_variables))
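The `None` here is expected: `tape.gradient` returns `None` whenever there is no recorded path from the sources to the target, and `grad1` holds plain gradient tensors rather than variables that `loss2` was computed from. A minimal illustration with toy values (unrelated to the models above):

```python
import tensorflow as tf

x = tf.Variable(2.0)

with tf.GradientTape() as tape:
    y = x * x                      # recorded: y depends on x

g1 = tape.gradient(y, x)           # 4.0, since dy/dx = 2x

with tf.GradientTape() as tape:
    z = tf.constant(3.0) * 5.0     # no path from x to z on this tape

g2 = tape.gradient(z, x)           # None: x never entered the computation
print(g1.numpy(), g2)
```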
Solution
I think the correct approach should be as follows: record everything on a single tape and pass both losses to `tape.gradient`, which sums their gradients with respect to each variable. In the original code, `tape1.gradient(loss2, grad1)` returns `None` because the sources in `grad1` are already-computed gradient tensors, not variables that `loss2` depends on through operations recorded on the tape.
with tf.GradientTape() as tape:
    output1 = NN_model1(input1, training=True)
    output2 = NN_model2(output1, training=True)
    output3 = NN_model3([input1, output1, output2], training=True)
    loss1 = -tf.math.reduce_mean(output3)
    loss2 = -tf.math.reduce_mean(output2)
grad = tape.gradient([loss1, loss2], NN_model1.trainable_variables)
optimizer.apply_gradients(zip(grad, NN_model1.trainable_variables))
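A runnable sketch of the single-tape approach, using small stand-in `Dense` models (the layer sizes, batch shape, and the `tf.concat` standing in for the original multi-input `NN_model3` are assumptions, not from the post):

```python
import tensorflow as tf

tf.random.set_seed(0)

# Stand-in networks; the real architectures are not shown in the post.
NN_model1 = tf.keras.Sequential([tf.keras.layers.Dense(4)])
NN_model2 = tf.keras.Sequential([tf.keras.layers.Dense(3)])
NN_model3 = tf.keras.Sequential([tf.keras.layers.Dense(1)])

optimizer = tf.keras.optimizers.Adam(1e-3)
input1 = tf.random.normal((5, 2))  # hypothetical batch

with tf.GradientTape() as tape:
    output1 = NN_model1(input1, training=True)
    output2 = NN_model2(output1, training=True)
    # Concatenation stands in for the multi-input call in the post.
    output3 = NN_model3(tf.concat([input1, output1, output2], axis=-1),
                        training=True)
    loss1 = -tf.math.reduce_mean(output3)
    loss2 = -tf.math.reduce_mean(output2)

# A list of losses sums their gradients w.r.t. each variable of NN_model1.
grad = tape.gradient([loss1, loss2], NN_model1.trainable_variables)
optimizer.apply_gradients(zip(grad, NN_model1.trainable_variables))
print(all(g is not None for g in grad))
```

Because both losses flow back through `output1`, every entry of `grad` is a defined tensor rather than `None`.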
Answered By - M.Innat