Issue
I re-designed a GAN that I had previously built with Keras. No problem so far, but I notice that my model doesn't train correctly depending on how I implement the scope reuse. Maybe someone could help me understand what is happening:
Working version:
with tf.variable_scope('g/h0') as scope:
    reuse_scope = scope if reuse else None
    h0 = tf.contrib.layers.fully_connected(
        z,
        4*4*512,
        activation_fn=None,
        reuse=reuse_scope,
        scope='g/h0'
    )
    h0 = tf.nn.relu(h0)
Not working version:
with tf.variable_scope('g/h0') as scope:
    if reuse:
        scope.reuse_variables()
    h0 = tf.contrib.layers.fully_connected(
        z,
        4*4*512,
        activation_fn=None
    )
    h0 = tf.nn.relu(h0)
Both versions produce a network, but the second one never updates, and I don't understand why the first version leads to a correct situation.
In TensorBoard, the graph looks quite different depending on which version I pick. I suspect the gradients backpropagate incorrectly.
Is there any way of doing this with the second version? I find it a lot more understandable.
Solution
I think that you should try this way of defining your scope:
reuse = ...  # True or False
with tf.variable_scope('g/h0') as scope:
    h0 = tf.contrib.layers.fully_connected(
        z,
        4*4*512,
        activation_fn=None,
        reuse=reuse,
        scope='fully_connected',
    )
    h0 = tf.nn.relu(h0)
If you set reuse to False, your fully connected layer is created "as usual". If you set it to True, no additional parameters are created and the weights and biases are reused from another scope (one with the same name, in which variables with the same names have already been created, of course).
- The reuse parameter must be True or False (or None, naturally).
- The scope parameter has nothing to do with reuse. It is just the internal scope name. For example, if you set scope = 'g/h0', the weight parameter inside the fully connected layer will be 'g/h0/g/h0/weights:0', but if you do not set it, it will be 'g/h0/fully_connected/weights:0'.
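To make the naming rule concrete, here is a minimal sketch (an illustration, not part of the original answer), assuming TF 1.x and a hypothetical placeholder z; it builds one layer with an explicit scope argument and one without, then prints the variable names:

import tensorflow as tf

z = tf.placeholder(tf.float32, (2, 1))

# Explicit scope argument: the layer name is nested inside the outer scope
with tf.variable_scope('g/h0'):
    tf.contrib.layers.fully_connected(z, 8, activation_fn=None, scope='g/h0')

# No scope argument: the default 'fully_connected' name is used
with tf.variable_scope('g/h1'):
    tf.contrib.layers.fully_connected(z, 8, activation_fn=None)

print([v.name for v in tf.global_variables()])
# ['g/h0/g/h0/weights:0', 'g/h0/g/h0/biases:0',
#  'g/h1/fully_connected/weights:0', 'g/h1/fully_connected/biases:0']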
A similar concern is addressed in this answer. It is roughly the same context as in your question, except that a conv2d layer is used and that the scope is not set explicitly.
EDIT:
I do not know if it is a bug or something normal, but to use reuse=True in tf.contrib.layers.fully_connected, you need to specify the scope...
The complete working example:
import tensorflow as tf

## A value for z that you did not specify in your question
z = tf.placeholder(tf.float32, (2, 1))

## First fully-connected layer, created as usual
with tf.variable_scope('g/h0'):
    h0 = tf.contrib.layers.fully_connected(
        z,
        4*4*512,
        activation_fn=None,
        reuse=None
    )
    h0 = tf.nn.relu(h0)

tf.global_variables()
# Returns [<tf.Variable 'g/h0/fully_connected/weights:0' shape=(1, 8192) dtype=float32_ref>, <tf.Variable 'g/h0/fully_connected/biases:0' shape=(8192,) dtype=float32_ref>]

## Second layer with reuse=True
with tf.variable_scope('g/h0'):
    h0 = tf.contrib.layers.fully_connected(
        z,
        4*4*512,
        activation_fn=None,
        reuse=True, scope='fully_connected'
    )
    h0 = tf.nn.relu(h0)

tf.global_variables()
# Returns [<tf.Variable 'g/h0/fully_connected/weights:0' shape=(1, 8192) dtype=float32_ref>, <tf.Variable 'g/h0/fully_connected/biases:0' shape=(8192,) dtype=float32_ref>]
# => the same parameters are used for both layers
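Coming back to the GAN use case from the question, the same pattern can be wrapped in a function that takes a reuse flag. This is only a sketch of how it might look, not the original poster's code; the generator function and its single layer are assumptions for illustration:

def generator(z, reuse=False):
    # Build the first generator layer, or reuse it; reuse must be True or None
    with tf.variable_scope('g/h0'):
        h0 = tf.contrib.layers.fully_connected(
            z,
            4*4*512,
            activation_fn=None,
            reuse=True if reuse else None,
            scope='fully_connected'
        )
        h0 = tf.nn.relu(h0)
    return h0

g_train = generator(z)               # creates g/h0/fully_connected/{weights, biases}
g_sample = generator(z, reuse=True)  # shares the same variables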
Answered By - Pop