Sunday, November 6, 2022

[FIXED] Understanding Keras Constraints


Issue

I have a question about how tf.keras.constraints works.

(1)

class WeightsSumOne(tf.keras.constraints.Constraint):
    def __call__(self, w):
        return tf.nn.softmax(w, axis=0)

output = layers.Dense(1, use_bias=False,
                      kernel_constraint=WeightsSumOne())(input)


(2)

intermediate = layers.Dense(1, use_bias = False)
intermediate.set_weights(tf.nn.softmax(intermediate.get_weights(), axis=0))

Do (1) and (2) perform the same process?

I ask because the Keras documentation says this about constraints:

They are per-variable projection functions applied to the target variable after each gradient update (when using fit()). (https://keras.io/api/layers/constraints/)

Unlike (1), I think that the constraint is applied before each gradient update in case of (2).

In my opinion, the gradients of weights of (1) and (2) are different, because the softmax is applied before the gradient calculation in the second case, but after the gradient calculation in the first case.

If I am wrong, I would appreciate it if you could point out where.

Solution

They are not the same.

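In the first case, the constraint is applied to the kernel weights. In the second case, as modelled below with activation='softmax', the softmax is applied to the output of the dense layer (after the weights are multiplied with the inputs), not to the weights.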
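To make the ordering concrete, here is a minimal plain-Python sketch of what "applied after each gradient update" means for a constraint like WeightsSumOne. It is not Keras code: the function names, the learning rate, the sample data, and the use of plain gradient descent instead of Adam are all assumptions made for illustration. The point it shows is that the gradient is taken with respect to the stored weights, and the softmax only projects the result afterwards.

```python
import math


def softmax(values):
    """Numerically stable softmax over a flat list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def constrained_sgd_step(weights, x, y, lr=0.1):
    """One hypothetical training step with a softmax weight constraint."""
    # Forward pass uses the stored weights directly; softmax is not involved.
    prediction = sum(w * xi for w, xi in zip(weights, x))
    # Gradient of the squared error (prediction - y)**2 w.r.t. each weight.
    grads = [2.0 * (prediction - y) * xi for xi in x]
    # Plain gradient-descent update on the raw weights (assumed optimizer).
    updated = [w - lr * g for w, g in zip(weights, grads)]
    # The constraint comes last, as a projection of the updated weights.
    return softmax(updated)


weights = [1.0] * 5                 # same starting point as tf.ones_initializer()
x = [0.5, -1.0, 2.0, 0.0, 1.5]      # made-up input sample
y = 0.3                             # made-up target
new_weights = constrained_sgd_step(weights, x, y)
print(new_weights, sum(new_weights))
```

After the step the weights sum to one, but the softmax never entered the gradient computation, which is why a constraint behaves differently from a softmax placed inside the forward pass.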

Construct a model in the first case:

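import tensorflow as tf
from tensorflow import keras

# WeightsSumOne is the constraint class defined in the question.
inp = keras.Input(shape=(3, 5))
out = keras.layers.Dense(1, use_bias=False, kernel_initializer=tf.ones_initializer(),
                         kernel_constraint=WeightsSumOne())(inp)

model = keras.Model(inp, out)
model.compile('adam', 'mse')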

Do a dummy training run:

inputs = tf.random.normal(shape=(1,3,5))
outputs = tf.random.normal(shape=(1,3,1))
model.fit(inputs,outputs, epochs=1)

Check the layer weights of model:

print(model.layers[1].get_weights()[0])
# Output:
array([[0.2],
       [0.2],
       [0.2],
       [0.2],
       [0.2]])

Construct the model in the second case:

inp = keras.Input(shape=(3,5))

out = keras.layers.Dense(1, activation='softmax', use_bias=False,
         kernel_initializer=tf.ones_initializer())(inp)

model1 = keras.Model(inp, out)
model1.compile('adam', 'mse')
# Dummy run
model1.fit(inputs,outputs, epochs=1)

Check the layer weights of model1:

print(model1.layers[1].get_weights()[0])
# Output:
array([[1.],
       [1.],
       [1.],
       [1.],
       [1.]])

We can see that the layer weights of model are the softmax of the layer weights of model1: the softmax of five equal values along axis 0 is 0.2 each.
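The two printed results can be sanity-checked without TensorFlow. The sketch below is plain Python with a hand-written softmax helper (an illustrative stand-in, not the Keras implementation). It checks two things: a softmax over five equal weights gives 0.2 each, and a softmax over a single value is always 1.0. The second point is a likely explanation for why model1's weights stay at 1: assuming the activation normalises over the last axis, which has size 1 for Dense(1), the output is constant and no gradient reaches the kernel.

```python
import math


def softmax(values):
    """Numerically stable softmax over a flat list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


# Constraint case: softmax along the axis holding the five kernel entries.
uniform = softmax([1.0] * 5)

# Activation case with a single unit: the normalised axis has one element,
# so the result is 1.0 regardless of the pre-activation value.
single_unit_outputs = [softmax([z])[0] for z in (-3.0, 0.0, 7.5)]

print(uniform)
print(single_unit_outputs)
```

Because the single-unit output does not depend on the weights at all, the dummy run on model1 has nothing to optimise, whereas the constrained model has its kernel re-projected after every update.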

Answered By - vijayachandran mariappan

This Answer collected from stackoverflow and tested by PythonFixing community admins, is licensed under cc by-sa 2.5 , cc by-sa 3.0 and cc by-sa 4.0
