Issue
I am reviving this GitHub issue because I believe it is valid and needs to be explained. tf.keras has a Masking layer whose docs read:
For each timestep in the input tensor (dimension #1 in the tensor), if all values in the input tensor at that timestep are equal to mask_value, then the timestep will be masked (skipped) in all downstream layers (as long as they support masking).
If any downstream layer does not support masking yet receives such an input mask, an exception will be raised.
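You can inspect the mask the layer produces by calling its compute_mask method directly. A minimal sketch (the (1, 3, 2) input and its values here are made up for illustration):
import tensorflow as tf

# (batch, timesteps, features): timestep 0 is all zeros, so it gets masked.
x = tf.constant([[[0.0, 0.0], [0.5, 0.1], [0.2, 0.3]]])
layer = tf.keras.layers.Masking(mask_value=0.0)
print(layer.compute_mask(x))
# tf.Tensor([[False  True  True]], shape=(1, 3), dtype=bool)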
import numpy as np
import tensorflow as tf

# Create padded zeros and change two valid entries.
inputs = np.zeros([1, 5])
inputs[0, 1] = 0.5
inputs[0, 2] = 0.1
inputs = tf.Variable(inputs)

masked_inputs = tf.keras.layers.Masking(mask_value=0.0)(inputs)
with_masking = tf.keras.layers.Softmax()(masked_inputs)
without_masking = tf.keras.layers.Softmax()(inputs)
The two results are virtually identical:
with_masking
<tf.Tensor: shape=(1, 5), dtype=float32, numpy=
array([[0.1737954 , 0.28654018, 0.19207363, 0.1737954 , 0.1737954 ]],
dtype=float32)>
without_masking
<tf.Tensor: shape=(1, 5), dtype=float64, numpy=array([[0.1737954 , 0.28654017, 0.19207362, 0.1737954 , 0.1737954 ]])>
Expected behavior
I expected it to just take the softmax of the valid entries, similar to
# Softmax over only the two valid entries, no padding
inputs = np.zeros([1, 2])
inputs[0, 0] = 0.5
inputs[0, 1] = 0.1
inputs = tf.Variable(inputs)
without_masking = tf.keras.layers.Softmax()(inputs)
without_masking
<tf.Tensor: shape=(1, 2), dtype=float64, numpy=array([[0.59868766, 0.40131234]])>
padded with zeros at the correct positions:
with_masking
<tf.Tensor: shape=(1, 5), dtype=float32, numpy=
array([[0.        , 0.59868766, 0.40131234, 0.        , 0.        ]],
dtype=float32)>
To make a softmax ignore the 0s, could we swap them out for massively negative numbers?
Related: tensorflow - softmax ignore negative labels (just like caffe)
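A quick numeric check of that idea, as a plain NumPy sketch:
import numpy as np

logits = np.array([0.5, 0.1, -1e9])  # a hugely negative stand-in for a masked entry
p = np.exp(logits - logits.max())    # numerically stabilized softmax
p /= p.sum()
print(p)  # [0.59868766 0.40131234 0.        ]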
from tensorflow import __version__
__version__
'2.3.1'
Solution
I think this is already explained well in the GitHub issue you have linked. The underlying problem is that, regardless of whether an array is masked, softmax() still operates on the 0.0 values: since exp(0) = 1, every padded zero contributes to the denominator and receives a non-zero probability, exactly as the math dictates (link).
The only way to get a zero output from softmax() is to pass in a hugely negative float. If you set the masked positions to the most negative value representable in float64, Softmax() maps them to zero. That machine limit is available as tf.float64.min, which equals -1.7976931348623157e+308. More info about machine limits in this post.
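You can verify that value yourself (np.finfo reports the same limit):
import numpy as np
import tensorflow as tf

print(tf.float64.min)            # -1.7976931348623157e+308
print(np.finfo(np.float64).min)  # -1.7976931348623157e+308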
Here is an implementation for your reference, showing tf.boolean_mask, which returns only the unmasked elements, and then the correct method: using tf.where to build the masked input and pass it to softmax():
import numpy as np
import tensorflow as tf

inputs = np.zeros([1, 5])
inputs[0, 1] = 0.5
inputs[0, 2] = 0.1
inputs = tf.Variable(inputs)

# Returns only the elements that are not masked, shape (2,)
with_boolmask = tf.boolean_mask(inputs, inputs != 0)
with_boolmask = tf.keras.layers.Softmax()(with_boolmask)

# Correct way to do it: replace masked positions with the float64 minimum
masked_inp = tf.where(inputs != 0, inputs, tf.float64.min)  # <----
with_where = tf.keras.layers.Softmax()(masked_inp)
print('BOOLEAN MASK (NOT EXPECTED)')
print(with_boolmask)
print('')
print('MASKED INPUT - ')
print(masked_inp)
print('')
print('SOFTMAX OUTPUT')
print(with_where)
BOOLEAN MASK (NOT EXPECTED)
tf.Tensor([0.59868765 0.40131232], shape=(2,), dtype=float32)
MASKED INPUT -
tf.Tensor(
[[-1.79769313e+308 5.00000000e-001 1.00000000e-001 -1.79769313e+308
-1.79769313e+308]], shape=(1, 5), dtype=float64)
SOFTMAX OUTPUT
tf.Tensor([[0. 0.59868765 0.40131232 0. 0. ]], shape=(1, 5), dtype=float32)
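As a side note, recent TensorFlow releases make the tf.where step unnecessary: tf.keras.layers.Softmax accepts a boolean mask argument in its call and internally adds a large negative bias to the masked positions. A minimal sketch, assuming a version whose Softmax layer supports mask (it may not be available in 2.3.1):
import numpy as np
import tensorflow as tf

inputs = np.zeros([1, 5])
inputs[0, 1] = 0.5
inputs[0, 2] = 0.1
inputs = tf.constant(inputs)

mask = inputs != 0  # True = keep, False = mask out
print(tf.keras.layers.Softmax()(inputs, mask=mask))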
Answered By - Akshay Sehgal