Saturday, February 5, 2022

[FIXED] How to access all outputs from a single custom loss function in keras

February 05, 2022 keras, python, tensorflow No comments

Issue

I'm trying to reproduce the architecture of the network proposed in this publication in tensorFlow. Being a total beginner to this, I've been using this tutorial as a base to work on, using tensorflow==2.3.2.

To train this network, they use a loss which implies outputs from two branches of the network at the same time, which made me look towards custom losses function in keras. I've got that you can define your own, as long as the definition of the function looks like the following:

def custom_loss(y_true, y_pred):

I also understood that you could give other arguments like so:

def loss_function(margin=0.3):
    def custom_loss(y_true, y_pred):
        # And now you can use margin

You then just have to call these while compiling your model. When it comes to using multiple outputs, the most common approach seem to be the one proposed here, where you would give several losses functions, one being called for each of your output. However, I could not find a solution to give several outputs to a loss function, which is what I need here.

To further explain it, here is a minimal working example showing what I've tried, which you can try for yourself in this collab.

import os
import tensorflow as tf
import keras.backend as K
from tensorflow.keras import datasets, layers, models, applications, losses
from tensorflow.keras.preprocessing import image_dataset_from_directory

_URL = 'https://storage.googleapis.com/mledu-datasets/cats_and_dogs_filtered.zip'
path_to_zip = tf.keras.utils.get_file('cats_and_dogs.zip', origin=_URL, extract=True)
PATH = os.path.join(os.path.dirname(path_to_zip), 'cats_and_dogs_filtered')

train_dir = os.path.join(PATH, 'train')
validation_dir = os.path.join(PATH, 'validation')

BATCH_SIZE = 32
IMG_SIZE = (160, 160)
IMG_SHAPE = IMG_SIZE + (3,)

train_dataset = image_dataset_from_directory(train_dir,
                                             shuffle=True,
                                             batch_size=BATCH_SIZE,
                                             image_size=IMG_SIZE)

validation_dataset = image_dataset_from_directory(validation_dir,
                                                  shuffle=True,
                                                  batch_size=BATCH_SIZE,
                                                  image_size=IMG_SIZE)

data_augmentation = tf.keras.Sequential([
  layers.experimental.preprocessing.RandomFlip('horizontal'),
  layers.experimental.preprocessing.RandomRotation(0.2),
])
preprocess_input = applications.resnet50.preprocess_input
base_model = applications.ResNet50(input_shape=IMG_SHAPE,
                                               include_top=False,
                                               weights='imagenet')
base_model.trainable = True
conv = layers.Conv2D(filters=128, kernel_size=(1,1))
global_pooling = layers.GlobalAveragePooling2D()
horizontal_pooling = layers.AveragePooling2D(pool_size=(1, 5))
reshape = layers.Reshape((-1, 128))

def custom_loss(y_true, y_pred):
    print(y_pred.shape)
    # Do some stuffs involving both outputs
    # Returning something trivial here for correct behavior
    return K.mean(y_pred)

inputs = tf.keras.Input(shape=IMG_SHAPE)
x = data_augmentation(inputs)
x = preprocess_input(x)
x = base_model(x, training=True)

first_branch = global_pooling(x)

second_branch = conv(x)
second_branch = horizontal_pooling(second_branch)
second_branch = reshape(second_branch)

model = tf.keras.Model(inputs, [first_branch, second_branch])
base_learning_rate = 0.0001
model.compile(optimizer=tf.keras.optimizers.Adam(lr=base_learning_rate),
              loss=custom_loss,
              metrics=['accuracy'])
model.summary()

initial_epochs = 10
history = model.fit(train_dataset,
                    epochs=initial_epochs,
                    validation_data=validation_dataset)

while doing so, I thought that the y_pred given to loss function would be a list, containing both outputs. However, while running it, what I've got in stdout was this:

Epoch 1/10
(None, 2048)
(None, 5, 128)

What I understand from this is that the loss function is called with every output, one by one, instead of being called once with all the outputs, which means I can't define a loss that would use both the outputs at the same time. Is there any way to achieve this?

Please let me know if I'm unclear, or if you need further details.

Solution

I had the same problem trying to implement Triplet_Loss function.

I refered to Keras's implementation for Siamese Network with Triplet Loss Function but something didnt work out and I had to implement the network by myself.

def get_siamese_model(input_shape, conv2d_filters):
    # Define the tensors for the input images
    anchor_input = Input(input_shape, name="Anchor_Input")
    positive_input = Input(input_shape, name="Positive_Input")
    negative_input = Input(input_shape, name="Negative_Input")

    body = build_body(input_shape, conv2d_filters)
    # Generate the feature vectors for the images
    encoded_a = body(anchor_input)
    encoded_p = body(positive_input)
    encoded_n = body(negative_input)

    distance = DistanceLayer()(encoded_a, encoded_p, encoded_n)
    # Connect the inputs with the outputs
    siamese_net = Model(inputs=[anchor_input, positive_input, negative_input],
                        outputs=distance)
    return siamese_net

and the "bug" was in DistanceLayer Implementation Keras posted (also in the same link above).

class DistanceLayer(tf.keras.layers.Layer):
    """
    This layer is responsible for computing the distance between the anchor
    embedding and the positive embedding, and the anchor embedding and the
    negative embedding.
    """

    def __init__(self, **kwargs):
        super().__init__(**kwargs)

    def call(self, anchor, positive, negative):
        ap_distance = tf.math.reduce_sum(tf.math.square(anchor - positive), axis=1, keepdims=True, name='ap_distance')
        an_distance = tf.math.reduce_sum(tf.math.square(anchor - negative), axis=1, keepdims=True, name='an_distance')
        return (ap_distance, an_distance)

When I was training the model, the loss function took only one of the vectors ap_distance or an_distance.

FINALLY, THE FIX WAS to concatenate the vectors together (along axis=1 this case) and on the loss function, take them apart:

    def call(self, anchor, positive, negative):
        ap_distance = tf.math.reduce_sum(tf.math.square(anchor - positive), axis=1, keepdims=True, name='ap_distance')
        an_distance = tf.math.reduce_sum(tf.math.square(anchor - negative), axis=1, keepdims=True, name='an_distance')
        return tf.concat([ap_distance, an_distance], axis=1)

on my custom loss:

def get_loss(margin=1.0):
    def triplet_loss(y_true, y_pred):
        # The output of the network is NOT A tuple, but a matrix shape (batch_size, 2),
        # containing the distances between the anchor and the positive example,
        # and the anchor and the negative example.
        ap_distance = y_pred[:, 0]
        an_distance = y_pred[:, 1]

        # Computing the Triplet Loss by subtracting both distances and
        # making sure we don't get a negative value.
        loss = tf.math.maximum(ap_distance - an_distance + margin, 0.0)
        # tf.print("\n", ap_distance, an_distance)
        # tf.print(f"\n{loss}\n")
        return loss

    return triplet_loss

Answered By - Jhon Margalit

This Answer collected from stackoverflow and tested by PythonFixing community admins, is licensed under cc by-sa 2.5 , cc by-sa 3.0 and cc by-sa 4.0

Saturday, February 5, 2022

[FIXED] How to access all outputs from a single custom loss function in keras

Issue

Solution

0 comments:

Post a Comment

Popular Posts

Labels