Issue
I am using a CLIP model in which I have two models. One model's output has shape (20, 128, 256)
and the other's has shape (20, 256):
image_model_output = (20, 256)
text_model_output = (20, 128, 256)
I use the following to compute the logits:
logits = (tf.matmul(caption_embeddings, image_embeddings, transpose_b=True))
so it is effectively `(20, 128, 256) * (256, 20)`
and its output will be `(20, 128, 20)`
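For reference, here is a minimal shape check with dummy random tensors standing in for the real model outputs (the names and values below are placeholders, not the actual embeddings); in TF 2.x the 2-D matrix is broadcast across the batch dimension, which reproduces the shape above:

import tensorflow as tf

caption_embeddings = tf.random.normal((20, 128, 256))  # stands in for the text model output
image_embeddings = tf.random.normal((20, 256))          # stands in for the image model output

# transpose_b turns (20, 256) into (256, 20); the 2-D matrix is applied
# to every batch entry of the 3-D tensor.
logits = tf.matmul(caption_embeddings, image_embeddings, transpose_b=True)
print(logits.shape)  # (20, 128, 20)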
Similarly, I compute the image-to-image similarity:
images_similarity = tf.matmul(
    image_embeddings, image_embeddings, transpose_b=True
)
(Output)--> (20, 256) * (256, 20) = (20,20)
and the caption-to-caption similarity:
captions_similarity = tf.matmul(
    caption_embeddings, caption_embeddings, transpose_b=True
)
(Output)--> (20, 128, 256) * (20, 256, 128) = (20, 128, 128)
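With the same dummy tensors as above, the two similarity matrices come out with the shapes reported here, which makes the mismatch visible:

images_similarity = tf.matmul(image_embeddings, image_embeddings, transpose_b=True)
print(images_similarity.shape)    # (20, 20)

captions_similarity = tf.matmul(caption_embeddings, caption_embeddings, transpose_b=True)
print(captions_similarity.shape)  # (20, 128, 128)

# (20, 128, 128) and (20, 20) cannot be broadcast together, so the sum
# inside the softmax below fails.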
The problem arises here
targets = keras.activations.softmax(
    (captions_similarity + images_similarity) / (2 * self.temperature)
)
So do I need to change the activation function, or is there a way to add these tensors with different shapes? Sorry for explaining it in such technical terms, but people with a solid deep learning and machine learning background will understand.
NOTE: After adding an axis at position 1, like this: tf.expand_dims(image_embeddings, axis=1),
the part below runs successfully:
targets = keras.activations.softmax(
    (captions_similarity + images_similarity) / (2 * self.temperature)
)
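Here is a sketch of why the expand_dims workaround lets the softmax run, again using the dummy tensors from above and 1.0 as a stand-in for self.temperature:

from tensorflow import keras

image_embeddings = tf.expand_dims(image_embeddings, axis=1)  # (20, 1, 256)

images_similarity = tf.matmul(image_embeddings, image_embeddings, transpose_b=True)
print(images_similarity.shape)  # (20, 1, 1), which broadcasts against (20, 128, 128)

targets = keras.activations.softmax(
    (captions_similarity + images_similarity) / (2 * 1.0)
)
print(targets.shape)  # (20, 128, 128)

# The logits computed with the expanded image tensor, however, become
# (20, 128, 1), which is what later clashes with the (20, 128, 128) targets.
logits = tf.matmul(caption_embeddings, image_embeddings, transpose_b=True)
print(logits.shape)  # (20, 128, 1)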
However, after this there is a loss function like the one below,
captions_loss = keras.losses.categorical_crossentropy(
    y_true=targets, y_pred=logits, from_logits=True
)
which generates this error
ValueError: Shapes (2, 128, 128) and (2, 128, 1) are incompatible
Is it possible to solve this error?
Solution
To handle the above error I used a different loss function. I changed the code from:
captions_loss = keras.losses.categorical_crossentropy(
    y_true=targets, y_pred=logits, from_logits=True
)
to:
captions_loss = keras.losses.kl_divergence(
    y_true=targets, y_pred=logits
)
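As a rough check under the same assumptions as before (dummy tensors and a batch of 20 here, rather than the batch of 2 shown in the error message), kl_divergence accepts the mismatched last dimensions because it broadcasts y_true against y_pred and reduces over the last axis:

captions_loss = keras.losses.kl_divergence(y_true=targets, y_pred=logits)
print(captions_loss.shape)  # (20, 128)

One thing to keep in mind is that keras.losses.kl_divergence has no from_logits argument and expects probability distributions for both arguments, so whether passing raw logits as y_pred behaves as intended is worth checking in your own setup.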
To save other developers' time, I have answered my own question. I am available to discuss it further if anyone is interested.
Answered By - Jacob