Issue
I use the PyTorch cosine similarity function as follows. I have two feature vectors and my goal is to make them dissimilar to each other, so I thought I could minimize their cosine similarity. I have some doubts about the way I have coded it, and I would appreciate suggestions on the following questions.

1. I don't know why there are some negative values in val1.
2. I have done three steps to convert val1 to a scalar. Am I doing it in the right way? Is there any other way?
3. To minimize the similarity, I have used 1/val1. Is this a standard way to do it? Would it be correct to use 1 - val1 instead?

def loss_func(feat1, feat2):
    cosine_loss = torch.nn.CosineSimilarity(dim=1, eps=1e-6)
    val1 = cosine_loss(feat1, feat2).tolist()
    # 1. calculate the absolute value of each element,
    # 2. sum all values together,
    # 3. divide by the number of values
    val1 = 1/(sum(list(map(abs, val1)))/int(len(val1)))
    val1 = torch.tensor(val1, device='cuda', requires_grad=True)
    return val1
Solution
Do not convert your loss to a list. This breaks autograd, so you won't be able to optimize your model parameters using PyTorch.
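As a quick sketch of why this matters (the tensors here are made up for illustration): .tolist() returns plain Python floats with no connection to the autograd graph, so a loss rebuilt from them can never propagate gradients back to the features.

```python
import torch
import torch.nn.functional as F

feat = torch.randn(4, 8, requires_grad=True)
sim = F.cosine_similarity(feat, torch.randn(4, 8), dim=1)
print(sim.requires_grad)   # True: sim is still part of the autograd graph

vals = sim.tolist()        # plain Python floats; the graph is gone
loss = torch.tensor(sum(vals) / len(vals), requires_grad=True)
loss.backward()            # runs, but only w.r.t. the new leaf tensor
print(feat.grad is None)   # True: no gradient ever reaches feat
```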
A loss function is already something to be minimized. If you want to minimize the similarity, then you probably just want to return the average cosine similarity. If instead you want to minimize the magnitude of the similarity (i.e. encourage the features to be orthogonal), then you can return the average absolute value of the cosine similarity.
It seems like what you've implemented will attempt to maximize the similarity, which doesn't appear to be in line with what you've stated. Also, to turn a minimization problem into an equivalent maximization problem you would usually just negate the measure; there's nothing wrong with a negative loss value. Taking the reciprocal of a strictly positive measure does convert it from a minimization to a maximization problem, but it also changes the behavior of the measure and probably isn't what you want.
Depending on what you actually want, one of these is likely to meet your needs:
import torch.nn.functional as F

def loss_func(feat1, feat2):
    # minimize average magnitude of cosine similarity
    return F.cosine_similarity(feat1, feat2).abs().mean()

def loss_func(feat1, feat2):
    # minimize average cosine similarity
    return F.cosine_similarity(feat1, feat2).mean()

def loss_func(feat1, feat2):
    # maximize average magnitude of cosine similarity
    return -F.cosine_similarity(feat1, feat2).abs().mean()

def loss_func(feat1, feat2):
    # maximize average cosine similarity
    return -F.cosine_similarity(feat1, feat2).mean()
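A minimal usage sketch (the feature shapes here are assumptions for illustration) confirming that gradients flow when the loss stays a tensor end to end:

```python
import torch
import torch.nn.functional as F

def loss_func(feat1, feat2):
    # minimize average magnitude of cosine similarity
    return F.cosine_similarity(feat1, feat2).abs().mean()

feat1 = torch.randn(16, 128, requires_grad=True)
feat2 = torch.randn(16, 128)

loss = loss_func(feat1, feat2)
loss.backward()          # works: no .tolist() breaking the graph
print(feat1.grad.shape)  # torch.Size([16, 128])
```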
Answered By - jodag