Issue
Here is the function currently. It removes from the MSE any values where y_true is less than a threshold (here, 0.1).
def my_loss(y_true, y_pred):
    loss = tf.square(y_true - y_pred)
    # if any y_true is less than a threshold (say 0.1)
    # the element is removed from loss, and does not affect MSE
    loss = tf.where(y_true < 0.1)
    # return mean of losses
    return tf.reduce_mean(loss)
This one compiles, but the network doesn't ever learn to predict 0 well. Instead, I would like to eliminate only those values where both y_true and y_pred are less than some threshold. This is because it needs to first learn how to predict 0, before ignoring those points later on in the training.
This, however, does not compile.
def my_better_loss(y_true, y_pred):
    loss = tf.square(y_true - y_pred)
    # remove all elements where BOTH y_true & y_pred < threshold
    loss = tf.where(y_true < 0.1 and y_pred < 0.1)
    # return mean of losses
    return tf.reduce_mean(loss)
It leads to the following error.
(0) Invalid argument: The second input must be a scalar, but it has shape [25,60,60]
[[{{node replica_1/customMSE/cond/switch_pred/_51}}]]
(1) Invalid argument: The second input must be a scalar, but it has shape [25,60,60]
[[{{node replica_1/customMSE/cond/switch_pred/_51}}]]
[[customMSE/cond/Squeeze/_59]]
(2) Invalid argument: The second input must be a scalar, but it has shape [25,60,60]
[[{{node replica_1/customMSE/cond/replica_1/customMSE/Less/_55}}]]
0 successful operations.
0 derived errors ignored. [Op:__inference_train_function_4715]
Function call stack:
train_function -> train_function -> train_function
Edit:
To be more specific. Say our threshold is 0.5:
y_true = [0.3, 0.4, 0.6, 0.7]
y_pred = [0.2, 0.7, 0.5, 1]
Then the loss function would compute the MSE with the first element removed, since both y_true[0] and y_pred[0] are less than the threshold.
# MSE would be computed between
y_true = [0.4, 0.6, 0.7]
#and
y_pred = [0.7, 0.5, 1]
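For these values, the desired result can be worked out by hand. The snippet below is a plain-Python sketch of the intended masking (no TensorFlow needed), using the example's 0.5 threshold:

```python
# Hand check of the desired masked MSE, using the example values
# above with threshold = 0.5.
y_true = [0.3, 0.4, 0.6, 0.7]
y_pred = [0.2, 0.7, 0.5, 1.0]
threshold = 0.5

# keep an element unless BOTH y_true and y_pred fall below the threshold
kept = [(t, p) for t, p in zip(y_true, y_pred)
        if not (t < threshold and p < threshold)]

mse = sum((t - p) ** 2 for t, p in kept) / len(kept)
print(kept)  # [(0.4, 0.7), (0.6, 0.5), (0.7, 1.0)]
print(mse)   # about 0.0633
```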
Solution
Using the Python short-circuit and operator in code that is converted into graph mode usually results in undesirable behaviour or errors, because the short-circuit and operator cannot be overloaded. For an element-wise AND on tensors, use tf.math.logical_and.
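A minimal illustration of the difference (the tensor values here are made up for the example):

```python
import tensorflow as tf

a = tf.constant([True, True, False])
b = tf.constant([True, False, False])

# element-wise AND works on tensors of any shape
result = tf.math.logical_and(a, b)
print(result)  # tf.Tensor([ True False False], shape=(3,), dtype=bool)

# `a and b` would instead try to reduce `a` to a single Python bool,
# which fails (or mis-traces) for non-scalar tensors in graph mode
```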
Besides, tf.where is not necessary here and is likely to be slower; masking is preferred. Example code:
import tensorflow as tf

@tf.function
def better_loss(y_true, y_pred):
    loss = tf.square(y_true - y_pred)
    # ignore elements where BOTH y_true & y_pred < 0.1
    # (keep an element if either value is >= 0.1, by De Morgan's law)
    mask = tf.cast(tf.logical_or(y_true >= 0.1, y_pred >= 0.1), tf.float32)
    loss *= mask
    return tf.reduce_sum(loss) / tf.reduce_sum(mask)
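As a quick sanity check, the same masking can be applied to the values from the question's edit; the threshold is raised to 0.5 here purely to match those numbers:

```python
import tensorflow as tf

def masked_mse(y_true, y_pred, threshold=0.5):
    loss = tf.square(y_true - y_pred)
    # keep elements unless BOTH y_true and y_pred are below the threshold
    mask = tf.cast(tf.logical_or(y_true >= threshold, y_pred >= threshold),
                   tf.float32)
    return tf.reduce_sum(loss * mask) / tf.reduce_sum(mask)

y_true = tf.constant([0.3, 0.4, 0.6, 0.7])
y_pred = tf.constant([0.2, 0.7, 0.5, 1.0])
result = masked_mse(y_true, y_pred)
print(result.numpy())  # about 0.0633; the first element is dropped
```

Such a function can be passed directly to model.compile(loss=better_loss) like any other Keras custom loss.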
Answered By - Laplace Ricky