Issue
Below is my custom LR scheduler, which subclasses tensorflow.keras.optimizers.schedules.LearningRateSchedule. It fails with TypeError: Cannot convert -0.5 to EagerTensor of dtype int64. I'm really baffled as to why EagerTensor is relevant to a simple inverse square root calculation in the return statement of this custom class.
class lr_schedule(tensorflow.keras.optimizers.schedules.LearningRateSchedule):
    def __init__(self, dim_embed, warmup_steps):
        self.dim_embed = dim_embed
        self.warmup_steps = warmup_steps

    def __call__(self, step):
        return (self.dim_embed ** -0.5) * min((step ** -0.5), step * (self.warmup_steps ** -1.5))
Not specifically relevant to this error, but this is a custom LR scheduler that replicates the warmup schedule used in the 'Attention Is All You Need' paper.
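For reference, the formula in the class above can be sketched in plain Python, without TensorFlow, to see the shape of the schedule. The function name transformer_lr and the default values 512 and 4000 are illustrative assumptions, and clamping step to at least 1 is an added guard (not part of the original class) to avoid raising zero to a negative power.

```python
def transformer_lr(step, dim_embed=512, warmup_steps=4000):
    """Warmup schedule: linear ramp up, then inverse square root decay."""
    # Assumption: clamp to 1 so that step == 0 does not raise ZeroDivisionError.
    step = float(max(step, 1))
    # Same expression as the return statement in the class above.
    return dim_embed ** -0.5 * min(step ** -0.5, step * warmup_steps ** -1.5)


# The rate rises during warmup, peaks at warmup_steps, then decays.
for s in (1, 1000, 4000, 16000):
    print(s, transformer_lr(s))
```

The two arguments of min are equal exactly at step == warmup_steps, which is where the learning rate peaks.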
Solution
I ran across this just yesterday. It's a type coercion issue: the value of step being passed into __call__ is an int64 tensor, so TensorFlow tries to convert the other operand, -0.5, to int64 as well, and that conversion fails.
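The coercion can be reproduced outside the scheduler with a minimal sketch like the one below. It assumes TensorFlow 2.x in eager mode; the value 10 is an arbitrary stand-in for the optimizer's step counter.

```python
import tensorflow

# Stand-in for the int64 step counter the optimizer passes to __call__.
step = tensorflow.constant(10, dtype=tensorflow.int64)

try:
    # TensorFlow tries to turn -0.5 into an int64 tensor to match step.
    step ** -0.5
    raised = False
except (TypeError, ValueError) as err:
    raised = True
    print("failed:", err)

# Casting step to a float dtype first makes the same expression valid.
fixed = tensorflow.cast(step, dtype=tensorflow.float32) ** -0.5
print(float(fixed))
```

The cast has to happen before the exponentiation, because the dtype of the tensor operand decides how the Python float is converted.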
For your specific case, this should probably fix it:
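class lr_schedule(tensorflow.keras.optimizers.schedules.LearningRateSchedule):
    def __init__(self, dim_embed, warmup_steps):
        # Store the hyperparameters as float32 so the negative powers below are valid.
        self.dim_embed = tensorflow.cast(dim_embed, dtype=tensorflow.float32)
        self.warmup_steps = tensorflow.cast(warmup_steps, dtype=tensorflow.float32)

    def __call__(self, step):
        # step arrives as an int64 tensor; cast it before doing float math.
        step = tensorflow.cast(step, dtype=tensorflow.float32)
        # tensorflow.math.minimum replaces Python's min() so this also works in graph mode.
        return (self.dim_embed ** -0.5) * tensorflow.math.minimum(step ** -0.5, step * (self.warmup_steps ** -1.5))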
Answered By - Kintar