Issue
Two kinds of attention are used in seq2seq modules. They are introduced as multiplicative and additive attention in this TensorFlow documentation. What is the difference between them?
Solution
They are explained very well in a PyTorch seq2seq tutorial.
The main difference is how the similarity between the current decoder state s and the encoder outputs h is scored. Additive (Bahdanau) attention passes both through a small feed-forward layer, score(s, h) = v^T tanh(W1 s + W2 h), while multiplicative (Luong) attention uses a dot product, optionally with a learned weight matrix, score(s, h) = s^T W h. In both cases a softmax over the encoder positions then turns the scores into attention weights.
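For concreteness, below is a minimal PyTorch sketch of the two scoring functions. It is not taken from either tutorial; the class names AdditiveScore and MultiplicativeScore and the query/keys shapes are illustrative assumptions.

    import torch
    import torch.nn as nn

    # Assumed shapes (illustrative): query is the current decoder state,
    # (batch, hidden); keys are the encoder outputs, (batch, seq_len, hidden).

    class AdditiveScore(nn.Module):
        """Bahdanau-style: score = v^T tanh(W1 q + W2 k)."""
        def __init__(self, hidden):
            super().__init__()
            self.w1 = nn.Linear(hidden, hidden, bias=False)
            self.w2 = nn.Linear(hidden, hidden, bias=False)
            self.v = nn.Linear(hidden, 1, bias=False)

        def forward(self, query, keys):
            # Broadcast (batch, 1, hidden) against (batch, seq_len, hidden).
            s = torch.tanh(self.w1(query.unsqueeze(1)) + self.w2(keys))
            return self.v(s).squeeze(-1)  # (batch, seq_len)

    class MultiplicativeScore(nn.Module):
        """Luong-style "general" form: score = q^T W k (a plain dot product if W is dropped)."""
        def __init__(self, hidden):
            super().__init__()
            self.w = nn.Linear(hidden, hidden, bias=False)

        def forward(self, query, keys):
            # (batch, seq_len, hidden) x (batch, hidden, 1) -> (batch, seq_len, 1)
            return torch.bmm(self.w(keys), query.unsqueeze(-1)).squeeze(-1)

    q, k = torch.randn(2, 16), torch.randn(2, 5, 16)
    print(AdditiveScore(16)(q, k).shape)        # torch.Size([2, 5])
    print(MultiplicativeScore(16)(q, k).shape)  # torch.Size([2, 5])

Either set of scores is used the same way afterwards: a softmax over the seq_len dimension gives the attention weights for a weighted sum of the encoder outputs.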
Answered By - J-min