Issue
Hi everyone! I was reading about BERT and wanted to do text classification with its word embeddings. I came across this line of code:
pooled_output, sequence_output = self.bert_layer([input_word_ids, input_mask, segment_ids])
and then:
clf_output = sequence_output[:, 0, :]
out = Dense(1, activation='sigmoid')(clf_output)
But I can't understand the use of the pooled output. Doesn't the sequence output contain all the information, including the embedding of the [CLS] token? If so, why do we have the pooled output?
Thanks in advance!
Solution
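Suppose you are given the sequence "You are on StackOverflow". The sequence_output gives one embedding per token, so its shape is [batch_size, sequence_length, 768] (768 is the hidden size of BERT-base), and it covers the special [CLS] and [SEP] tokens as well as the four words. The pooled_output gives just one embedding of size 768 for the whole sequence, with shape [batch_size, 768]. It is not an average over the word embeddings: it is the final hidden state of [CLS], i.e. sequence_output[:, 0, :], passed through one more dense layer with a tanh activation, which was trained during pre-training on the next-sentence-prediction task. So sequence_output does contain the raw [CLS] vector, and pooled_output is a further transformed version of it. Either one can be fed to a classifier.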
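To make the shapes concrete, here is a minimal sketch of how the two outputs relate. It does not load BERT: the token list, the random "hidden states" and the pooler weights W and b are all stand-ins made up for illustration, and the batch dimension is left out. A real WordPiece tokenizer may also split "StackOverflow" into several pieces.

```python
import math
import random

random.seed(0)

HIDDEN = 768  # hidden size of BERT-base
# Hypothetical tokenization; [CLS] and [SEP] are added around the words.
tokens = ["[CLS]", "You", "are", "on", "StackOverflow", "[SEP]"]

# Stand-in for sequence_output of one sentence: one HIDDEN-sized vector per token.
sequence_output = [[random.gauss(0, 1) for _ in range(HIDDEN)] for _ in tokens]

# Stand-in pooler weights (random here; learned during pre-training in real BERT).
W = [[random.gauss(0, 0.02) for _ in range(HIDDEN)] for _ in range(HIDDEN)]
b = [0.0] * HIDDEN


def pooler(cls_vector):
    """Dense layer with tanh activation applied to the [CLS] vector."""
    return [
        math.tanh(sum(w * x for w, x in zip(row, cls_vector)) + bias)
        for row, bias in zip(W, b)
    ]


cls_vector = sequence_output[0]  # what sequence_output[:, 0, :] picks per example
pooled_output = pooler(cls_vector)

print(len(sequence_output), len(sequence_output[0]))  # 6 768
print(len(pooled_output))                             # 768
```

In the code from the question, clf_output = sequence_output[:, 0, :] corresponds to cls_vector here, so the classifier is trained on the raw [CLS] state rather than on pooled_output.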
Answered By - Abhishek Verma