Issue
My input size is [8, 22]: a batch of 8 tokenized sentences, each of length 22. I don't want to use the default classifier.
import torch.nn as nn
from transformers import RobertaForSequenceClassification

model = RobertaForSequenceClassification.from_pretrained("xlm-roberta-large")
model.classifier = nn.Identity()
After model(batch), the size of the result is torch.Size([8, 22, 1024]) and I have no idea why. Shouldn't it be [8, 1024]?
Solution
The model.classifier object you have replaced used to be an instance of RobertaClassificationHead. If you take a look at its source code [1], you will see that the head is hard-coded to index the first item along the sequence dimension of its input, i.e. the position of the <s> ([CLS]-equivalent) token.
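For reference, the relevant part of that head looks roughly like this (paraphrased from the transformers source; exact attribute names and dropout handling may differ between library versions):

import torch
import torch.nn as nn

class RobertaClassificationHead(nn.Module):
    """Head for sentence-level classification tasks."""

    def __init__(self, config):
        super().__init__()
        self.dense = nn.Linear(config.hidden_size, config.hidden_size)
        self.dropout = nn.Dropout(config.hidden_dropout_prob)
        self.out_proj = nn.Linear(config.hidden_size, config.num_labels)

    def forward(self, features, **kwargs):
        x = features[:, 0, :]  # take the <s> token (equivalent to [CLS])
        x = self.dropout(x)
        x = self.dense(x)
        x = torch.tanh(x)
        x = self.dropout(x)
        x = self.out_proj(x)
        return x

Note the features[:, 0, :] line: that is where the [8, 22, 1024] sequence output gets reduced to [8, 1024] before any linear layers are applied.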
By replacing it with an Identity you skip that indexing step, hence the extra sequence dimension in your output shape.
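If all you want is one 1024-dimensional vector per sentence, you can keep the Identity classifier and do the indexing yourself. A minimal sketch, assuming batch is your [8, 22] tensor of token ids from the question (pass an attention mask as well if your sentences are padded):

import torch
import torch.nn as nn
from transformers import RobertaForSequenceClassification

model = RobertaForSequenceClassification.from_pretrained("xlm-roberta-large")
model.classifier = nn.Identity()
model.eval()

# batch: LongTensor of shape [8, 22] with token ids, as in the question
with torch.no_grad():
    out = model(batch)                # out.logits: [8, 22, 1024] raw hidden states
cls_vectors = out.logits[:, 0, :]     # [8, 1024], the <s>/[CLS] position of each sentence

Alternatively, you can load the base encoder (e.g. XLMRobertaModel) and take last_hidden_state[:, 0] directly, which gives the same per-sentence vectors without going through the classification wrapper at all.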
Long story short: don't assume functionality you haven't verified when it comes to code you didn't write, huggingface in particular (lots of ad-hoc classes and spaghetti interfaces, at least as far as I'm concerned).
[1] RobertaClassificationHead source code in huggingface/transformers (src/transformers/models/roberta/modeling_roberta.py)
Answered By - KonstantinosKokos