Issue
From the Transformers library I use LongformerModel, LongformerTokenizerFast, and LongformerConfig (all of them loaded with from_pretrained("allenai/longformer-base-4096")).
When I call
longformer(input_ids, attention_mask=attention_mask, token_type_ids=token_type_ids)
I get the following error:
~/tfproject/tfenv/lib/python3.7/site-packages/transformers/modeling_longformer.py in forward(self, input_ids, token_type_ids, position_ids, inputs_embeds)
177 if inputs_embeds is None:
178 inputs_embeds = self.word_embeddings(input_ids)
--> 179 position_embeddings = self.position_embeddings(position_ids)
180 token_type_embeddings = self.token_type_embeddings(token_type_ids)
181
~/tfproject/tfenv/lib/python3.7/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
548 result = self._slow_forward(*input, **kwargs)
549 else:
--> 550 result = self.forward(*input, **kwargs)
551 for hook in self._forward_hooks.values():
552 hook_result = hook(self, input, result)
~/tfproject/tfenv/lib/python3.7/site-packages/torch/nn/modules/sparse.py in forward(self, input)
112 return F.embedding(
113 input, self.weight, self.padding_idx, self.max_norm,
--> 114 self.norm_type, self.scale_grad_by_freq, self.sparse)
115
116 def extra_repr(self):
~/tfproject/tfenv/lib/python3.7/site-packages/torch/nn/functional.py in embedding(input, weight, padding_idx, max_norm, norm_type, scale_grad_by_freq, sparse)
1722 # remove once script supports set_grad_enabled
1723 _no_grad_embedding_renorm_(weight, input, max_norm, norm_type)
-> 1724 return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
1725
1726
IndexError: index out of range in self
Online I found that this error can mean the input to the model contains more tokens than the model's maximum input size.
But I have checked: after padding, every input has exactly 4098 tokens, which is this model's maximum input length, and the tokenizer's vocabulary size matches the model's.
I have no idea what's wrong.
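For reference, a quick sanity check along these lines (a rough sketch; it assumes input_ids and token_type_ids are the batch tensors from the call above and config is the loaded LongformerConfig) can show which of the three embedding lookups is going out of range:

# Each embedding lookup fails with "index out of range in self" if any
# index is >= the size of the corresponding embedding table.
print("max token id:", input_ids.max().item(), "| vocab size:", config.vocab_size)
print("seq length:", input_ids.shape[1], "| max position embeddings:", config.max_position_embeddings)
print("max token type id:", token_type_ids.max().item(), "| type vocab size:", config.type_vocab_size)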
Solution
I have managed to fix this by reindexing my position_ids.
When PyTorch was creating that tensor, for some reason some values in position_ids exceeded the valid range (indices must be smaller than 4098, the size of the position embedding table).
I used:
# one row of position ids [0, 1, ..., max_position_embeddings - 1] per example in the batch
position_ids = torch.stack(
    [torch.arange(config.max_position_embeddings) for _ in range(val_dataloader.batch_size)]
).to(device)
to create position_ids for the entire batch and passed it to the model via the position_ids argument.
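Putting it together, a minimal end-to-end sketch of this workaround might look like the following (hedged: the example texts, the batch handling, and the explicit zero token_type_ids are my assumptions, not part of the original answer):

import torch
from transformers import LongformerConfig, LongformerModel, LongformerTokenizerFast

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

config = LongformerConfig.from_pretrained("allenai/longformer-base-4096")
tokenizer = LongformerTokenizerFast.from_pretrained("allenai/longformer-base-4096")
longformer = LongformerModel.from_pretrained("allenai/longformer-base-4096").to(device)

seq_len = config.max_position_embeddings  # 4098 for this checkpoint

# Tokenize a batch, padding every example to the full model length.
texts = ["first long document ...", "second long document ..."]
enc = tokenizer(texts, padding="max_length", truncation=True,
                max_length=seq_len, return_tensors="pt")
input_ids = enc["input_ids"].to(device)
attention_mask = enc["attention_mask"].to(device)
token_type_ids = torch.zeros_like(input_ids)  # Longformer uses a single segment

# Explicit position ids 0..seq_len-1 for every example, overriding the
# internally generated ones that contained out-of-range values.
position_ids = torch.stack(
    [torch.arange(seq_len) for _ in range(input_ids.shape[0])]
).to(device)

outputs = longformer(input_ids,
                     attention_mask=attention_mask,
                     token_type_ids=token_type_ids,
                     position_ids=position_ids)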
Bear in mind that this might not be the best solution; the problem may need more debugging. But as a quick fix, it works.
Answered By - Karol