Issue
I am studying NLP and trying to build a model for classifying sentences. I created a class that wraps a BERT model, but I get an error saying that the input should be of type Tensor, not tuple. I am using transformers version 4.21.2.
class BertClassificationModel(nn.Module):
    def __init__(self, bert_model_name, num_labels, dropout=0.1):
        super(BertClassificationModel, self).__init__()
        self.bert = BertForSequenceClassification.from_pretrained(bert_model_name, return_dict=False)
        self.dropout = nn.Dropout(dropout)
        self.classifier = nn.Linear(768, num_labels)
        self.num_labels = num_labels

    def forward(self, input_ids, attention_mask=None, token_type_ids=None):
        pooled_output = self.bert(input_ids, attention_mask=attention_mask, token_type_ids=token_type_ids)
        pooled_output = self.dropout(pooled_output)
        logits = self.classifier(pooled_output)
        return logits
TypeError: dropout(): argument 'input' (position 1) must be Tensor, not tuple
Solution
The issue you face is that the output of self.bert is not a tensor but a tuple:
from transformers import BertForSequenceClassification, BertTokenizer
bert_model_name = "bert-base-cased"
t = BertTokenizer.from_pretrained(bert_model_name)
m = BertForSequenceClassification.from_pretrained(bert_model_name, return_dict=False)
o = m(**t("test test", return_tensors="pt"))
print(type(o))
Output:
<class 'tuple'>
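For comparison, with the default return_dict=True the same call returns a SequenceClassifierOutput whose fields are accessed by name (a minimal sketch; m2 and o2 are just illustrative names, and the logits shape reflects the default num_labels=2):

m2 = BertForSequenceClassification.from_pretrained(bert_model_name)
o2 = m2(**t("test test", return_tensors="pt"))
print(type(o2).__name__)
print(o2.logits.shape)

Output:

SequenceClassifierOutput
torch.Size([1, 2])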
I personally do not recommend using return_dict=False, as it makes the code more difficult to read. But changing this parameter doesn't help in your case, because you want to use the pooler output, which BertForSequenceClassification consumes internally in its classification head and does not return (the outputs of BertForSequenceClassification are listed here).
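You can verify this by inspecting the fields each model actually returns (a quick check, reusing t and bert_model_name from above; seq_model, base_model, and enc are just illustrative names):

from transformers import BertModel

seq_model = BertForSequenceClassification.from_pretrained(bert_model_name)
base_model = BertModel.from_pretrained(bert_model_name)
enc = t("test test", return_tensors="pt")
print(seq_model(**enc).keys())   # no pooler_output here
print(base_model(**enc).keys())

Output:

odict_keys(['logits'])
odict_keys(['last_hidden_state', 'pooler_output'])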
As you already wrote in your own answer, you don't intend to use the classification head of BertForSequenceClassification, so you can load BertModel directly, instead of initializing BertForSequenceClassification and keeping only its BERT encoder as you did with BertForSequenceClassification.from_pretrained(bert_model_name, return_dict=True).bert:
from torch import nn
from transformers import BertModel, BertTokenizer

class BertClassificationModel(nn.Module):
    def __init__(self, bert_model_name, num_labels, dropout=0.1):
        super(BertClassificationModel, self).__init__()
        self.bert = BertModel.from_pretrained(bert_model_name)
        self.dropout = nn.Dropout(dropout)
        # 768 is the hidden size of the bert-base models
        self.classifier = nn.Linear(768, num_labels)
        self.num_labels = num_labels

    def forward(self, input_ids, attention_mask=None, token_type_ids=None):
        # pooler_output is the [CLS] representation after BERT's pooler layer
        pooled_output = self.bert(input_ids, attention_mask=attention_mask, token_type_ids=token_type_ids).pooler_output
        pooled_output = self.dropout(pooled_output)
        logits = self.classifier(pooled_output)
        return logits
m = BertClassificationModel("bert-base-cased", 4, 0.1)
o = m(**t("test test", return_tensors="pt"))
print(o.shape)
Output:
torch.Size([1, 4])
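Since forward returns raw logits, training is a plain classification loop with nn.CrossEntropyLoss (a minimal sketch; the labels tensor below is made up purely for illustration):

import torch

labels = torch.tensor([2])  # hypothetical gold label for the single example
loss = nn.CrossEntropyLoss()(o, labels)
loss.backward()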
Answered By - cronoik