Issue
I have a set of input sentences. I am using a pretrained word2vec model from gensim to get the embeddings of the input sentences. I want to pass these embeddings as input to a custom PyTorch LSTM model.
import torch
import torch.nn as nn
import torch.nn.functional as F

hidden_size = 32
num_layers = 1
num_classes = 2

class customModel(nn.Module):
    def __init__(self, input_size, hidden_size, num_layers, num_classes):
        super(customModel, self).__init__()
        self.hidden_size = hidden_size
        self.num_layers = num_layers
        self.bilstm = nn.LSTM(input_size, hidden_size, num_layers, batch_first=False, bidirectional=True)
        self.fcl = nn.Linear(hidden_size*2, num_classes)

    def forward(self, x):
        # Set initial hidden and cell states
        h0 = torch.zeros(self.num_layers*2, x.size(0), self.hidden_size).to(device)
        c0 = torch.zeros(self.num_layers*2, x.size(0), self.hidden_size).to(device)

        # Forward propagate LSTM
        out, hidden = self.bilstm(x, (h0, c0))

        fw_bilstm = out[-1, :, :self.hidden_size]
        bk_bilstm = out[0, :, :self.hidden_size]
        concat_fw_bw = torch.cat((fw_bilstm, bk_bilstm), dim=1)
        fc = self.fcl(concat_fw_bw)
        x = F.softmax(F.relu(fc))
        return x
Now I initialize the model object.
model = customModel(300, hidden_size, num_layers, num_classes)
Then I get the embeddings for the input sentences:
sentences = [['my', 'name', 'is', 'nad'], ['i', 'love', 'nlp', 'proc']]
embedding = create_embedding(sentences)
embedding_torch = torch.FloatTensor(embedding)
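For reference, here is a minimal sketch of what create_embedding might look like, assuming a 300-dimensional pretrained word2vec model loaded through gensim and a zero-vector fallback for out-of-vocabulary words (the loading call and the fallback are assumptions, not part of the original code):

import numpy as np
import gensim.downloader as api

# Assumed: the 300-dimensional Google News vectors; any 300-d KeyedVectors would work.
word_vectors = api.load("word2vec-google-news-300")

def create_embedding(sentences):
    # Returns an array of shape [num_sentences, sequence_length, 300],
    # assuming all sentences have the same length (as in the example above).
    return np.stack([
        np.stack([
            word_vectors[w] if w in word_vectors else np.zeros(300, dtype=np.float32)
            for w in sentence
        ])
        for sentence in sentences
    ])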
Now I want to pass these embeddings to the model to get the predictions:
for item in embedding_torch:
    item = item.view((1, item.size()[0], item.size()[1]))
    for epoch in range(1):
        tag_scores = model(item)
        print(tag_scores)
This throws a runtime error:
RuntimeError: Expected hidden[0] size (2, 4, 32), got (2, 1, 32)
I am not sure why this is happening. My understanding is that the line h0 = torch.zeros(self.num_layers*2, x.size(0), self.hidden_size).to(device) is calculating the hidden dimensions correctly.
What am I missing? Please suggest.
Solution
The backbone of your model is nn.LSTM, which with batch_first=False expects inputs of size [sequence_length, batch_size, embedding_size]. On the other hand, the inputs you are providing the model have size [1, sequence_length, embedding_size], so the LSTM reads them as a sequence of length 1 with a batch of 4 and expects a hidden state of size (2, 4, 32), while your h0 and c0 have size (2, 1, 32). What I would do is create the nn.LSTM as:
# With batch_first=True
self.bilstm = nn.LSTM(input_size, hidden_size, num_layers, batch_first=True, bidirectional=True)
That way, the model would expect the inputs to be of size [batch_size, sequence_length, embedding_size]. Then, instead of going through each element in the batch separately, do:
tag_scores = model(embedding_torch)
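As a quick sanity check on the shape convention (a minimal sketch, independent of the model above), a bare nn.LSTM with batch_first=True accepts a [2, 4, 300] batch directly:

import torch
import torch.nn as nn

emb = torch.randn(2, 4, 300)  # [batch_size=2, sequence_length=4, embedding_size=300]

lstm = nn.LSTM(300, 32, 1, batch_first=True, bidirectional=True)
h0 = torch.zeros(1 * 2, emb.size(0), 32)  # [num_layers * num_directions, batch_size, hidden_size]
c0 = torch.zeros(1 * 2, emb.size(0), 32)

out, _ = lstm(emb, (h0, c0))
print(out.shape)  # torch.Size([2, 4, 64]) -> [batch, seq, 2 * hidden]

Note that with batch_first=True the time dimension of out is the second one, so any slicing over time steps inside forward would need to index that dimension.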
Answered By - gorjan