Issue
I am trying to create a neural network and train my own Embeddings. The network has the following structure (PyTorch):
import torch.nn as nn

class MultiClassClassifer(nn.Module):
    #define all the layers used in the model
    def __init__(self, vocab_size, embedding_dim, hidden_dim, output_dim):
        #Constructor
        super(MultiClassClassifer, self).__init__()
        #embedding layer
        self.embedding = nn.Embedding(vocab_size, embedding_dim)
        #dense layer
        self.hiddenLayer = nn.Linear(embedding_dim, hidden_dim)
        #batch normalization layer
        self.batchnorm = nn.BatchNorm1d(hidden_dim)
        #output layer
        self.output = nn.Linear(hidden_dim, output_dim)
        #activation layer
        self.act = nn.Softmax(dim=1) #2d-tensor
        #initialize weights of embedding layer
        self.init_weights()

    def init_weights(self):
        initrange = 1.0
        self.embedding.weight.data.uniform_(-initrange, initrange)

    def forward(self, text):
        embedded = self.embedding(text)
        hidden_1 = self.batchnorm(self.hiddenLayer(embedded))
        return self.act(self.output(hidden_1))
A batch drawn from my train_iterator looks like this:
batch = next(iter(train_iterator))
batch.text_normalized_tweet[0]
tensor([[ 240, 538, 305, 73, 9, 780, 2038, 13, 48, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1],
[ 853, 57, 2, 70, 1875, 176, 466, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1],
...])
with shape: torch.Size([32, 25])
Here 32 is the batch_size I used when creating the training iterator with data.BucketIterator, and 25 is the (padded) sequence length in this batch.
When I create a model instance:
INPUT_DIM = len(TEXT.vocab) #~5,000 tokens
EMBEDDING_DIM = 100
HIDDEN_DIM = 64
OUTPUT_DIM = 3 #target has 3 classes
model = MultiClassClassifer(INPUT_DIM, EMBEDDING_DIM, HIDDEN_DIM, OUTPUT_DIM)
and execute
model(batch.text_normalized_tweet[0]).squeeze(1)
I get the following RuntimeError:
RuntimeError: running_mean should contain 15 elements not 64
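For reference, a minimal trace with a random batch of the same shape (just a sketch using the dimensions above; the 5,000 vocab size and random token ids are placeholders) reproduces the same kind of mismatch, since BatchNorm1d reads its channel count from dimension 1 of a 3-D input:

import torch
import torch.nn as nn

e = nn.Embedding(5000, 100)              # vocab_size, EMBEDDING_DIM
l = nn.Linear(100, 64)                   # EMBEDDING_DIM -> HIDDEN_DIM
b = nn.BatchNorm1d(64)                   # expects 64 features in dimension 1

text = torch.randint(0, 5000, (32, 25))  # [batch_size, seq_len]
y = l(e(text))                           # [32, 25, 64] - still 3-D
b(y)                                     # RuntimeError: running_mean should contain 25 elements not 64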
You may also find my Colab notebook here.
Solution
I found a workaround based on the example given by @jhso (above).
INPUT_DIM = len(TEXT.vocab) #~5,000 tokens
EMBEDDING_DIM = 100
HIDDEN_DIM = 64

e = nn.Embedding(INPUT_DIM, EMBEDDING_DIM)
l = nn.Linear(EMBEDDING_DIM, HIDDEN_DIM)
b = nn.BatchNorm1d(HIDDEN_DIM)
soft = nn.Softmax(dim=1)
out = nn.Linear(HIDDEN_DIM, 3)

text, text_lengths = batch.text_normalized_tweet
y = e(text)
#added rnn.pack_padded_sequence; .data is the flattened 2-D tensor, .batch_sizes the per-step counts
packed = nn.utils.rnn.pack_padded_sequence(y, text_lengths, batch_first=True)
tensor, batch_sizes = packed.data, packed.batch_sizes
y = b(l(tensor))
I added the pack_padded_sequence() function from the torch.nn.utils.rnn package, which takes the embeddings as input. I also had to unpack both text and text_lengths, since the way I created the training iterator it returns two outputs (text, text_lengths).
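For completeness, here is a sketch of how the workaround could be folded back into the model's forward() (an assumption on my part: forward() is changed to receive the lengths as a second argument, and pack_padded_sequence replaces the plain embedding pass):

from torch.nn.utils.rnn import pack_padded_sequence

def forward(self, text, text_lengths):
    embedded = self.embedding(text)                            # [batch, seq_len, emb_dim]
    # pack to drop the padding; add enforce_sorted=False if the batch is not length-sorted
    packed = pack_padded_sequence(embedded, text_lengths, batch_first=True)
    hidden_1 = self.batchnorm(self.hiddenLayer(packed.data))   # packed.data is 2-D
    return self.act(self.output(hidden_1))

#usage:
#text, text_lengths = batch.text_normalized_tweet
#model(text, text_lengths)

Note that packed.data stacks all non-padding tokens into a single 2-D tensor, so the batch norm, output layer and softmax here produce one row per token rather than per tweet.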
Answered By - NikSp