Issue
I'm trying to build a SIMON model based on this example (link). Below is how I'm building the two models: (1) a sentence encoder (sentModel), which feeds into (2) a document encoder (docModel).
When I try to fit, I'm getting the following error.
Input tensor must be of rank 3, 4 or 5 but was 2.
My input is of shape (3003, 30, 28), i.e. (samples, max sentence length, one-hot encoded characters).
import numpy as np
from keras.layers import Input, Conv1D, LSTM, Bidirectional, TimeDistributed, Dense
from keras.models import Model

maxLength = 30
max_cells = 3003
charMap = {'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5, 'f': 6,
           'g': 7, 'h': 8, 'i': 9, 'j': 10, 'k': 11,
           'l': 12, 'm': 13, 'n': 14, 'o': 15, 'p': 16,
           'q': 17, 'r': 18, 's': 19, 't': 20, 'u': 21,
           'v': 22, 'w': 23, 'x': 24, 'y': 25, 'z': 26,
           ' ': 27}
maxChars = len(charMap) + 1

x_train = np.zeros((max_cells, maxLength, maxChars), dtype='int32')
y_train = np.zeros((max_cells, 3), dtype='int32')

def buildSentModel(sentModelInput):
    layer = Conv1D(10,
                   5,
                   padding='valid',
                   activation='relu',
                   strides=1)(sentModelInput)
    layer = Bidirectional(LSTM(units=12, return_sequences=False))(layer)
    return Model(inputs=sentModelInput, outputs=layer)

def buildDocModel(sentModel, docInput):
    layer = TimeDistributed(sentModel)(docInput)
    #layer = Flatten()(layer)
    layer = Dense(3, activation='sigmoid')(layer)
    return Model(inputs=docInput, outputs=layer)

sentModelInput = Input(shape=(30, 28), dtype='float32')
sentModel = buildSentModel(sentModelInput)
docModel = buildDocModel(sentModel, sentModelInput)
docModel.compile(optimizer="adam", loss='categorical_crossentropy', metrics=['categorical_accuracy'])
docModel.fit(x_train, y_train, steps_per_epoch=20, epochs=100, shuffle=True)
Here is the whole error:
  File "C:\temp\Simon\TempSimonNames.py", line 107, in <module>
    model = buildDocModel(sentModel, sentModelInput);
  File "C:\temp\Simon\TempSimonNames.py", line 94, in buildDocModel
    layer = TimeDistributed(sentModel)(docInput)
  File "C:\ProgramData\Anaconda3\lib\site-packages\keras\backend\tensorflow_backend.py", line 75, in symbolic_fn_wrapper
    return func(*args, **kwargs)
  File "C:\ProgramData\Anaconda3\lib\site-packages\keras\engine\base_layer.py", line 489, in __call__
    output = self.call(inputs, **kwargs)
  File "C:\ProgramData\Anaconda3\lib\site-packages\keras\layers\wrappers.py", line 250, in call
    y = self.layer.call(inputs, **kwargs)
  File "C:\ProgramData\Anaconda3\lib\site-packages\keras\engine\network.py", line 583, in call
    output_tensors, _, _ = self.run_internal_graph(inputs, masks)
  File "C:\ProgramData\Anaconda3\lib\site-packages\keras\engine\network.py", line 740, in run_internal_graph
    layer.call(computed_tensor, **kwargs))
  File "C:\ProgramData\Anaconda3\lib\site-packages\keras\layers\convolutional.py", line 163, in call
    dilation_rate=self.dilation_rate[0])
  File "C:\ProgramData\Anaconda3\lib\site-packages\keras\backend\tensorflow_backend.py", line 3671, in conv1d
    **kwargs)
  File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow_core\python\ops\nn_ops.py", line 917, in convolution_v2
    name=name)
  File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow_core\python\ops\nn_ops.py", line 969, in convolution_internal
    "Input tensor must be of rank 3, 4 or 5 but was {}.".format(n + 2))
ValueError: Input tensor must be of rank 3, 4 or 5 but was 2.
Would appreciate any help. Thank you!
Solution
You seem to have mixed up some of your variables and shapes.
According to your code, sentModel is a model that receives one sentence of up to 30 characters, hence the input shape of (30, 28). (As an aside, 30 characters seems rather short for a sentence.)
docModel is then supposed to apply sentModel to each sentence of a document, so the input shape of a document should be (num_sentences_per_doc, 30, 28).
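As a rough sketch of what that shape means for the data in the question: the (3003, 30, 28) array is a flat list of sentences, so it would have to be grouped into documents before docModel can consume it. The snippet below assumes, purely for illustration, that every document has exactly 7 sentences (3003 = 429 * 7) and that consecutive rows belong to the same document. Neither assumption comes from the question.

```python
import numpy as np

# Hypothetical value: the question does not say how many sentences form a document.
num_sents_per_doc = 7

# Flat array of one-hot encoded sentences, shaped as in the question.
x_sentences = np.zeros((3003, 30, 28), dtype='float32')

# Group consecutive sentences into documents; -1 lets NumPy infer the document count.
x_docs = x_sentences.reshape(-1, num_sents_per_doc, 30, 28)

print(x_docs.shape)  # (429, 7, 30, 28)
```

With a grouping like this, the labels would presumably also need one row per document (429 here) rather than one row per sentence.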
Now to the problem: you pass the same input layer to both models. That cannot work, because the two models need different input layers. The shape of sentModelInput (the only input layer you defined) matches what sentModel expects, but it has one dimension too few for docModel, which needs a shape of (num_sentences_per_doc, 30, 28) as described above. TimeDistributed treats the first axis after the batch axis as the time axis and applies the wrapped model to each slice along it, so with a (30, 28) input each slice handed to Conv1D is only of rank 2, which is exactly what the error reports. The fix is the following:
- Use a separate input layer for each model.
- Make sure that each input layer actually has the correct shape.
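To see where the rank-2 tensor in the error message comes from, here is a small shape-only sketch in plain Python (no Keras involved). The helper names are made up for illustration. The sketch only mimics two rules: TimeDistributed removes the axis right after the batch axis before calling the wrapped layer, and a 1D convolution expects an input of the form (batch, steps, channels).

```python
def time_distributed_slice_shape(input_shape):
    """Shape of one slice that TimeDistributed hands to the wrapped layer.

    input_shape includes the batch axis, e.g. (None, 30, 28).
    The axis right after the batch axis is treated as the time axis and removed.
    """
    batch, _time, *rest = input_shape
    return (batch, *rest)


def check_conv1d_input(shape):
    """Mimic the rank check of a 1D convolution: (batch, steps, channels)."""
    if len(shape) != 3:
        raise ValueError(f"Conv1D expects a rank-3 input, got rank {len(shape)}: {shape}")
    return shape


# Original code: docModel reuses the sentence input, shape (batch, 30, 28).
wrong = time_distributed_slice_shape((None, 30, 28))
print(wrong)  # (None, 28)

# Fixed code: docModel has its own input with a sentence axis (7 is a made-up value).
right = time_distributed_slice_shape((None, 7, 30, 28))
print(right)  # (None, 30, 28)
check_conv1d_input(right)
```

Calling check_conv1d_input(wrong) raises a ValueError, which mirrors the failure in the traceback.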
The altered code below should give you an idea of what you need to do. If you want to use the Flatten approach that is currently commented out, your documents need to have a fixed number of sentences, which I will call num_sents_per_doc. If you want to build a model that can handle documents of variable length, set the input shape of docInput to (None, 30, 28) and use a network structure that can handle variable-length input.
def buildSentModel():
    # Encodes one sentence: (30 characters, 28 one-hot values) -> one vector.
    sentModelInput = Input(shape=(30, 28), dtype='float32')
    layer = Conv1D(10,
                   5,
                   padding='valid',
                   activation='relu',
                   strides=1)(sentModelInput)
    layer = Bidirectional(LSTM(units=12, return_sequences=False))(layer)
    return Model(inputs=sentModelInput, outputs=layer)

def buildDocModel(sentModel):
    # Separate input layer with an extra axis for the sentences of a document.
    # num_sents_per_doc must be defined beforehand to match your data.
    docInput = Input(shape=(num_sents_per_doc, 30, 28), dtype='float32')
    layer = TimeDistributed(sentModel)(docInput)
    #layer = Flatten()(layer)
    layer = Dense(3, activation='sigmoid')(layer)
    return Model(inputs=docInput, outputs=layer)

sentModel = buildSentModel()
docModel = buildDocModel(sentModel)
docModel.compile(optimizer="adam", loss='categorical_crossentropy', metrics=['categorical_accuracy'])
# x_train must now have shape (num_docs, num_sents_per_doc, 30, 28).
docModel.fit(x_train, y_train, steps_per_epoch=20, epochs=100, shuffle=True)
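The shapes produced by the code above can be traced by hand, which also shows why the commented-out Flatten matters. What follows is a shape-only sketch in plain Python, not Keras. The function name and the value 7 for num_sents_per_doc are assumptions made for illustration. It relies on two standard rules: a Conv1D with 'valid' padding, kernel size k and stride 1 yields steps - k + 1 positions, and a bidirectional LSTM with return_sequences=False concatenates two vectors of size units.

```python
def trace_shapes(num_sents, use_flatten, steps=30, chars=28,
                 filters=10, kernel_size=5, lstm_units=12, num_classes=3):
    """Return (layer, output shape) pairs for one document, batch axis omitted."""
    conv_steps = steps - kernel_size + 1   # 'valid' padding, stride 1
    sent_vec = 2 * lstm_units              # forward + backward LSTM states
    shapes = [
        ("docInput", (num_sents, steps, chars)),
        ("Conv1D (per sentence)", (num_sents, conv_steps, filters)),
        ("Bidirectional LSTM (per sentence)", (num_sents, sent_vec)),
    ]
    if use_flatten:
        if num_sents is None:
            raise ValueError("Flatten needs a fixed number of sentences per document")
        shapes.append(("Flatten", (num_sents * sent_vec,)))
        shapes.append(("Dense", (num_classes,)))
    else:
        # Dense only transforms the last axis, so the sentence axis survives.
        shapes.append(("Dense", (num_sents, num_classes)))
    return shapes


for use_flatten in (False, True):
    print("with Flatten" if use_flatten else "without Flatten")
    for layer, shape in trace_shapes(7, use_flatten):
        print(f"  {layer}: {shape}")
```

Without Flatten the final shape is (7, 3) per document, which would not match labels that hold one three-element vector per document. With Flatten it is (3,), at the cost of requiring a fixed num_sents_per_doc.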
Answered By - Marc Felix