Issue
I am trying to implement a bidirectional LSTM for a sequence-to-sequence model. My sequences are already one-hot encoded with 12 features in total. The input has 11 time steps and the output has 23. I first wrote this unidirectional version, which works; the first LSTM is the encoder and the second is the decoder:
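# imports assumed to come from tensorflow.keras (the question does not show them)
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, RepeatVector, TimeDistributed, Dense
model = Sequential()
model.add(LSTM(75, input_shape=(11, 12)))
model.add(RepeatVector(23))
model.add(LSTM(50, return_sequences=True))
model.add(TimeDistributed(Dense(12, activation='softmax')))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.summary()
# generate_data, taskset and trainset are my own helper and data (not shown here)
X, y = generate_data(1, taskset, trainset)
model.fit(X, y, epochs=1, batch_size=32, verbose=1)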
I then tried to turn this into a bidirectional LSTM as follows:
model = Sequential()
model.add(Bidirectional(LSTM(75, return_sequences=True), input_shape=(11,12), merge_mode='concat'))
model.add(Bidirectional(LSTM(50, return_sequences=True)))
model.add(TimeDistributed(Dense(12, activation='softmax')))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.summary()
X, y = generate_data(1, taskset, trainset)
model.fit(X, y, epochs=1, batch_size=32, verbose=1)
The goal is to use the first bidirectional LSTM as the encoder and the second bidirectional LSTM as the decoder. I removed the RepeatVector in the bidirectional implementation because it gave me a dimension error (needed dim=2, received dim=3). With the current bidirectional LSTM I am getting this error:
ValueError: Shapes (None, 23, 12) and (None, 11, 12) are incompatible
Any help with fixing the bidirectional LSTM implementation?
Solution
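Set return_sequences=False in your first bidirectional LSTM and add RepeatVector(23) after it, as in your original model. The encoder then returns a single vector per sample, which is the 2D input RepeatVector expects, and repeating it 23 times gives the decoder the 23 time steps that match y. This works fine: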
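import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Bidirectional, RepeatVector, TimeDistributed, Dense
# dummy data, only to check that the shapes line up (y is random 0/1, not real one-hot targets)
n_sample = 10
X = np.random.uniform(0, 1, (n_sample, 11, 12))
y = np.random.randint(0, 2, (n_sample, 23, 12))
model = Sequential()
# encoder: return_sequences defaults to False, so the output is (None, 150)
model.add(Bidirectional(LSTM(75), input_shape=(11, 12), merge_mode='concat'))
# repeat the encoded vector once per output step: (None, 23, 150)
model.add(RepeatVector(23))
# decoder: one output per step, forward and backward concatenated: (None, 23, 100)
model.add(Bidirectional(LSTM(50, return_sequences=True)))
model.add(TimeDistributed(Dense(12, activation='softmax')))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X, y, epochs=3, batch_size=32, verbose=1)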
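To see why the version in the question fails and this one does not, it can help to follow the shapes layer by layer. Below is a rough plain-Python sketch of that bookkeeping. It does not use Keras at all: trace_shapes and the tuple notation for layers are hypothetical helpers made up for this illustration, and the sketch assumes merge_mode='concat' (so a bidirectional LSTM outputs 2 * units features) and leaves out the batch axis.

```python
def trace_shapes(input_shape, layers):
    """Return the per-sample output shape after each layer (batch axis omitted).

    Hypothetical helper for illustration only, not part of Keras.
    Layers are tuples: ("bilstm", units, return_sequences),
    ("repeat", n) or ("dense", units).
    """
    shape = tuple(input_shape)
    shapes = []
    for layer in layers:
        kind = layer[0]
        if kind == "bilstm":
            _, units, return_sequences = layer
            if len(shape) != 2:
                raise ValueError("an LSTM needs (steps, features), got %r" % (shape,))
            steps = shape[0]
            # merge_mode='concat': forward and backward outputs are joined
            shape = (steps, 2 * units) if return_sequences else (2 * units,)
        elif kind == "repeat":
            # RepeatVector needs one vector per sample, i.e. no time axis
            if len(shape) != 1:
                raise ValueError("RepeatVector needs a single vector, got %r" % (shape,))
            shape = (layer[1], shape[0])
        elif kind == "dense":
            # TimeDistributed(Dense) only changes the last axis
            shape = shape[:-1] + (layer[1],)
        else:
            raise ValueError("unknown layer kind %r" % (kind,))
        shapes.append(shape)
    return shapes


# Model from the question: the time axis stays at 11 all the way through.
question_model = [("bilstm", 75, True), ("bilstm", 50, True), ("dense", 12)]
# Model from the answer: the encoder drops the time axis, RepeatVector sets it to 23.
answer_model = [("bilstm", 75, False), ("repeat", 23), ("bilstm", 50, True), ("dense", 12)]

question_shapes = trace_shapes((11, 12), question_model)
answer_shapes = trace_shapes((11, 12), answer_model)

for name, shapes in (("question", question_shapes), ("answer", answer_shapes)):
    print(name, "->", shapes)
```

Under these assumptions the question's model ends at (11, 12) per sample while the targets are (23, 12), which is the mismatch reported in the ValueError. Placing a "repeat" entry directly after an encoder with return_sequences=True raises an error in the sketch too, mirroring the dimension error seen when RepeatVector was fed 3D input.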
Answered By - Marco Cerliani