Issue
I am a beginner in RNNs and would like to build a running gated recurrent unit (GRU) model for stock prediction.
I have a numpy array for the training data with this shape:
train_x.shape
(1122,20,320)
`1122` is the total number of timestamps I have, `20` is the number of timestamps I want to predict the future from, and `320` is the number of features (different stocks).
My train_y.shape is (1122,) and represents a binary variable with 1 and 0. 1 is a buy, 0 is a sell.
With that in mind, I started to attempt my GRU model as:
def GRU_model(train_x,train_y,test_x,test_y):
    model = Sequential()
    model.add(layers.Embedding(train_x.shape[0],50,input_length=320))
    model.add(layers.GRU(50, return_sequences=True,input_shape=(train_x.shape[1],1),activation='tanh'))
    model.add(layers.GRU(50, return_sequences=True,input_shape=(train_x.shape[1],1),activation='tanh'))
    model.add(layers.GRU(50, return_sequences=True,input_shape=(train_x.shape[1],1),activation='tanh'))
    model.add(layers.GRU(50,activation='tanh'))
    model.add(Dense(units=2))
    model.compile(optimizer=SGD(lr=0.01,decay=1e-7,momentum=0.9,nesterov=False),loss='mean_squared_error')
    model.fit(train_x,train_y,epochs=EPOCHS,batch_size=BATCH_SIZE)
    GRU_predict = model.predict(validation_x)
    return model,GRU_predict

my_gru_model,my_gru_predict = GRU_model(train_x,train_y,validation_x,validation_y)
ValueError: Input 0 of layer gru_42 is incompatible with the layer: expected ndim=3, found ndim=4. Full shape received: (None, 20, 320, 50)
Clearly my input dimensions into the model are incorrect, but I do not understand how they should fit in, so the model can run smoothly.
Solution
If you have 1122 data samples, each with 20 time steps and 320 features per time step, and you want to teach your model to make a binary decision between buying and selling, try something like this:
import tensorflow as tf
tf.random.set_seed(1)
model = tf.keras.Sequential()
model.add(tf.keras.layers.GRU(50, return_sequences=True, input_shape=(20, 320), activation='tanh'))
model.add(tf.keras.layers.GRU(50,activation='tanh'))
model.add(tf.keras.layers.Dense(units=1, activation='sigmoid'))
model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.01,decay=1e-7,momentum=0.9,nesterov=False),loss='binary_crossentropy')
print(model.summary())
train_x = tf.random.normal((1122, 20, 320))
train_y = tf.random.uniform((1122,), maxval=2, dtype=tf.int32)
model.fit(train_x, train_y, epochs=5, batch_size=16)
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #
=================================================================
 gru (GRU)                   (None, 20, 50)            55800
 gru_1 (GRU)                 (None, 50)                15300
 dense (Dense)               (None, 1)                 51
=================================================================
Total params: 71,151
Trainable params: 71,151
Non-trainable params: 0
_________________________________________________________________
None
Epoch 1/5
71/71 [==============================] - 5s 21ms/step - loss: 0.7050
Epoch 2/5
71/71 [==============================] - 2s 22ms/step - loss: 0.6473
Epoch 3/5
71/71 [==============================] - 1s 21ms/step - loss: 0.5513
Epoch 4/5
71/71 [==============================] - 1s 21ms/step - loss: 0.3640
Epoch 5/5
71/71 [==============================] - 1s 20ms/step - loss: 0.1258
<keras.callbacks.History at 0x7f4eac87e610>
Note that you have a single output node because your model is supposed to make a binary decision. This is also why you have to use the binary_crossentropy loss function.
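To illustrate how a single sigmoid output maps back to the buy/sell labels, here is a minimal sketch. The probability values and the 0.5 cut-off are assumptions made up for this example; in practice the probabilities would come from something like model.predict(...), which returns an array of shape (n_samples, 1).

```python
import numpy as np

# Hypothetical sigmoid outputs, shaped (n_samples, 1) like a Keras
# model with Dense(units=1, activation='sigmoid') would return.
probs = np.array([[0.91], [0.12], [0.50], [0.47]])

# Assumed threshold of 0.5: 1 = buy, 0 = sell.
labels = (probs.ravel() >= 0.5).astype(int)
print(labels)  # [1 0 1 0]
```

The threshold is a design choice; 0.5 is only the conventional default and can be tuned on validation data.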
Regarding the GRU layer, it expects an input with the shape (batch_size, timesteps, features), but the batch_size is inferred during training and is therefore omitted from input_shape. Since the next GRU also requires this shape, you use the parameter return_sequences=True in the first GRU, which returns a sequence with the shape (batch_size, 20, 50): one hidden state of size 50 for each of the 20 input time steps. Also, you do not need an Embedding layer in your case. It is usually used to map integer sequences representing text into n-dimensional vector representations.
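The shapes described above can be checked directly by calling the layers on a dummy tensor. This is only a sketch: the batch size of 4 and the random or zero inputs are assumptions chosen to keep the example small.

```python
import tensorflow as tf

# Dummy batch: 4 samples, 20 time steps, 320 features (random values).
x = tf.random.normal((4, 20, 320))

# return_sequences=True keeps one 50-dimensional hidden state per time step.
seq_out = tf.keras.layers.GRU(50, return_sequences=True)(x)
# The default (return_sequences=False) keeps only the last hidden state.
last_out = tf.keras.layers.GRU(50)(seq_out)

print(seq_out.shape)   # (4, 20, 50)
print(last_out.shape)  # (4, 50)

# An Embedding layer appends a new axis to its input, which is why the GRU
# in the question received a 4-D tensor instead of the 3-D one it expects.
ids = tf.zeros((4, 20, 320), dtype=tf.int32)
emb_out = tf.keras.layers.Embedding(input_dim=1122, output_dim=50)(ids)
print(emb_out.shape)   # (4, 20, 320, 50)
```

The last shape matches the (None, 20, 320, 50) reported in the ValueError, with None standing in for the batch size.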
Answered By - AloneTogether