Issue
I have a TensorFlow model that predicts time-series values with an LSTM. It trains fine, but when I ask it to predict some values in time it only gives me the T+1 value.
How can I make it give me values from T+1 to T+n instead of just T+1?
I thought about feeding the predicted value back into the model in a loop, e.g.
We look back at 20 samples for this example
T±0 = now value
T-k = value k steps into the past (known)
T+n = value n steps into the future (unknown at the start)
--- Algorithm
T+1 = model.predict(data from T-20 to T±0)
T+2 = model.predict(data from T-19 to T+1) #using the previously found T+1 value
T+3 = model.predict(data from T-18 to T+2) #using the previously found T+1 and T+2 values
.
.
.
T+n = model.predict(data from T-(20-(n-1)) to T+(n-1)) # using the previously found T+1 .. T+(n-1) values
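The loop above can be sketched like this (a minimal illustration with a stand-in `predict` function in place of a real `model.predict` call; the rolling-window bookkeeping is the point, not the model):

```python
import numpy as np

def predict(window):
    """Stand-in for model.predict: returns the mean of the window.
    Replace with a real model call in practice."""
    return window.mean()

def recursive_forecast(history, window_size, n_future):
    """Predict n_future steps by feeding each prediction back as input."""
    window = list(history[-window_size:])  # data from T-(window_size-1) to T±0
    preds = []
    for _ in range(n_future):
        y = predict(np.asarray(window))    # next-step prediction
        preds.append(y)
        window = window[1:] + [y]          # drop the oldest value, append the prediction
    return preds

history = np.arange(30, dtype=float)       # toy series 0..29
forecast = recursive_forecast(history, window_size=20, n_future=5)
print(len(forecast))                       # 5 predicted values, T+1 .. T+5
```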
The thing is that T+1 has a mean absolute error of around 0.75%. Doesn't the error propagate/compound through the predictions? If it compounds multiplicatively, asking the program to predict T+10 would give a mean absolute error of roughly (1 + 0.0075)^10 − 1 ≈ 7.8%, which is not very good in my case. So I'm looking for other ways to predict up to T+n values.
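For what it's worth, the compounding figure can be checked directly (assuming the per-step error multiplies as (1 + e) at each recursive step):

```python
e = 0.0075                      # per-step mean absolute error (0.75%)
n = 10                          # forecast horizon
compounded = (1 + e) ** n - 1   # error after n recursive steps
print(f"{compounded:.2%}")      # ~7.76%
```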
I've looked at a few YouTube tutorials, but each time their call to model.predict(X) seems to return multiple values already, and I have no idea what parameter I could have missed.
Code :
import tensorflow.keras as tfkr
import pandas as pd
import numpy as np
def model_training(dataframe, folder_name, window_size=40, epochs=100, batch_size=64):
    """Function to start training the model on given data
    Parameters
    ----------
    dataframe : `pandas.DataFrame` The dataframe to train the model on
    folder_name : `str` The folder to save model checkpoints in
    window_size : `int` The size of the lookback window to use
    epochs : `int` The number of epochs to train for
    batch_size : `int` The batch size to use for training
    Returns
    -------
    None
    """
    dataframe, _ = Process.pre(dataframe)  # function to standardize each column of the data
    TRAIN_SIZE = 0.7
    VAL_SIZE = 0.2
    TEST_SIZE = 0.1
    # Splitting the data into train, validation and test sets
    x, y = dataframe_to_xy(dataframe, window_size)  # converts pandas dataframe to numpy arrays
    train_end = int(len(dataframe) * TRAIN_SIZE)
    val_end = int(len(dataframe) * (TRAIN_SIZE + VAL_SIZE))
    x_train, y_train = x[:train_end], y[:train_end]
    x_val, y_val = x[train_end:val_end], y[train_end:val_end]
    x_test, y_test = x[val_end:], y[val_end:]
    # Creating the model base
    model = tfkr.models.Sequential()
    model.add(tfkr.layers.InputLayer(input_shape=(window_size, 10)))
    model.add(tfkr.layers.LSTM(64))
    model.add(tfkr.layers.Dense(8, 'relu'))
    model.add(tfkr.layers.Dense(10, 'linear'))
    model.summary()
    # Compiling, training and checkpointing the model
    cp = tfkr.callbacks.ModelCheckpoint('ai\\models\\' + folder_name + '\\', save_best_only=True)
    model.compile(loss=tfkr.losses.MeanSquaredError(),
                  optimizer=tfkr.optimizers.Adam(learning_rate=0.0001),
                  metrics=[tfkr.metrics.RootMeanSquaredError()])
    model.fit(x_train, y_train, epochs=epochs, batch_size=batch_size,
              validation_data=(x_val, y_val), callbacks=[cp])
def predict_data(model, data_pre, window_size, n_future):
    '''Function to predict data using the model
    Parameters
    ----------
    model : `tensorflow.keras.models.Sequential` The model to use for prediction
    data_pre : `pandas.DataFrame` The dataframe to predict on
    window_size : `int` The size of the lookback window to use
    n_future : `int` Number of values to predict in the future
    Returns
    -------
    data_pred : `pandas.DataFrame` The dataframe containing the predicted values
    '''
    time_interval = data_pre.index[1] - data_pre.index[0]
    # Setting up the dataframe to predict on
    data_pre, proc_params = Process.pre(data_pre)  # function to standardize each column of the data
    data_pred = data_pre.iloc[-window_size:]
    data_pred = data_pred.to_numpy().astype('float32')
    data_pred = data_pred.reshape(1, window_size, 10)
    # Predicting the data
    data_pred = model.predict(data_pred)
    # Converting the numpy array back to a pandas dataframe + post-processing/reversing standardization
    # yada yada pandas dataframe
    return data_pred
If there is no way to prevent the error propagation, do you have any tips for reducing the model's error?
Thanks in advance.
Solution
You could let your LSTM return the full sequences, not only the last output, like this:
model = tfkr.models.Sequential()
model.add(tfkr.layers.InputLayer(input_shape=(window_size, 10)))
model.add(tfkr.layers.LSTM(64, return_sequences=True))
model.add(tfkr.layers.Dense(8, 'relu'))
model.add(tfkr.layers.Dense(10, 'linear'))
Each LSTM output would go through the same dense weights (because we did not flatten). Your output would then be of shape
(None, window_size, 10)
This means that for k input time points you would get k output time points.
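To see why the same dense weights apply at every time step, here is a numpy sketch of what a Dense layer does to a 3D sequence (the weight matrices are hypothetical, just to show the shapes):

```python
import numpy as np

window_size, lstm_units, n_features = 20, 64, 10
seq = np.random.rand(window_size, lstm_units)  # LSTM outputs: one 64-dim vector per time step

# A Dense layer on a sequence applies the same kernel to every time step:
W = np.random.rand(lstm_units, n_features)     # shared dense kernel
b = np.zeros(n_features)                       # shared bias
out = seq @ W + b                              # shape (window_size, n_features)
print(out.shape)                               # (20, 10): one 10-dim prediction per input step
```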
The downside is that the first output is computed from only the first input, the second output from only the first two inputs, and so on. So I would suggest using a bidirectional LSTM and combining both directions, maybe like this:
model = tfkr.models.Sequential()
model.add(tfkr.layers.InputLayer(input_shape=(window_size, 10)))
model.add(tfkr.layers.Bidirectional(tfkr.layers.LSTM(64, return_sequences=True), merge_mode='sum'))
model.add(tfkr.layers.Dense(8, 'relu'))
model.add(tfkr.layers.Dense(10, 'linear'))
Answered By - AndrzejO