Issue
I've been working with LSTMs to make stock price predictions for my machine learning class and I'm quite new to programming so I'd appreciate your patience!
Anyway, I have this function that generates an accuracy score and I'm trying to better understand some of its components. Namely, what purpose do the data transformations and the lambda functions serve?
Here's the function:
def accuracy(model, data):
    y_test = data["y_test"]
    X_test = data["X_test"]
    y_pred = model.predict(X_test)
    y_test = np.squeeze(data["column_scaler"]["Close"].inverse_transform(np.expand_dims(y_test, axis=0)))
    y_pred = np.squeeze(data["column_scaler"]["Close"].inverse_transform(y_pred))
    y_pred = list(map(lambda current, future: int(float(future) > float(current)), y_test[:-LOOKUP_STEP], y_pred[LOOKUP_STEP:]))
    y_test = list(map(lambda current, future: int(float(future) > float(current)), y_test[:-LOOKUP_STEP], y_test[LOOKUP_STEP:]))
    return accuracy_score(y_test, y_pred)
I'm curious what the difference would be if I just used a function like this:
def accuracy(model, data):
    y_test = data["y_test"]
    X_test = data["X_test"]
    y_pred = model.predict(X_test)
    return accuracy_score(y_test, y_pred)
Solution
It would help to describe your data and model output in more detail. I'll assume the data is a pandas DataFrame and that you are using sklearn to preprocess it.
First, the data has to be normalized, so I suppose that somewhere in your code you had a MinMaxScaler, or some other sklearn transformation, that mapped an entire column of your dataframe to values between 0 and 1. From your code, it seems that this was the column_scaler. To get back to real (cash) values, you have to un-normalize those values with inverse_transform. A y_pred value of 0 then becomes the lowest value you originally had in your data, say $10.29.
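To make the un-normalizing step concrete, here is a minimal sketch. It assumes the scaler really is sklearn's MinMaxScaler fitted on the Close column, and the prices are made up for illustration:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical closing prices; the real ones would come from your dataframe.
close = np.array([[10.29], [12.50], [11.00], [15.71]])

scaler = MinMaxScaler()               # maps the column to the range [0, 1]
scaled = scaler.fit_transform(close)  # lowest price -> 0, highest price -> 1

# inverse_transform maps the scaled values back to prices.
restored = scaler.inverse_transform(scaled)

# A scaled value of 0 corresponds to the lowest price the scaler was fitted on.
lowest = scaler.inverse_transform(np.array([[0.0]]))[0, 0]
```

sklearn scalers work on 2-D arrays, which is most likely why the original function wraps y_test in np.expand_dims before calling inverse_transform and flattens the result again with np.squeeze.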
Then you convert the prices to true/false values (stored as 1 and 0) by checking whether the price in the future is higher than the price in the present, looking LOOKUP_STEP ticks ahead. This is done once with the predicted future prices and once with the actual future prices.
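The up/down conversion can be sketched with plain lists. The prices and the LOOKUP_STEP value here are assumptions for illustration, and the named function `direction` stands in for the lambda in the question:

```python
LOOKUP_STEP = 2  # assumed value: how many ticks ahead to compare

# Hypothetical un-normalized prices, for illustration only.
y_test = [10.0, 11.0, 9.5, 12.0, 9.0, 13.0]    # actual prices
y_pred = [10.2, 10.8, 10.4, 11.5, 9.8, 11.0]   # predicted prices

def direction(current, future):
    """Return 1 if the future price is above the current one, else 0."""
    return int(float(future) > float(current))

# map() with two sequences walks them in parallel, pairing each current
# price with the price LOOKUP_STEP ticks later.

# Actual direction: actual price now vs. actual price later.
true_dir = list(map(direction, y_test[:-LOOKUP_STEP], y_test[LOOKUP_STEP:]))

# Predicted direction: actual price now vs. predicted price later.
pred_dir = list(map(direction, y_test[:-LOOKUP_STEP], y_pred[LOOKUP_STEP:]))

print(true_dir)  # [0, 1, 0, 1]
print(pred_dir)  # [1, 1, 1, 0]
```

Note that in the question's function the predicted directions are computed first, because the next line overwrites y_test with 0/1 labels.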
With these arrays, which only record whether the price goes up or not, you calculate the accuracy score (accuracy_score, from sklearn I suppose), which just tells you the fraction of up/down moves you predicted correctly.
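As a small illustration of what accuracy_score computes on such 0/1 arrays, assuming it is the function from sklearn.metrics and using made-up labels:

```python
from sklearn.metrics import accuracy_score

# Hypothetical direction labels: 1 = price went up, 0 = it did not.
true_dir = [0, 1, 0, 1, 1]
pred_dir = [1, 1, 0, 0, 1]

score = accuracy_score(true_dir, pred_dir)

# The same number computed by hand: matching positions / total positions.
manual = sum(t == p for t, p in zip(true_dir, pred_dir)) / len(true_dir)

print(score, manual)  # 0.6 0.6
```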
If you did not execute those post-processing steps, accuracy_score would not give you a meaningful value. It is a classification metric and expects discrete labels, so passing it raw float prices would most likely just raise an error.
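A quick way to check this yourself, assuming accuracy_score comes from sklearn.metrics and using made-up scaled prices, is the sketch below. In the sklearn versions I am aware of, continuous targets are rejected with a ValueError:

```python
from sklearn.metrics import accuracy_score

# Hypothetical normalized prices (continuous floats, not class labels).
y_true = [0.31, 0.47, 0.52]
y_pred = [0.30, 0.49, 0.50]

try:
    accuracy_score(y_true, y_pred)
    raised = False
except ValueError as exc:
    # Classification metrics refuse continuous targets.
    raised = True
    print("accuracy_score failed:", exc)
```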
Answered By - Gabriel A.