Tuesday, November 16, 2021

[FIXED] Trying to understand the accuracy() function being applied to LSTM data in this code

lstm, numpy, python, tensorflow

Issue

I've been working with LSTMs to make stock price predictions for my machine learning class, and I'm quite new to programming, so I'd appreciate your patience!

Anyway, I have this function that generates an accuracy score and I'm trying to better understand some of its components. Namely, what purpose do the data transformations and the lambda functions serve?

Here's the function:

  import numpy as np
  from sklearn.metrics import accuracy_score  # assumed source of accuracy_score

  # LOOKUP_STEP is a global defined elsewhere in the script.
  def accuracy(model, data):
      y_test = data["y_test"]
      X_test = data["X_test"]
      y_pred = model.predict(X_test)
      y_test = np.squeeze(data["column_scaler"]["Close"].inverse_transform(np.expand_dims(y_test, axis=0)))
      y_pred = np.squeeze(data["column_scaler"]["Close"].inverse_transform(y_pred))
      y_pred = list(map(lambda current, future: int(float(future) > float(current)), y_test[:-LOOKUP_STEP], y_pred[LOOKUP_STEP:]))
      y_test = list(map(lambda current, future: int(float(future) > float(current)), y_test[:-LOOKUP_STEP], y_test[LOOKUP_STEP:]))
      return accuracy_score(y_test, y_pred)

I'm curious what the difference would be if I just used a function like this:

  def accuracy(model, data):
      y_test = data["y_test"]
      X_test = data["X_test"]
      y_pred = model.predict(X_test)
      return accuracy_score(y_test, y_pred)

Solution

It would help to describe your data and model output in more detail. I'll assume the data is a pandas DataFrame and that you are using sklearn to preprocess it.

The data was normalized before training, so I suppose somewhere in your code there is a MinMaxScaler or a similar sklearn transformation that mapped an entire column of your dataframe to values between 0 and 1. From your code, it seems to be stored in column_scaler. To get real (cash) values back, you have to un-normalize the predictions and targets with inverse_transform. A y_pred value of 0 then becomes the lowest value in your original data, say $10.29. The np.expand_dims and np.squeeze calls only reshape the arrays into the 2-D form inverse_transform expects and back to 1-D.
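To make the un-normalizing step concrete, here is a minimal sketch of the arithmetic, assuming the scaler was a MinMaxScaler with its default 0-to-1 range (the question does not confirm this). The prices and the helper names scale and unscale are made up for illustration; in the real code, inverse_transform does this work.

```python
# Hypothetical closing prices, in dollars (not from the original data).
prices = [10.29, 12.50, 11.00, 15.71]

lo, hi = min(prices), max(prices)

def scale(price):
    """Map a price into [0, 1], as min-max scaling with the default range does."""
    return (price - lo) / (hi - lo)

def unscale(value):
    """Undo the scaling: map a value in [0, 1] back to dollars."""
    return value * (hi - lo) + lo

scaled = [scale(p) for p in prices]       # what the model is trained on
restored = [unscale(s) for s in scaled]   # back to dollar values

print(unscale(0.0))  # the lowest original price
print(unscale(1.0))  # the highest original price (up to float rounding)
```

A scaled value of 0 maps back to the minimum price and a scaled value of 1 to the maximum, which is why the comparison of prices only makes sense in dollars after this step.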

Then the two map calls convert the prices into 0/1 labels: 1 if the price LOOKUP_STEP ticks ahead is higher than the current price, 0 otherwise. For y_pred the future price is the model's prediction; for y_test it is the actual price. In both cases the current price is the actual one.
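The labelling step can be sketched on a handful of made-up prices. LOOKUP_STEP = 1 and all the numbers below are assumptions for illustration, and went_up is just a named version of the lambda in the question.

```python
LOOKUP_STEP = 1  # assumed: how many ticks ahead the model predicts

# Hypothetical prices after inverse_transform (not from the original data).
y_test = [10.0, 11.0, 10.5, 12.0, 11.5]   # actual prices
y_pred = [10.2, 10.8, 10.9, 11.7, 12.3]   # model's predicted prices

def went_up(current, future):
    """Return 1 if the future price is above the current one, else 0."""
    return int(float(future) > float(current))

# Predicted direction: actual price now vs. predicted price LOOKUP_STEP later.
pred_up = list(map(went_up, y_test[:-LOOKUP_STEP], y_pred[LOOKUP_STEP:]))
# Actual direction: actual price now vs. actual price LOOKUP_STEP later.
true_up = list(map(went_up, y_test[:-LOOKUP_STEP], y_test[LOOKUP_STEP:]))

print(pred_up)  # [1, 0, 1, 1]
print(true_up)  # [1, 0, 1, 0]
```

The slices y_test[:-LOOKUP_STEP] and y_test[LOOKUP_STEP:] have the same length and are offset by LOOKUP_STEP, so map pairs each price with the one LOOKUP_STEP ticks after it.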

With these arrays, which only say whether the price goes up or not, you compute accuracy_score (from sklearn, I suppose), which is the fraction of time steps where the predicted direction matches the actual one.
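For flat lists of 0/1 labels, accuracy_score with its default arguments returns the share of positions where the two lists agree. The sketch below computes that share by hand; fraction_correct is a hypothetical helper, not a library function, and the labels are made up.

```python
def fraction_correct(y_true, y_pred):
    """Share of positions where the two label lists agree."""
    matches = sum(int(t == p) for t, p in zip(y_true, y_pred))
    return matches / len(y_true)

true_up = [1, 0, 1, 0]   # actual direction (1 = price went up)
pred_up = [1, 0, 1, 1]   # predicted direction

print(fraction_correct(true_up, pred_up))  # 0.75
```

Three of the four directions match, so the score is 0.75. Correctly predicted "not up" steps count just as much as correctly predicted "up" steps.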

If you skipped those post-processing steps, accuracy_score would not give you a meaningful value. It is a classification metric that compares labels for exact equality, so with continuous float prices I would expect it to raise an error rather than return a score.

Answered By - Gabriel A.

This answer was collected from Stack Overflow and tested by PythonFixing community admins, and is licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
