Issue
I am trying to fit an LSTM network to a dataset.
I have the following dataset:
0 17.6 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0
1 38.2 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0
2 39.4 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0
3 38.7 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0
4 39.7 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0
... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
17539 56.9 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0
17540 51.1 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0
17541 46.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0
17542 44.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0
17543 40.2 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ... 1.0 0.0 0.0
27 28 29 30 31 32 33
0 0.0 0.0 1.0 0.0 0.0 1.0 0.0
1 0.0 0.0 1.0 0.0 0.0 1.0 0.0
2 0.0 0.0 1.0 0.0 0.0 1.0 0.0
3 0.0 0.0 1.0 0.0 0.0 1.0 0.0
4 0.0 0.0 1.0 0.0 0.0 1.0 0.0
... ... ... ... ... ... ... ...
17539 0.0 0.0 0.0 0.0 1.0 0.0 1.0
17540 0.0 0.0 0.0 0.0 1.0 0.0 1.0
17541 0.0 0.0 0.0 0.0 1.0 0.0 1.0
17542 0.0 0.0 0.0 0.0 1.0 0.0 1.0
17543 0.0 0.0 0.0 0.0 1.0 0.0 1.0
with shape:
[17544 rows x 34 columns]
Then I scale it with MinMaxScaler as follows:
scaler = MinMaxScaler(feature_range=(0,1))
data = scaler.fit_transform(data)
Then I use a function to create my train and test datasets, which have the following shapes:
X_train : (12232, 24, 34)
Y_train : (12232, 24)
X_test : (1708, 24, 34)
Y_test : (1708, 24)
After I fit the model and predict the values for the test set, I need to scale them back to the original values, so I do the following:
test_predict = model.predict(X_test)
test_predict = scaler.inverse_transform(test_predict)
Y_test = scaler.inverse_transform(Y_test)
But I am getting the following error:
ValueError: operands could not be broadcast together with shapes (1708,24) (34,) (1708,24)
How can I resolve it?
Solution
The inverse transformation expects data with the same number of columns as the data the scaler was fitted on, i.e. 34 columns. This is not the case with your test_predict, nor with your Y_test; both have shape (1708, 24).
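To make the mismatch concrete, here is a minimal sketch that reproduces the same kind of error. The random arrays are only stand-ins for your data and for the output of model.predict (they are assumptions, not your actual values); only the column counts matter.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

rng = np.random.default_rng(0)

# Stand-in for the original 34-column dataset (random values, assumption).
data = rng.random((100, 34))
scaler = MinMaxScaler(feature_range=(0, 1))
scaler.fit(data)

# Stand-in for model.predict(X_test): 24 values per sample, not 34.
test_predict = rng.random((10, 24))

try:
    scaler.inverse_transform(test_predict)
    error_message = None
except ValueError as exc:
    # The scaler holds 34 per-column offsets, which cannot be
    # broadcast against an array that has only 24 columns.
    error_message = str(exc)

print(error_message)
```

The scaler stores one minimum and one scale factor per fitted column, so it can only invert arrays that have exactly that many columns.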
Additionally, although unrelated to your error, you are scaling first and splitting into train and test sets afterwards. This is not the correct methodology: the scaler gets fitted on the test data as well, which leads to data leakage.
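As a small illustration of the leakage, consider the toy example below. The five values are made up for the purpose of the sketch; the point is only that a scaler fitted before the split already knows the range of the test data.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Made-up single-column data; the last row plays the role of the test set.
values = np.array([[10.0], [20.0], [30.0], [40.0], [100.0]])
train, test = values[:4], values[4:]

# Fitted on everything: the scaler has seen the test maximum.
leaky = MinMaxScaler().fit(values)
# Fitted on the training rows only: no information from the test set.
clean = MinMaxScaler().fit(train)

leaky_max = leaky.data_max_[0]
clean_max = clean.data_max_[0]

# With the clean scaler, unseen test values may fall outside [0, 1].
test_scaled = clean.transform(test)

print(leaky_max, clean_max, test_scaled)
```

Values outside the [0, 1] range on the test set are expected with the clean scaler; they simply reflect that the test data were not used for fitting.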
Here are the necessary steps to resolve this:
- Split first into train & test sets.
- Transform your X_train and y_train using two different scalers, one for the features and one for the output, as I show in this answer of mine; you should use .fit_transform here.
- Fit your model with the transformed X_train and y_train (side note: it is good practice to use different names for different versions of the data, instead of overwriting the existing ones).
- To evaluate your model with the test data X_test & y_test, first transform them using the respective scalers from step #2; you should use .transform here (not .fit_transform again).
- In order to get your predictions y_pred back to the scale of your original y_test, you should use .inverse_transform of the respective scaler on them. There is of course no need to inverse transform your transformed X_test and y_test - you already have these values!
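The steps above can be sketched roughly as follows. This is only an illustration under several assumptions: the data are random stand-ins, the target is assumed to be a single column, the split is a simple chronological cut, and the names x_scaler, y_scaler and y_pred_scaled are hypothetical. The windowing into sequences and the LSTM itself are omitted; a random array stands in for the output of model.predict.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

rng = np.random.default_rng(0)

# Random stand-ins (assumption): 34 input columns and one target column.
features = rng.random((1000, 34))
target = rng.uniform(10, 60, size=(1000, 1))

# Split first (a simple chronological cut, assumed here).
split = 800
features_train, features_test = features[:split], features[split:]
target_train, target_test = target[:split], target[split:]

# Two separate scalers, fitted on the training data only.
x_scaler = MinMaxScaler(feature_range=(0, 1))
y_scaler = MinMaxScaler(feature_range=(0, 1))
features_train_scaled = x_scaler.fit_transform(features_train)
target_train_scaled = y_scaler.fit_transform(target_train)

# Test data: transform only, never fit again.
features_test_scaled = x_scaler.transform(features_test)
target_test_scaled = y_scaler.transform(target_test)

# Stand-in for model.predict(...): scaled predictions, 24 steps per sample.
y_pred_scaled = rng.random((50, 24))

# y_scaler was fitted on one column, so flatten to a single column,
# invert, and restore the (samples, 24) shape afterwards.
y_pred = y_scaler.inverse_transform(
    y_pred_scaled.reshape(-1, 1)
).reshape(y_pred_scaled.shape)

print(y_pred.shape)
```

Because the output scaler only knows about the target column, every one of the 24 predicted steps can be inverted with it after reshaping, which avoids the 34-column mismatch altogether.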
Answered By - desertnaut