Issue
I'm trying to scale a dataset with multiple features and time-series data using the scikit-learn StandardScaler. At the moment I am creating a separate scaler for every feature:
scale_feat1 = StandardScaler().fit(data[:,:,0])
scale_feat2 = StandardScaler().fit(data[:,:,1])
..
Is there a way to scale all features separately using one scaler? Also, what is the easiest way to save a scaler for all features and apply it to a validation dataset?
Edit: StandardScaler only works on 2D arrays, so the array would have to be flattened for scaling. In 2D, StandardScaler computes a separate mean and standard deviation for every feature.
Solution
Assuming that your data is shaped [num_instances, num_time_steps, num_features], what I would do is first reshape the data and then normalize it.
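import numpy as np
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
num_instances, num_time_steps, num_features = train_data.shape
# Collapse instances and time steps into rows so that each feature is one column
train_data = np.reshape(train_data, (-1, num_features))
# Learn the per-feature mean and std on the training set, then scale it
train_data = scaler.fit_transform(train_data)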
This reshapes the data into a 2D array in which each feature is one column, so each feature is normalized separately. Afterwards, you can reshape the data back to its original shape before training.
train_data = np.reshape(train_data, (num_instances, num_time_steps, num_features))
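As a quick sanity check, the sketch below runs the reshape, scale, reshape-back round trip on synthetic data and compares it with standardizing each feature by hand, using statistics pooled over all instances and time steps. The array sizes and random values are made up for illustration.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the real data: 8 instances, 5 time steps, 3 features
rng = np.random.default_rng(0)
train_data = rng.normal(loc=[0.0, 10.0, -5.0], scale=[1.0, 2.0, 0.5], size=(8, 5, 3))

num_instances, num_time_steps, num_features = train_data.shape

scaler = StandardScaler()
# Rows are (instance, time step) pairs, columns are features
flat = train_data.reshape(-1, num_features)
scaled = scaler.fit_transform(flat).reshape(num_instances, num_time_steps, num_features)

# The same result computed by hand: per-feature statistics over axes 0 and 1
manual = (train_data - train_data.mean(axis=(0, 1))) / train_data.std(axis=(0, 1))

print(np.allclose(scaled, manual))
print(scaler.mean_.shape)  # one mean per feature
```

The scaler ends up holding one mean and one standard deviation per feature, which is what makes a single scaler sufficient here.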
When it comes to using the scaler on the validation set, the fit_transform method computes the mean and std on the train set and stores them in the object. Then, when you want to normalize the validation set you can do:
num_instances, num_time_steps, num_features = val_data.shape
val_data = np.reshape(val_data, (-1, num_features))
# transform only: reuse the mean and std learned from the training set
val_data = scaler.transform(val_data)
Afterwards, reshape the data back into the shape that you need for training:
val_data = np.reshape(val_data, (num_instances, num_time_steps, num_features))
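Since the reshape, scale, reshape-back pattern is the same for every split, it could be wrapped in a small helper. This is only a sketch: `scale_3d` is a hypothetical name, not part of scikit-learn, and the arrays are random placeholders.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler


def scale_3d(scaler, data, fit=False):
    """Scale a [num_instances, num_time_steps, num_features] array per feature.

    Hypothetical helper: the scaler is fitted only when fit=True (training data).
    """
    flat = data.reshape(-1, data.shape[-1])  # one column per feature
    flat = scaler.fit_transform(flat) if fit else scaler.transform(flat)
    return flat.reshape(data.shape)


rng = np.random.default_rng(1)
train_data = rng.normal(size=(6, 4, 2))  # placeholder training set
val_data = rng.normal(size=(3, 4, 2))    # placeholder validation set

scaler = StandardScaler()
train_scaled = scale_3d(scaler, train_data, fit=True)  # learns mean and std
val_scaled = scale_3d(scaler, val_data)                # reuses training statistics
```

Keeping `fit=False` as the default makes it harder to refit on validation data by accident.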
This should do the trick for you.
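If the scaler also needs to be saved to disk, one common option is joblib, which is installed alongside scikit-learn. The sketch below is a minimal example under assumed conditions: the file name and temporary directory are hypothetical, and the data is random and already reshaped to 2D.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(2)
train_flat = rng.normal(size=(20, 3))  # already reshaped to (rows, num_features)

scaler = StandardScaler().fit(train_flat)

# Hypothetical file location; use any path that suits your project
path = os.path.join(tempfile.mkdtemp(), "scaler.joblib")
joblib.dump(scaler, path)

# Later, e.g. in a separate validation or inference script
restored = joblib.load(path)
val_flat = rng.normal(size=(5, 3))
same = np.allclose(restored.transform(val_flat), scaler.transform(val_flat))
print(same)
```

It is generally safest to load the file with the same scikit-learn version that saved it.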
Update:
As per @Medomatto's comment, if you pass the shape as a keyword argument, note that it is named newshape in some NumPy versions:
... = np.reshape(data, newshape=(...))
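One way to sidestep the keyword name entirely is to pass the shape positionally, which, as far as I know, works across NumPy versions. A minimal sketch with a placeholder array:

```python
import numpy as np

# Placeholder array shaped [num_instances, num_time_steps, num_features]
data = np.arange(24.0).reshape(2, 4, 3)
num_features = data.shape[-1]

# Both forms pass the shape positionally, so no keyword name is involved
a = np.reshape(data, (-1, num_features))
b = data.reshape(-1, num_features)

print(a.shape, np.array_equal(a, b))
```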
Answered By - gorjan