Issue
I have a dataset and have performed PCA on it using scikit-learn. I have another dataset with the same features and would like to project it into the same PCA space as the one created from the first dataset.
My understanding is that I have to transform and center the data in the same way the original dataset was and then use the eigenvectors to rotate the data.
I'm a little stuck as to how to do this based on the output from sklearn.decomposition.PCA.
So far I have
X1 = np.loadtxt(fname="dataset1.txt")
pca = PCA(n_components=50)
pca.fit_transform(X1)
pca_result = pca.transform(X1)
X2 = np.loadtxt(fname="dataset2.txt")
Does anyone have any pointers on how this can be achieved?
Solution
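You have some redundancy there. fit_transform() returns the data projected onto the principal components and also stores the fitted parameters (the feature means and the component vectors) on the object, so the extra transform(X1) call is not needed. For a new dataset, call only transform(), which reuses those stored parameters. See below: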
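import numpy as np
from sklearn.decomposition import PCA
X1 = np.loadtxt(fname="dataset1.txt")
pca = PCA(n_components=50)
Y1 = pca.fit_transform(X1)  # fit on dataset 1 and project it in one step
X2 = np.loadtxt(fname="dataset2.txt")
Y2 = pca.transform(X2)  # project dataset 2 using the parameters fitted on dataset 1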
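To connect this with the "center, then rotate" description in the question, here is a minimal sketch of what transform() amounts to when PCA is used with its default whiten=False. The random arrays and their shapes are assumptions standing in for dataset1.txt and dataset2.txt; mean_ and components_ are the attributes the fitted PCA object exposes.

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic stand-ins for dataset1.txt and dataset2.txt (shapes are assumed).
rng = np.random.default_rng(0)
X1 = rng.normal(size=(200, 60))
X2 = rng.normal(size=(80, 60))

pca = PCA(n_components=50)  # default whiten=False
Y1 = pca.fit_transform(X1)
Y2 = pca.transform(X2)

# Manual projection: center X2 with the feature means learned from X1,
# then rotate onto the fitted component vectors (rows of pca.components_).
Y2_manual = (X2 - pca.mean_) @ pca.components_.T

# Compare the manual result with what transform() returned.
print(np.allclose(Y2, Y2_manual))
```

The point of the comparison is that the second dataset is centered with the first dataset's means, not its own. In practice transform() is the simpler choice; the manual form is only meant to show what it does.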
Answered By - gtancev