Issue
I have a pipeline inside a ColumnTransformer. One of the transformers is a PCA. When I call fit and then transform, the data looks right and everything works. But when I try to access the explained_variance_ratio_ of the PCA in the pipeline after fitting, the attribute does not exist. All the other transformers in the pipeline are also missing the attributes they should have after fitting. What am I doing wrong?
The code looks like this:
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline
from sklearn.decomposition import PCA
import pandas as pd

def transform(df: pd.DataFrame, cat_cols, log_cols, passthrough_cols):
    oh_enc = OneHotEncoder(handle_unknown='ignore')
    transformer_oh = ColumnTransformer([('cat_cols', oh_enc, cat_cols)], remainder='passthrough')
    scaler = StandardScaler()
    pca = PCA(n_components=5)
    pipe = Pipeline([("preprocessing", transformer_oh),
                     ("scaling", scaler),
                     ("pca", pca)
                     ])
    to_transform = list(set(df.columns) - set(passthrough_cols))
    transformer = ColumnTransformer([("pipe", pipe, to_transform)], remainder='passthrough')
    transformer = transformer.fit(df)
    pca2 = transformer.transformers[0][1].steps[2][1]
    print(pca2.explained_variance_ratio_)  # AttributeError: 'PCA' object has no attribute 'explained_variance_ratio_'
Solution
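To access the fitted transformers in a fitted ColumnTransformer, you have to use the attribute transformers_ (with a trailing underscore) and not transformers. ColumnTransformer clones its transformers during fit, so the objects listed in transformers stay unfitted, while the fitted copies are stored in transformers_. After changing the lookup to transformer.transformers_[0][1].steps[2][1], everything works fine.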
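The difference can be checked with a minimal sketch like the one below. The toy DataFrame, its column names, and n_components=2 are assumptions made for illustration and are not taken from the question. sparse_threshold=0 is also added here so the inner ColumnTransformer always returns a dense array that StandardScaler can center.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.decomposition import PCA
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy data, only for illustration.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "color": rng.choice(["red", "green", "blue"], size=50),
    "x1": rng.normal(size=50),
    "x2": rng.normal(size=50),
    "x3": rng.normal(size=50),
    "keep": rng.normal(size=50),
})
cat_cols = ["color"]
passthrough_cols = ["keep"]
to_transform = [c for c in df.columns if c not in passthrough_cols]

# Same structure as in the question: one-hot encode, scale, then PCA.
transformer_oh = ColumnTransformer(
    [("cat_cols", OneHotEncoder(handle_unknown="ignore"), cat_cols)],
    remainder="passthrough",
    sparse_threshold=0,  # assumption: force dense output for StandardScaler
)
pipe = Pipeline([
    ("preprocessing", transformer_oh),
    ("scaling", StandardScaler()),
    ("pca", PCA(n_components=2)),
])
transformer = ColumnTransformer([("pipe", pipe, to_transform)], remainder="passthrough")
transformer.fit(df)

# `transformers` holds the original objects, which were cloned and never fitted.
unfitted_pca = transformer.transformers[0][1].steps[2][1]
print(hasattr(unfitted_pca, "explained_variance_ratio_"))

# `transformers_` holds the fitted clones.
fitted_pca = transformer.transformers_[0][1].steps[2][1]
print(fitted_pca.explained_variance_ratio_)

# Equivalent lookup by name instead of by position.
same_pca = transformer.named_transformers_["pipe"].named_steps["pca"]
```

Looking the steps up by name through named_transformers_ and named_steps tends to be less fragile than positional indexing if the order of the steps changes later.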
Answered By - HrkBrkkl