Friday, December 22, 2023

[FIXED] TypeError: fit_transform() missing argument: y when using ColumnTransformer

Labels: python, scikit-learn

Issue

I have two pipelines, one for my categorical features and one for my numeric features, that I feed into my column transformer. I then want to fit the column transformer on my dataframe so I can see what the transformed data looks like.

My code is as follows:

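# Imports the snippet needs. The question does not show them, so the sources of
# RandomSampleImputer (feature_engine) and TargetEncoder (category_encoders)
# are assumptions.
from sklearn.pipeline import Pipeline
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import MinMaxScaler
from feature_engine.imputation import RandomSampleImputer
from category_encoders import TargetEncoder

num_pipeline = Pipeline(steps=[
    ('impute', RandomSampleImputer()),
    ('scale', MinMaxScaler())
])
cat_pipeline = Pipeline(steps=[
    ('impute', RandomSampleImputer()),
    ('target', TargetEncoder())
])

# num_cols and cat_cols are lists of column names defined elsewhere
col_trans = ColumnTransformer(transformers=[
    ('num_pipeline', num_pipeline, num_cols),
    ('cat_pipeline', cat_pipeline, cat_cols)
], remainder='drop')  # 'drop' must be a string, not a bare name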

When I run

df_transform=col_trans.fit(df)

I get the error:

raise TypeError('fit_transform() missing argument: ''y''')

Why is this?

Solution

As Guilherme Marthe and Luca Anzalone have pointed out, some transformers such as TargetEncoder do indeed require the target variable y to calculate the transformations.
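To make the dependency on y concrete, here is a deliberately simplified, pure-Python sketch of target encoding. It is not how any library's TargetEncoder is implemented (real encoders typically add smoothing, among other things); the function names fit_target_means and transform_with_means and the sample data are made up for illustration.

```python
from collections import defaultdict


def fit_target_means(categories, y):
    """Learn the mean target value per category. This step cannot work without y."""
    if y is None:
        raise TypeError("target encoding needs y to compute per-category means")
    sums = defaultdict(float)
    counts = defaultdict(int)
    for cat, target in zip(categories, y):
        sums[cat] += target
        counts[cat] += 1
    return {cat: sums[cat] / counts[cat] for cat in sums}


def transform_with_means(categories, means, default):
    """Replace each category with its learned mean; unseen categories get `default`."""
    return [means.get(cat, default) for cat in categories]


# Hypothetical toy data: one categorical feature and a numeric target.
cities = ["a", "b", "a", "b"]
target = [1.0, 0.0, 0.0, 0.0]

means = fit_target_means(cities, target)
global_mean = sum(target) / len(target)
encoded = transform_with_means(cities, means, default=global_mean)
print(means)
print(encoded)
```

Because the learned values are statistics of y grouped by category, there is nothing to fit when y is missing, which is why an encoder of this kind refuses to run without it.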

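In your code, col_trans.fit(df) passes no y, so the TargetEncoder step raises the TypeError. Note also that fit() returns the fitted transformer, not the transformed data. To get your transformed dataset, call fit_transform() on your ColumnTransformer col_trans, passing both X (your features) and y (your target).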

When you call fit_transform(), the transformer first learns any parameters needed for the transformation (such as the mean and standard deviation for standardization, or the per-category target statistics for TargetEncoder), and then applies the transformation to your data. The result is a new dataset where the transformations have been applied.
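The fit-then-transform idea can be sketched with a toy standardizer written in plain Python. ToyStandardizer is a hypothetical class for illustration only, not a scikit-learn estimator, and some real transformers implement fit_transform() differently from fit() followed by transform().

```python
import statistics


class ToyStandardizer:
    """Toy transformer: learns mean and standard deviation, then rescales values."""

    def fit(self, values):
        # Learn the parameters from the data; assumes the values are not all equal.
        self.mean_ = statistics.fmean(values)
        self.std_ = statistics.pstdev(values)
        return self  # fit() returns the fitted object, not transformed data

    def transform(self, values):
        # Apply the parameters learned in fit().
        return [(v - self.mean_) / self.std_ for v in values]

    def fit_transform(self, values):
        # Convenience: learn the parameters and apply them in one call.
        return self.fit(values).transform(values)


data = [2.0, 4.0, 6.0]
scaled = ToyStandardizer().fit_transform(data)

# The same result in two explicit steps.
scaler = ToyStandardizer().fit(data)
scaled_two_step = scaler.transform(data)
print(scaled)
```

Since fit() hands back the fitted object itself, assigning its result to a variable such as df_transform gives you the transformer rather than a transformed dataset.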

To ensure your output is a pandas DataFrame, you can use the set_config() function from scikit-learn to change the global configuration:

from sklearn import set_config
set_config(transform_output="pandas")

Now, when you transform your data, the output will be a pandas DataFrame:

X_transformed = col_trans.fit_transform(X, y)

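Note that X_transformed is now a DataFrame with named columns. By default, ColumnTransformer prefixes each column name with the name of its transformer (for example, num_pipeline__<column>); pass verbose_feature_names_out=False to keep the original names.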

Please remember to update scikit-learn to version 1.2 or later to use this feature.

Answered By - DataJanitor

This answer was collected from Stack Overflow and tested by PythonFixing community admins. It is licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
