Issue
I have already read the following in the documentation:
A FunctionTransformer forwards its X (and optionally y) arguments to a user-defined function or function object and returns the result of this function. This is useful for stateless transformations such as taking the log of frequencies, doing custom scaling, etc.
However, I don't understand what this is actually useful for. Could anybody explain the purpose of FunctionTransformer?
Solution
In addition to simply wrapping a given user-defined function, the FunctionTransformer provides the standard methods of other sklearn estimators (e.g., fit and transform). The benefit of this is that you can introduce arbitrary, stateless transforms into an sklearn Pipeline, which combines multiple processing stages. This makes executing a processing pipeline easier because you can simply pass your data (X) to the fit and transform methods of the Pipeline object without having to explicitly apply each stage of the pipeline individually.
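As a minimal sketch of the point above, the snippet below wraps np.log1p (one of the "log of frequencies" style transforms the documentation mentions) and places it in a Pipeline in front of a StandardScaler. The array X and the choice of StandardScaler are assumptions made up for illustration; they are not part of the original answer.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import FunctionTransformer, StandardScaler

# Hypothetical count data, invented purely for illustration.
X = np.array([[0.0, 1.0], [9.0, 99.0], [99.0, 999.0]])

# Wrapping np.log1p gives it the fit/transform interface of an estimator.
log_step = FunctionTransformer(np.log1p)
X_logged = log_step.fit_transform(X)  # stateless: fit learns nothing

# Because it behaves like any other transformer, it can be a pipeline stage.
pipeline = make_pipeline(FunctionTransformer(np.log1p), StandardScaler())
X_scaled = pipeline.fit_transform(X)
```

The wrapped function behaves exactly like a direct call to np.log1p(X); the only thing FunctionTransformer adds is the estimator interface that Pipeline expects from each stage.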
Here is an example taken from the sklearn documentation (located here):
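from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import FunctionTransformer


def all_but_first_column(X):
    # Stateless transform: keep every column except the first.
    return X[:, 1:]


def drop_first_component(X, y):
    """
    Create a pipeline with PCA and the column selector and use it to
    transform the dataset.
    """
    pipeline = make_pipeline(
        PCA(), FunctionTransformer(all_but_first_column),
    )
    X_train, X_test, y_train, y_test = train_test_split(X, y)
    # Fit PCA on the training split only, then transform the test split.
    pipeline.fit(X_train, y_train)
    return pipeline.transform(X_test), y_test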
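Note that the first principal component was never removed by an explicit, separate call. The pipeline chains the transformations automatically when pipeline.transform is called: the data first passes through the fitted PCA, and the PCA output is then passed to all_but_first_column.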
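To make that chaining concrete, here is a small sketch that runs the same two stages once through a pipeline and once by hand, so the two results can be compared. The random input array and the fixed seed are assumptions chosen only so that the snippet is self-contained.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import FunctionTransformer


def all_but_first_column(X):
    # Same stateless column selector as in the example above.
    return X[:, 1:]


# Hypothetical data: 20 samples with 4 features, seeded for repeatability.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 4))

# Pipeline version: one fit call, one transform call.
pipeline = make_pipeline(PCA(), FunctionTransformer(all_but_first_column))
pipeline.fit(X)
X_pipeline = pipeline.transform(X)

# Manual version: apply each stage explicitly, in the same order.
pca = PCA().fit(X)
X_manual = all_but_first_column(pca.transform(X))

print(np.allclose(X_pipeline, X_manual))
```

Both versions should produce an array with three columns (the four principal components minus the first); the pipeline simply saves you from writing out the intermediate steps.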
Answered By - bogatron