Issue
I want to apply sample weights while using a scikit-learn Pipeline that first transforms the features (e.g. with polynomial features) and then fits a regressor (e.g. ExtraTrees).
I am using the following imports in both examples below:
from sklearn.ensemble import ExtraTreesRegressor
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures
Everything works well as long as I transform the features separately and then create and train the model:
#Feature generation
X = np.random.rand(200,4)
Y = np.random.rand(200)
#Feature transformation
poly = PolynomialFeatures(degree=2)
X_poly = poly.fit_transform(X)  # keep the transformed features instead of discarding them
#Model generation and fit
clf = ExtraTreesRegressor(n_estimators=5, max_depth = 3)
weights = [1]*100 + [2]*100
clf.fit(X_poly, Y, sample_weight=weights)
But doing the same thing in a pipeline does not work:
#Pipeline generation
pipe = Pipeline([('poly2', PolynomialFeatures(degree=2)), ('ExtraTrees', ExtraTreesRegressor(n_estimators=5, max_depth = 3))])
#Feature generation
X = np.random.rand(200,4)
Y = np.random.rand(200)
#Fitting model
clf = pipe
weights = [1]*100 + [2]*100
clf.fit(X,Y, weights)
I get the following error:
TypeError: fit() takes at most 3 arguments (4 given)
In this simple example, it is no issue to modify the code, but when I want to run several different tests on my real data in my real code, being able to use pipelines and sample weight
Solution
The documentation for the fit method of Pipeline mentions **fit_params. You must specify which step of the pipeline the parameter applies to. You can do this by following the naming rule in the docs:
For this, it enables setting parameters of the various steps using their names and the parameter name separated by a ‘__’, as in the example below.
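The same naming rule is used for constructor parameters, which may make the convention easier to see. A minimal sketch, reusing the step names poly2 and ExtraTrees from the question (the new parameter values are arbitrary and only for illustration):

```python
from sklearn.ensemble import ExtraTreesRegressor
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures

pipe = Pipeline([
    ('poly2', PolynomialFeatures(degree=2)),
    ('ExtraTrees', ExtraTreesRegressor(n_estimators=5, max_depth=3)),
])

# '<step name>__<parameter name>' addresses a parameter of a single step
pipe.set_params(ExtraTrees__n_estimators=10, poly2__degree=3)

# get_params() reports the step parameters under the same prefixed names
params = pipe.get_params()
print(params['ExtraTrees__n_estimators'])
print(params['poly2__degree'])
```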
So all that being said, try changing the last line to:
clf.fit(X,Y, **{'ExtraTrees__sample_weight': weights})
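Put together, a self-contained version of the pipeline example could look like the sketch below. The fixed random seed and the keyword-argument spelling of the call are additions for illustration; passing ExtraTrees__sample_weight as a keyword should be equivalent to unpacking the dict shown above.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures

# Random data as in the question; the seed is an assumption added for reproducibility
rng = np.random.RandomState(0)
X = rng.rand(200, 4)
Y = rng.rand(200)
weights = np.array([1] * 100 + [2] * 100)

pipe = Pipeline([
    ('poly2', PolynomialFeatures(degree=2)),
    ('ExtraTrees', ExtraTreesRegressor(n_estimators=5, max_depth=3)),
])

# Route sample_weight to the 'ExtraTrees' step only; the 'poly2' step never sees it
pipe.fit(X, Y, ExtraTrees__sample_weight=weights)

pred = pipe.predict(X)
print(pred.shape)
```

The prefix has to match the step name given in the Pipeline constructor exactly, including case.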
Answered By - Kevin