Issue
trf1=ColumnTransformer([("Infuse_val",SimpleImputer(strategy="mean"),[0])],remainder="passthrough")
trf4=ColumnTransformer([("One_hot",OneHotEncoder(sparse=False,handle_unknown="ignore"),[1,4])],remainder="passthrough")
trf2=ColumnTransformer([("Ord_encode",OrdinalEncoder(categories=["Strong","Mild"]),[3])],remainder="passthrough")
trf3=ColumnTransformer([("scale",StandardScaler(),[0,2])],remainder="passthrough")
pipe = Pipeline([
('trf1',trf1),
('trf2',trf2),
('trf3',trf3),
('trf4',trf4),
])
pipe.fit(x_train,y_train)
Error
ValueError: Shape mismatch: if categories is an array, it has to be of shape (n_features,).
The table is
I don't understand what's the error here in my code?
Solution
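The error isn't about the column transformers, it's about the OrdinalEncoder. categories needs to be a list of lists: for each column, the list of categories in that column. A flat list of two strings is read as one entry per column, i.e. two columns, while only one column is being encoded, hence the shape mismatch. Since you have just one column, categories=[["Strong","Mild"]] should work.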
With just two categories, most subsequent algorithms won't care which one is 0 or 1, so here you could just use the default categories="auto".
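As a minimal sketch of the difference, the snippet below encodes a single made-up column (the array X is hypothetical and only stands in for column 3 of the question's table). It shows the flat list failing, the nested list working, and what the default ordering gives.

```python
import numpy as np
from sklearn.preprocessing import OrdinalEncoder

# Hypothetical one-column data standing in for column 3 of the question's table.
X = np.array([["Strong"], ["Mild"], ["Mild"], ["Strong"]], dtype=object)

# A flat list has length 2, but only 1 column is encoded, so fit raises ValueError.
flat_list_failed = False
try:
    OrdinalEncoder(categories=["Strong", "Mild"]).fit(X)
except ValueError:
    flat_list_failed = True

# One inner list per encoded column; the order of the inner list sets the codes.
explicit = OrdinalEncoder(categories=[["Strong", "Mild"]])
codes_explicit = explicit.fit_transform(X)  # Strong -> 0, Mild -> 1

# The default categories="auto" sorts the observed values: Mild -> 0, Strong -> 1.
auto = OrdinalEncoder()
codes_auto = auto.fit_transform(X)
```

The only difference between the two working encoders is which label gets 0 and which gets 1.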
Finally, you'll have problems with your column transformers. They change the order (and names) of the columns, so by the end of the pipeline, the columns 0 and 2 that you scale might no longer be the two numeric columns. The output order is predictable (each transformer's output in order, followed by the passthrough columns), so you could keep track manually. But I would suggest a single column transformer with multiple pipelines instead.
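One possible shape for that single column transformer is sketched below. The DataFrame, its column names and its values are invented for illustration and only mirror the column positions used in the question. Note that this sketch imputes both numeric columns, whereas the question imputes only column 0, and it uses sparse_threshold=0 on the ColumnTransformer to get a dense array instead of relying on the OneHotEncoder sparse argument.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, OrdinalEncoder, StandardScaler

# Hypothetical data: names and values are assumptions, not the question's table.
X = pd.DataFrame({
    "age": [25.0, np.nan, 40.0, 31.0],              # position 0, has a missing value
    "gender": ["M", "F", "F", "M"],                 # position 1
    "fever": [98.0, 101.0, 100.0, 99.0],            # position 2
    "cough": ["Strong", "Mild", "Mild", "Strong"],  # position 3
    "city": ["A", "B", "A", "C"],                   # position 4
})

# Impute then scale the numeric columns inside one sub-pipeline, so the
# scaler always sees the imputed numeric columns and nothing else.
numeric = Pipeline([
    ("impute", SimpleImputer(strategy="mean")),
    ("scale", StandardScaler()),
])

# Each transformer selects its columns from the ORIGINAL input, so no step
# depends on how a previous step reordered the columns.
preprocess = ColumnTransformer(
    [
        ("num", numeric, ["age", "fever"]),
        ("ord", OrdinalEncoder(categories=[["Strong", "Mild"]]), ["cough"]),
        ("onehot", OneHotEncoder(handle_unknown="ignore"), ["gender", "city"]),
    ],
    sparse_threshold=0,  # always return a dense array
)

Xt = preprocess.fit_transform(X)
```

The output columns follow the order of the transformer list: two scaled numeric columns, one ordinal column, then the one-hot columns. A final estimator could be appended by wrapping preprocess and the model in an outer Pipeline.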
Answered By - Ben Reiniger