Issue
I have a fairly large dataframe (300 columns) and I'm using sklearn to encode/scale some fields. I like that I can choose the specific columns I want and have it drop the rest. My problem is that I now have two columns in my large dataframe that hold numpy arrays, and I would like those passed through while the other columns I don't list in the sklearn pipeline are dropped.
For example:
from sklearn.preprocessing import OneHotEncoder
from sklearn.compose import ColumnTransformer
# One-hot encode column 1 (the country) and keep every other column unchanged
ct = ColumnTransformer([("Country", OneHotEncoder(), [1])], remainder='passthrough')
This converts the country column to one-hot and passes through every other column. If I have a column called "numpy_array", how can I get only that one passed through?
Solution
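To pass through only the "numpy_array" column, list it as its own transformer with the string 'passthrough' in place of an estimator, and set remainder='drop' so that every column you don't list is dropped: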
from sklearn.compose import ColumnTransformer
ct = ColumnTransformer(
    transformers=[
        ('np_array_transform', 'passthrough', ['numpy_array']),
    ],
    remainder='drop',
)
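As a rough sketch of how this might combine with the encoder from the question, the example below one-hot encodes a country column, passes the array column through, and drops the rest. The toy dataframe, the "unused" column, and the "country" transformer name are made up for illustration, and sparse_threshold=0 is an assumption added here so the result is always a dense array.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder

# Hypothetical toy data: "Country" gets encoded, "numpy_array" is kept as-is,
# and "unused" stands in for the many columns that should be dropped.
df = pd.DataFrame({
    "Country": ["FR", "DE", "FR"],
    "numpy_array": [np.array([1, 2]), np.array([3, 4]), np.array([5, 6])],
    "unused": [0.1, 0.2, 0.3],
})

ct = ColumnTransformer(
    transformers=[
        ("country", OneHotEncoder(), ["Country"]),
        ("np_array_transform", "passthrough", ["numpy_array"]),
    ],
    remainder="drop",      # columns not listed above are discarded
    sparse_threshold=0,    # assumption: always return a dense array
)

out = ct.fit_transform(df)
print(out.shape)  # two one-hot columns plus the passed-through column
```

Because the passed-through cells are numpy arrays, the stacked result is an object-dtype array: the one-hot columns come first, followed by the array column in the order the transformers are listed.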
Answered By - Sanjar Adylov