Issue
I am splitting a dataset into train and test sets using:
X_train, X_test, y_train, y_test = train_test_split(X.values, y.values, test_size = 0.20, random_state=99)
However, the train and test sets have no column names or index labels after the split. How can I restore them?
Solution
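from sklearn.model_selection import train_test_split

# Store the labels before the split; .values returns plain NumPy arrays
X_names = X.columns
y_name = y.name
X_train, X_test, y_train, y_test = train_test_split(X.values, y.values,
                                                    test_size=0.20, random_state=99)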
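Store the names before performing the split (this assumes that X and y are pandas objects). X.values and y.values are plain NumPy arrays, which is why the labels disappear; the stored names can be used afterwards to rebuild the DataFrames. Alternatively, you don't need to pass in X.values and y.values at all. Pass in X and y directly, and train_test_split returns pandas objects with the column names and index intact.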
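Both approaches can be sketched as follows. This is a minimal illustration, not the asker's actual data: the toy X and y, the column names "a" and "b", the name "target", and the *_restored variable names are all assumptions made up for the example. The second option also passes the index through the split as an extra array, so the row labels can be reattached.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical toy data standing in for the asker's X and y.
X = pd.DataFrame({"a": range(10), "b": range(10, 20)},
                 index=[f"row{i}" for i in range(10)])
y = pd.Series(range(10), index=X.index, name="target")

# Option 1: pass the pandas objects directly; labels survive the split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, random_state=99)

# Option 2: split NumPy arrays, carrying the index along as a third array,
# then rebuild labelled pandas objects from the stored names.
X_names, y_name = X.columns, y.name
Xa_train, Xa_test, ya_train, ya_test, idx_train, idx_test = train_test_split(
    X.values, y.values, X.index.to_numpy(), test_size=0.20, random_state=99)
X_train_restored = pd.DataFrame(Xa_train, columns=X_names, index=idx_train)
y_train_restored = pd.Series(ya_train, name=y_name, index=idx_train)
```

Option 1 is usually the simpler choice; Option 2 is only worth the extra steps if NumPy arrays are needed in between.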
Answered By - Zakariah Siyaji