Issue
I am implementing an MLPClassifier and want to give it strings as input.
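import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# `results` is defined elsewhere
df = pd.DataFrame(results)
X = df.iloc[:, [2]].values  # input column (strings)
y = df.iloc[:, [1]].values  # label column
X_train, X_test, y_train, y_test = train_test_split(X, y)
clf = MLPClassifier(random_state=6, max_iter=200).fit(X_train, y_train.ravel())
clf.predict(X_test)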
I am getting this error
Solution
Since you are using a pandas DataFrame, this is easy to do. Getting the class label vector y is straightforward. Say the column name is 'label':
y = df['label'].factorize()[0]
If the column has no name, use the column number instead (in your case df[1], or df.iloc[:, 1] to select by position).
Wondering why I took [0] from the factorization? pandas.factorize returns not only the codes, which we need here, but also the unique values of that column that were coded (uniques).
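As a small illustration of the two return values, here is a sketch on made-up data. The column name 'label' follows the example above; the values in it are assumptions for the demo, not taken from the question.

```python
import pandas as pd

# Hypothetical data: the values in the 'label' column are made up.
df = pd.DataFrame({"label": ["spam", "ham", "spam", "eggs", "ham"]})

# factorize returns a tuple: (integer codes, unique values in order of first appearance).
codes, uniques = df["label"].factorize()

y = codes  # same array as df["label"].factorize()[0]

# The uniques let you map codes (for example, predictions) back to the original strings.
decoded = uniques.take(codes)
```

Keeping uniques around is useful later: the classifier will predict integer codes, and uniques is what turns them back into readable labels.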
Likewise, if any input feature column in the feature matrix X is categorical (and non-numeric), encode it numerically. There are two types of encoding for categorical variables:
- Label encoding: If the values of that feature have an order or hierarchy, use this encoding.
- One-hot encoding: If the values of that feature do not have any order or hierarchy, use this encoding technique.
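The two encodings can be sketched with pandas alone. This is only one possible way to do it; the column names ('size', 'color'), their values, and the chosen order are all assumptions for illustration.

```python
import pandas as pd

# Hypothetical feature matrix: column names and values are made up.
X = pd.DataFrame({
    "size": ["small", "large", "medium", "small"],  # has a natural order
    "color": ["red", "green", "blue", "green"],     # no natural order
})

# Label encoding: state the order explicitly so the integer codes respect it.
size_order = ["small", "medium", "large"]
X["size"] = pd.Categorical(X["size"], categories=size_order, ordered=True).codes

# One-hot encoding: one 0/1 column per distinct value of 'color'.
X = pd.get_dummies(X, columns=["color"], dtype=int)

# Fully numeric array, in a form an estimator such as MLPClassifier can accept.
X_numeric = X.to_numpy()
```

The order is passed explicitly because, without it, the codes would follow the sorted order of the strings, which would not match small < medium < large.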
Answered By - hafiz031