Sunday, February 6, 2022

[FIXED] Why can't sklearn MLPClassifier predict xor?

machine-learning, neural-network, python, scikit-learn, tensorflow

Issue

In theory, an MLP with a single hidden layer of just 3 neurons is enough to predict XOR correctly. It may sometimes fail to converge, but 4 neurons are a safe bet.

Here's an example in the TensorFlow Playground.

I've tried to reproduce this using sklearn.neural_network.MLPClassifier:

from sklearn import neural_network
from sklearn.metrics import accuracy_score, precision_score, recall_score
import numpy as np


x_train = np.random.uniform(-1, 1, (10000, 2))
tmp = x_train > 0
y_train = 2 * (tmp[:, 0] ^ tmp[:, 1]) - 1

model = neural_network.MLPClassifier(
    hidden_layer_sizes=(3,), n_iter_no_change=100,
    learning_rate_init=0.01, max_iter=1000
).fit(x_train, y_train)

x_test = np.random.uniform(-1, 1, (1000, 2))
tmp = x_test > 0
y_test = 2 * (tmp[:, 0] ^ tmp[:, 1]) - 1

prediction = model.predict(x_test)
print(f'Accuracy: {accuracy_score(y_pred=prediction, y_true=y_test)}')
print(f'recall: {recall_score(y_pred=prediction, y_true=y_test)}')
print(f'precision: {precision_score(y_pred=prediction, y_true=y_test)}')

I only get around 0.75 accuracy, while the TensorFlow Playground model is perfect. Any idea what makes the difference?

I also tried using TensorFlow:

# numpy and the sklearn metrics are imported as in the first snippet
import tensorflow as tf

model = tf.keras.Sequential(layers=[
    tf.keras.layers.Input(shape=(2,)),
    tf.keras.layers.Dense(4, activation='relu'),
    tf.keras.layers.Dense(1)
])

model.compile(loss=tf.keras.losses.binary_crossentropy)

x_train = np.random.uniform(-1, 1, (10000, 2))
tmp = x_train > 0
y_train = (tmp[:, 0] ^ tmp[:, 1])

model.fit(x=x_train, y=y_train)

x_test = np.random.uniform(-1, 1, (1000, 2))
tmp = x_test > 0
y_test = (tmp[:, 0] ^ tmp[:, 1])

prediction = model.predict(x_test) > 0.5
print(f'Accuracy: {accuracy_score(y_pred=prediction, y_true=y_test)}')
print(f'recall: {recall_score(y_pred=prediction, y_true=y_test)}')
print(f'precision: {precision_score(y_pred=prediction, y_true=y_test)}')

With this model I get results similar to the scikit-learn model, so it's not just a scikit-learn issue. Am I missing some important hyperparameter?

Edit

OK, I changed the loss from cross-entropy to mean squared error, and the TensorFlow example now reaches 0.92 accuracy. I guess that's the problem with the MLPClassifier?

Solution

Increasing the learning rate (learning_rate_init) and/or the maximum number of iterations (max_iter) seems to make the sklearn version work. Different solvers probably need different values for these, and it's not clear to me what the TF Playground is using.
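
One way to check this on your own machine is to sweep a few settings and compare test accuracy. The snippet below is only a sketch, not the answerer's code: the helper names make_xor and xor_accuracy, the specific (learning_rate_init, max_iter) pairs, the smaller training set, and the fixed seed are all assumptions chosen for illustration. It uses 4 hidden units, following the "safe bet" from the question. Results depend on the random initialization, so no particular accuracy is promised.

```python
import warnings

import numpy as np
from sklearn.exceptions import ConvergenceWarning
from sklearn.metrics import accuracy_score
from sklearn.neural_network import MLPClassifier


def make_xor(n_samples, rng):
    """Sample points in [-1, 1]^2 and label them -1/+1 by quadrant XOR."""
    x = rng.uniform(-1, 1, (n_samples, 2))
    positive = x > 0
    y = 2 * (positive[:, 0] ^ positive[:, 1]) - 1
    return x, y


def xor_accuracy(learning_rate_init, max_iter, hidden_units=4,
                 n_train=2000, seed=0):
    """Fit an MLPClassifier on the XOR data and return its test accuracy."""
    rng = np.random.default_rng(seed)
    x_train, y_train = make_xor(n_train, rng)
    x_test, y_test = make_xor(1000, rng)

    model = MLPClassifier(
        hidden_layer_sizes=(hidden_units,),
        learning_rate_init=learning_rate_init,
        max_iter=max_iter,
        n_iter_no_change=100,
        random_state=seed,  # assumption: fixed seed for repeatable runs
    )
    # Short runs may stop before convergence; silence that warning here.
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", category=ConvergenceWarning)
        model.fit(x_train, y_train)

    return accuracy_score(y_true=y_test, y_pred=model.predict(x_test))


if __name__ == "__main__":
    # Illustrative settings only: sklearn's defaults first, then larger values.
    for lr, iters in [(0.001, 200), (0.01, 1000), (0.05, 2000)]:
        acc = xor_accuracy(lr, iters)
        print(f"learning_rate_init={lr}, max_iter={iters}: accuracy={acc:.3f}")
```

Changing the seed argument a few times is worthwhile, since a single run can land in a poor local solution regardless of the learning rate.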

Answered By - Ben Reiniger

This answer was collected from Stack Overflow and tested by PythonFixing community admins; it is licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
