Issue
I tried to train a CNN model on about 700 images in 35 classes with the code below. I don't know where I went wrong, or how to print and check the results once training finishes. The examples I have found only cover 2 classes and use an if-else check on the prediction, so I don't know how this works with many classes.
model = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(16, (3,3), activation = 'relu', input_shape = (200,200,3)),
    tf.keras.layers.MaxPool2D(2,2),
    #
    tf.keras.layers.Conv2D(32, (3,3), activation = 'relu'),
    tf.keras.layers.MaxPool2D(2,2),
    #
    tf.keras.layers.Conv2D(64, (3,3), activation = 'relu'),
    tf.keras.layers.MaxPool2D(2,2),
    #
    tf.keras.layers.Conv2D(128, (3,3), activation = 'relu'),
    tf.keras.layers.MaxPool2D(2,2),
    #
    tf.keras.layers.Conv2D(256, (3,3), activation = 'relu'),
    tf.keras.layers.MaxPool2D(2,2),
    ##
    tf.keras.layers.Flatten(),
    ##
    tf.keras.layers.Dense(512, activation = 'relu', name = 'layer1'),
    ##
    tf.keras.layers.Dense(1, activation = 'sigmoid')
])
model.compile(loss = 'binary_crossentropy',
              optimizer = RMSprop(learning_rate=0.001),
              metrics = ['accuracy'])
model_fit = model.fit(train_dataset,
                      steps_per_epoch = 16,
                      epochs = 100,
                      )
Solution
For multi-class classification you should use a softmax activation function. The softmax function normalizes a set of K real numbers into a probability distribution, so that the outputs sum to 1. For K = 2, softmax is equivalent to the sigmoid function.
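As a small illustration of what softmax does, here is a plain-Python sketch (not how Keras implements it internally; the helper names and the input numbers are made up for this example). It shows that the outputs sum to 1 and that, with two classes, softmax over [z, 0] gives the same value as sigmoid(z).

```python
import math

def softmax(logits):
    """Turn a list of real numbers into probabilities that sum to 1."""
    largest = max(logits)
    # Subtracting the max does not change the result but avoids overflow.
    exps = [math.exp(x - largest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(x):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))

# Three made-up scores, one per class.
probs = softmax([2.0, 1.0, 0.1])
print(probs, sum(probs))

# Two-class case: softmax over [z, 0] matches sigmoid(z).
z = 1.5
print(softmax([z, 0.0])[0], sigmoid(z))
```

The largest score always gets the largest probability, which is why the predicted class is simply the index of the biggest output.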
So the problem is that you are using a sigmoid activation function instead of a softmax. Even if you think you are training on 35 different classes, the last layer has a single sigmoid unit and the loss is binary, so the network can only ever separate two labels, which is clearly not what you want.
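I suggest replacing the last part of your code with something like this:
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False)
model.compile(optimizer = RMSprop(learning_rate=0.001),
              loss = loss_fn,
              metrics = ['accuracy'])
And use activation = 'softmax' instead of 'sigmoid' on your last Dense layer, giving that layer 35 units (one per class) instead of 1. Note that SparseCategoricalCrossentropy expects integer class labels; if your labels are one-hot encoded, use CategoricalCrossentropy instead.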
Then, to make a prediction (that is, to have the model infer the class of your test samples), you simply do:
predictions = model.predict(x_test)
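With a softmax output, each row of predictions holds one probability per class, so no if-else is needed: you take the index of the largest value. The sketch below uses a small hand-written array as a stand-in for the output of model.predict; the numbers, class_names, and y_true are hypothetical and only there to make the example runnable on its own.

```python
import numpy as np

# Stand-in for model.predict(x_test): 3 samples, 4 classes (made-up numbers).
predictions = np.array([
    [0.10, 0.70, 0.15, 0.05],
    [0.60, 0.20, 0.10, 0.10],
    [0.05, 0.05, 0.10, 0.80],
])
class_names = ["class_a", "class_b", "class_c", "class_d"]  # hypothetical names
y_true = np.array([1, 0, 2])  # hypothetical integer labels for the 3 samples

# Index of the most probable class for each sample.
predicted_ids = np.argmax(predictions, axis=1)
# Probability the model assigned to that class.
confidences = predictions.max(axis=1)

for idx, conf in zip(predicted_ids, confidences):
    print(f"{class_names[idx]} ({conf:.2f})")

# Fraction of samples where the predicted class matches the label.
accuracy = float(np.mean(predicted_ids == y_true))
print("accuracy:", accuracy)
```

With your model the array would have 35 columns, and class_names would need to follow the same order as the integer labels used during training.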
Answered By - claudia