Issue
Can someone help me understand this problem a bit better? I need to train a neural network that outputs 200 mutually independent categories, each of which is a value ranging from 0 to 1. This looks to me like a binary_crossentropy problem, but every example I see on the internet uses binary_crossentropy with a single output. Since my model has 200 outputs, would applying binary_crossentropy be correct?
This is what I have in mind. Is this a correct approach, or should I change it?
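# Assumes the tf.keras API; with standalone Keras, import from keras instead
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model

# input_shape is the number of input features (defined elsewhere)
inputs = Input(shape=(input_shape,))
hidden = Dense(2048, activation='relu')(inputs)
hidden = Dense(2048, activation='relu')(hidden)
# One sigmoid unit per category, so each output lies in [0, 1] independently
output = Dense(200, name='output_cat', activation='sigmoid')(hidden)
model = Model(inputs=inputs, outputs=[output])
loss_map = {'output_cat': 'binary_crossentropy'}
model.compile(loss=loss_map, optimizer="sgd", metrics=['mae', 'accuracy'])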
Solution
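For multi-class classification problems where each input belongs to exactly one of the categories, you should use categorical_crossentropy with a softmax output layer rather than binary_crossentropy. With this, when your model classifies an input, it is going to give a distribution of probabilities across all 200 categories. The category that receives the highest probability will be the output for that particular input. If the 200 outputs really are mutually independent, as described in the question, a sigmoid output layer with binary_crossentropy is the appropriate setup instead: Keras computes the loss for each output separately and averages the results.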
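To make the difference between the two losses concrete, here is a minimal NumPy sketch. It is a simplified stand-in for the Keras losses, not their actual implementation, and the function names, the random targets, and the chosen class index are all assumptions made for illustration. Binary cross-entropy scores each of the 200 outputs on its own, while categorical cross-entropy treats the 200 values as a single probability distribution.

```python
import numpy as np

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Per-output binary cross-entropy, averaged over the last axis.

    Each output is scored independently of the others.
    """
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    per_output = -(y_true * np.log(y_pred)
                   + (1.0 - y_true) * np.log(1.0 - y_pred))
    return per_output.mean(axis=-1)

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Cross-entropy over one distribution; assumes each row of y_pred sums to 1."""
    y_pred = np.clip(y_pred, eps, 1.0)
    return -(y_true * np.log(y_pred)).sum(axis=-1)

rng = np.random.default_rng(0)
n_outputs = 200

# Case 1: 200 independent targets, each a value in [0, 1] (hypothetical data).
targets = rng.uniform(0.0, 1.0, size=(1, n_outputs))
bce_matching = binary_crossentropy(targets, targets)            # predictions equal targets
bce_constant = binary_crossentropy(targets, np.full_like(targets, 0.5))

# Case 2: exactly one correct category out of 200 (one-hot target).
one_hot = np.zeros((1, n_outputs))
one_hot[0, 3] = 1.0                                             # class index 3 is arbitrary
uniform_pred = np.full((1, n_outputs), 1.0 / n_outputs)
cce_uniform = categorical_crossentropy(one_hot, uniform_pred)

print("BCE, predictions match targets:", bce_matching[0])
print("BCE, constant 0.5 predictions: ", bce_constant[0])
print("CCE, uniform prediction:       ", cce_uniform[0])
```

With fractional targets, the binary loss does not reach zero even when predictions match the targets exactly, but it is lowest there, which is what makes it usable for independent outputs in [0, 1].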
You can see this when you call model.predict(). If you were to call this function on a single input, for example, and print the result, you would see 200 probabilities (summing to 1 when the output layer uses softmax). The hope is that one of those 200 probabilities would be much higher than the others, which signals that the model considers that category very likely to be the correct output for this particular input.
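Whether the 200 predicted values sum to 1 depends on the activation of the output layer. The NumPy sketch below illustrates this under stated assumptions: the random logits stand in for the last layer's pre-activations, and index 42 is a hypothetical strongly favoured category. It does not call Keras itself.

```python
import numpy as np

def softmax(logits):
    """Normalise logits into one probability distribution per row."""
    shifted = logits - logits.max(axis=-1, keepdims=True)  # for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def sigmoid(logits):
    """Squash each logit into (0, 1) independently of the others."""
    return 1.0 / (1.0 + np.exp(-logits))

rng = np.random.default_rng(0)
logits = rng.normal(size=(1, 200))  # stand-in for one input's pre-activations
logits[0, 42] = 10.0                # hypothetical strongly favoured category

softmax_out = softmax(logits)
sigmoid_out = sigmoid(logits)

print("softmax sum:", softmax_out.sum())
print("sigmoid sum:", sigmoid_out.sum())
print("most likely category:", softmax_out.argmax(axis=-1)[0])
```

With softmax, raising one category's probability necessarily lowers the others. With sigmoid, each of the 200 values can move freely, which is the behaviour independent outputs call for.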
This video may help clarify the prediction piece. Printing out the predictions starts around 3:17, but to get the full context, you'll need to start from the beginning.
Answered By - blackHoleDetector