Issue
I have trained a model using the Functional API and two different pre-trained models: EfficientNet B5 and MobileNet V2. After training, I'm running an application that loads the saved model to make predictions.
I have a doubt about the correct way to pass the images as arguments to "model.predict()".
Model:
self.feature_extractor1 = EfficientNetB5(#weights='imagenet',
    input_shape=self.input_shape,
    include_top=False)
self.feature_extractor2 = MobileNetV2(#weights='imagenet',
    input_shape=self.input_shape,
    include_top=False)
for layer in self.feature_extractor1.layers:
    layer.trainable = False
for layer in self.feature_extractor2.layers:
    layer.trainable = False
input_ = Input(shape=self.input_shape)
processed_input1 = b5_preprocess_input(input_)
processed_input2 = mbv2_preprocess_input(input_)
x1 = self.feature_extractor1(processed_input1)
x1 = GlobalAveragePooling2D()(x1)
x1 = Dropout(0.2)(x1)
x1 = Flatten()(x1)
x2 = self.feature_extractor2(processed_input2)
x2 = GlobalAveragePooling2D()(x2)
x2 = Dropout(0.2)(x2)
x2 = Flatten()(x2)
x = Concatenate()([x1, x2])
x = Dense(512, activation='relu')(x) #,kernel_initializer=initializer,kernel_regularizer=regularizers.l2(0.001))
x = Dense(1024, activation='relu')(x)
output_shape = Dense(shape_categories, activation='softmax', name='shape')(x)
model = Model(inputs=input_,
              outputs=output_shape)
adam_kwargs = {'beta_1': 0.9, 'beta_2': 0.9, 'epsilon': 1e-7}
sgd_kwargs = {'decay': 1e-6, 'momentum': 0.9, 'nesterov': True}
optimizer = self.optimizers(kwargs=adam_kwargs)
model.compile(loss='categorical_crossentropy',
              optimizer=optimizer,
              metrics=['accuracy'])
model.summary()
STEP_SIZE_TRAIN = self.phase_gen[0].n // self.phase_gen[0].batch_size
STEP_SIZE_VALID = self.phase_gen[1].n // self.phase_gen[1].batch_size
if self.phases == 3:
    STEP_SIZE_TEST = self.phase_gen[2].n // self.phase_gen[2].batch_size
checkpoint = ModelCheckpoint(self.model_dir,
                             monitor='val_accuracy',
                             verbose=1,
                             save_best_only=True,
                             mode='max')
tensorboard = TensorBoard(log_dir=self.model_dir + '/logs',
                          histogram_freq=5,
                          embeddings_freq=5)
#[EarlyStopping(monitor='val_loss', patience=8)]
callbacks = [checkpoint, tensorboard]
# Note: fit_generator is deprecated in TF 2.x; model.fit accepts generators directly.
hist = model.fit_generator(generator=self.phase_gen[0],
                           steps_per_epoch=STEP_SIZE_TRAIN,
                           validation_data=self.phase_gen[1],
                           validation_steps=STEP_SIZE_VALID,
                           epochs=self.epochs,
                           callbacks=callbacks
                           )
In another script, I have the prediction method:
import io
import numpy as np
from flask import request
from PIL import Image
from tensorflow.keras.preprocessing import image
from tensorflow.keras.applications.mobilenet_v2 import preprocess_input as mbv2_preprocess_input
from tensorflow.keras.applications.efficientnet import preprocess_input as b5_preprocess_input
def preprocess_image(img):
    img = Image.open(io.BytesIO(img))
    # Image.ANTIALIAS was removed in Pillow 10; Image.LANCZOS is the same filter.
    img = img.resize((224, 224), Image.LANCZOS)
    img = image.img_to_array(img)
    img = np.expand_dims(img, axis=0)
    #return [b5_preprocess_input(img), mbv2_preprocess_input(img)]
    return [img, img]
modelSHP = get_modelSHP()
@app.route('/part_numbers', methods=['POST'])
def part_number():
    img = request.files.get('image').read()
    processed_image = preprocess_image(img)
    predict_shape = modelSHP.predict(processed_image)
My first thought was that I would need to pass the input image preprocessed by the correct function for each branch, in the same order I used during model training. But when I did that, my prediction accuracy stayed around zero. When I passed just the image, without any preprocessing, the results got better.
Is it right to pass the image to model.predict without preprocessing? I was wondering whether, given the Functional API and the way I built the model, the preprocessing became a layer in each branch of the model.
Solution
I copied your code and printed out the model summary, shown below:
Model: "functional_5"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to
==================================================================================================
input_23 (InputLayer)           [(None, 224, 224, 3) 0
__________________________________________________________________________________________________
tf.math.truediv_5 (TFOpLambda)  (None, 224, 224, 3)  0           input_23[0][0]
__________________________________________________________________________________________________
tf.math.subtract_5 (TFOpLambda) (None, 224, 224, 3)  0           tf.math.truediv_5[0][0]
__________________________________________________________________________________________________
efficientnetb5 (Functional)     (None, 7, 7, 2048)   28513527    input_23[0][0]
__________________________________________________________________________________________________
mobilenetv2_1.00_224 (Functiona (None, 7, 7, 1280)   2257984     tf.math.subtract_5[0][0]
__________________________________________________________________________________________________
global_average_pooling2d_8 (Glo (None, 2048)         0           efficientnetb5[0][0]
__________________________________________________________________________________________________
global_average_pooling2d_9 (Glo (None, 1280)         0           mobilenetv2_1.00_224[0][0]
__________________________________________________________________________________________________
dropout_8 (Dropout)             (None, 2048)         0           global_average_pooling2d_8[0][0]
__________________________________________________________________________________________________
dropout_9 (Dropout)             (None, 1280)         0           global_average_pooling2d_9[0][0]
__________________________________________________________________________________________________
flatten_8 (Flatten)             (None, 2048)         0           dropout_8[0][0]
__________________________________________________________________________________________________
flatten_9 (Flatten)             (None, 1280)         0           dropout_9[0][0]
__________________________________________________________________________________________________
concatenate_3 (Concatenate)     (None, 3328)         0           flatten_8[0][0]
                                                                 flatten_9[0][0]
__________________________________________________________________________________________________
dense_6 (Dense)                 (None, 512)          1704448     concatenate_3[0][0]
__________________________________________________________________________________________________
dense_7 (Dense)                 (None, 1024)         525312      dense_6[0][0]
__________________________________________________________________________________________________
shape (Dense)                   (None, 2)            2050        dense_7[0][0]
==================================================================================================
Total params: 33,003,321
Trainable params: 2,231,810
Non-trainable params: 30,771,511
As you suspected, the preprocessing functions become layers in the model, so for predictions you do not need to preprocess the input yourself; that step is built into the model. For EfficientNet, the preprocessing function is simply a pass-through, because EfficientNet expects input pixels in the range 0 to 255. That is why the model summary shows the input (input_23) feeding directly into efficientnetb5. For MobileNetV2, the preprocessing function scales the pixels to the range -1 to +1 using scaled_pixel = pixel / 127.5 - 1. Accordingly, layer tf.math.truediv_5 divides input_23 by 127.5, and layer tf.math.subtract_5 then subtracts 1.
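To illustrate why preprocessing the image yourself can hurt, here is a minimal NumPy sketch (not part of the original answer). The helper name mobilenet_v2_scale is hypothetical and only reproduces the arithmetic pixel / 127.5 - 1 seen in the two TFOpLambda layers; it does not call Keras. Applying the scaling once maps 0..255 onto -1..+1, while applying it a second time, as would happen if an already-scaled image were fed to this model, squeezes every pixel into a narrow band around -1. That is one plausible explanation for the near-zero accuracy, not a confirmed diagnosis.

```python
import numpy as np

def mobilenet_v2_scale(pixels):
    """Hypothetical helper: mimic truediv (/127.5) followed by subtract (-1)."""
    return pixels / 127.5 - 1.0

# Three sample pixel values spanning the raw 0..255 range.
raw = np.array([0.0, 127.5, 255.0], dtype=np.float32)

# Scaling done once, as the model does internally.
once = mobilenet_v2_scale(raw)

# Scaling done twice: external preprocessing plus the in-model layers.
twice = mobilenet_v2_scale(once)

print("once :", once)
print("twice:", twice)
print("spread after double scaling:", float(twice.max() - twice.min()))
```

Since the model as defined has a single Input layer with the scaling built in, passing the raw 0 to 255 image batch to model.predict should be sufficient.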
Answered By - Gerry P