Issue
When I retrain a binary classification model that has already been trained once, it always converges to predicting 0.5. After the first epoch of retraining, the model already outputs nearly the same value (about 0.68) for every input, and it then slowly drifts towards 0.5.
The model is loaded through keras.models.load_model("oldModel.h5") and cloned through keras.models.clone_model(model) so that the old model stays available for use.
The dataset is loaded in exactly the same way during retraining as in the initial training run: (1) the filenames and labels are loaded into pandas Series, (2) these are converted to tensors, and (3) the images are loaded through a mapped function wrapped in partial, with caching and prefetching to help training performance.
Apart from a lower learning rate, all other settings, such as the trainable layers and the loss function, are the same in both training runs. Only the top few layers of the model are trainable.
I have inspected the training dataset to confirm that what it yields is correct, and it is as expected.
Code to load dataset
X_train_file = self.df[self.df.subset=="train"].fileName
Y1 = self.df[self.df.subset=="train"].meanElicat
Y2 = self.df[self.df.subset=="train"].finalY.astype(int)
X_test_file = self.df[self.df.subset=="val"].fileName
Y1_test = self.df[self.df.subset=="val"].meanElicat
Y2_test = self.df[self.df.subset=="val"].finalY.astype(int)
train_image_paths = tf.convert_to_tensor(X_train_file, dtype=tf.string)
train_Y1 = tf.convert_to_tensor(Y1)
train_Y2 = tf.convert_to_tensor(Y2)
test_image_paths = tf.convert_to_tensor(X_test_file, dtype=tf.string)
test_Y1 = tf.convert_to_tensor(Y1_test)
test_Y2 = tf.convert_to_tensor(Y2_test)
train = tf.data.Dataset.from_tensor_slices( ( train_image_paths, (train_Y1,train_Y2) ) )
test = tf.data.Dataset.from_tensor_slices( ( test_image_paths, (test_Y1,test_Y2) ) )
def map_fn(path, label):
    image = tf.io.decode_jpeg(tf.io.read_file(path))
    image = tf.image.resize(image, [300, 300])
    return image, label
self.train_ds = train.map(partial(map_fn), num_parallel_calls=tf.data.experimental.AUTOTUNE).batch(16).cache().prefetch(tf.data.experimental.AUTOTUNE)
self.val_ds = test.map(partial(map_fn), num_parallel_calls=tf.data.experimental.AUTOTUNE).batch(16).cache().prefetch(tf.data.experimental.AUTOTUNE)
Code to train and fit the model
self.model.compile(tf.keras.optimizers.Adam(eval(lr)),
                   loss={'classification': tf.keras.losses.BinaryCrossentropy()},
                   metrics={'classification': ["accuracy", tf.keras.metrics.AUC()]})
filepath = os.path.join("weights", f"{str(weightName)}_{blocks:02d}_CW[{cw}]" +
                        ".E[{epoch}]_{val_loss:.4f}.h5")
callbacks = [
    CSVLogger(weights, blocks, cw, self.logName),
    tf.keras.callbacks.EarlyStopping(patience=patience),
    tf.keras.callbacks.ModelCheckpoint(filepath,
                                       save_best_only=True,
                                       save_weights_only=False)]
try:
    self.model.fit(self.train_ds, epochs=epoch, callbacks=callbacks,
                   validation_data=self.val_ds,
                   verbose=2)
except Exception as e:
    print(traceback.format_exc())
Output of retrained model after 1 epoch
array([[0.6818546 ],
[0.6817692 ],
[0.68143094],
[0.6824522 ],
[0.6820409 ],
[0.6816176 ],
[0.68077767],
[0.68115866],
...
Output of retrained model after 11 epochs
array([[0.4997447 ],
[0.49965417],
[0.5004351 ],
[0.49858376],
[0.49974793],
[0.500144 ],
[0.50129014],
[0.5004081 ],
...
Output of model chosen for retraining after first training process
array([[0.01635163],
[0.8146548 ],
[0.08911347],
[0.03006527],
[0.04414936],
...
[0.8874662 ],
[0.37499326],
[0.98350084],
[0.9966594 ],
[0.09798203]], dtype=float32)
Thank you for any help rendered!
Solution
The issue came from keras.models.clone_model(model), which requires the input tensor to be specified. To save time, I now just load the model from file whenever I want to use it.
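One likely reason the clone behaved like an untrained network: as far as I know, keras.models.clone_model rebuilds the architecture with newly created layers, and therefore newly initialized weights, rather than copying the trained ones. The sketch below illustrates this on a small hypothetical model (the layer sizes and random input are made up for illustration and are not the model from the question), and shows one possible alternative to reloading from file: copying the weights across with set_weights.

```python
import numpy as np
import tensorflow as tf

# Hypothetical stand-in for the model loaded from "oldModel.h5".
inputs = tf.keras.Input(shape=(4,))
hidden = tf.keras.layers.Dense(8, activation="relu")(inputs)
outputs = tf.keras.layers.Dense(1, activation="sigmoid", name="classification")(hidden)
model = tf.keras.Model(inputs, outputs)

# Made-up input batch, only used to compare predictions.
x = np.random.rand(5, 4).astype("float32")
old_pred = model.predict(x, verbose=0)

# The clone has the same architecture but its own freshly initialized weights.
clone = tf.keras.models.clone_model(model)
weights_match_before = all(
    np.array_equal(a, b)
    for a, b in zip(model.get_weights(), clone.get_weights())
)

# Copy the original weights into the clone explicitly.
clone.set_weights(model.get_weights())
new_pred = clone.predict(x, verbose=0)
preds_match_after = np.allclose(old_pred, new_pred)

print("weights identical right after cloning:", weights_match_before)
print("predictions identical after set_weights:", preds_match_after)
```

If this applies to the setup in the question, the clone would also need to be compiled again before calling fit, since compilation state is not part of what clone_model reproduces.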
Answered By - vernal123