Issue
I'm training two autoencoders for a deepfake, and training needs to run for 150,000 epochs. I stopped it at 10,000, but I want it to be able to resume training from the epoch it left off on. Is there a way to do that?
train_setA = video.loading_images(setA_path) / 255.0
train_setB = video.loading_images(setB_path) / 255.0
train_setA += train_setB.mean(axis=(0, 1, 2)) - train_setA.mean(axis=(0, 1, 2))

batch_size = int(len(os.listdir(setA_path)) / 20)

print("press 'q' to stop training and save model")

for epoch in range(1000000):
    batch_size = 64  # note: this overrides the batch_size computed above
    warped_A, target_A = train_util.training_data(train_setA, batch_size)
    warped_B, target_B = train_util.training_data(train_setB, batch_size)
    loss_A = aeA.train_on_batch(warped_A, target_A)
    loss_B = aeB.train_on_batch(warped_B, target_B)
    print(loss_A, loss_B)
    print('Current epoch no... ' + str(epoch))
    if epoch % 100 == 0:
        save_model_weights()
        print('Model weights saved')
    test_A = target_A[0:14]
    test_B = target_B[0:14]
    figure_A = np.stack([
        test_A,
        aeA.predict(test_A),
        aeB.predict(test_A),
    ], axis=1)
    figure_B = np.stack([
        test_B,
        aeB.predict(test_B),
        aeA.predict(test_B),
    ], axis=1)
    figure = np.concatenate([figure_A, figure_B], axis=0)
    figure = figure.reshape((4, 7) + figure.shape[1:])
    figure = train_util.stack_images(figure)
    figure = np.clip(figure * 255, 0, 255).astype('uint8')
    cv2.imshow("", figure)
    key = cv2.waitKey(1)
    if key == ord('q'):
        save_model_weights()
        exit()
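One straightforward approach, independent of any framework feature, is to persist the epoch counter alongside the model weights and read it back on startup. The following is a minimal sketch; `training_state.json` is a hypothetical bookkeeping file name, and the loop shown in comments refers to the training loop above.

```python
import json
import os

STATE_PATH = "training_state.json"  # hypothetical bookkeeping file

def load_start_epoch(path=STATE_PATH):
    """Return the epoch to resume from (0 on a fresh run)."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)["epoch"] + 1
    return 0

def save_start_epoch(epoch, path=STATE_PATH):
    """Record the last completed epoch next to the model weights."""
    with open(path, "w") as f:
        json.dump({"epoch": epoch}, f)

# In the training loop above, the idea would be:
#
#     start_epoch = load_start_epoch()
#     for epoch in range(start_epoch, 1000000):
#         ...
#         if epoch % 100 == 0:
#             save_model_weights()
#             save_start_epoch(epoch)
```

Writing the state only at the same cadence as `save_model_weights()` keeps the counter consistent with whatever checkpoint is actually on disk.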
Solution
Here is what I know about this topic in Keras.
If you save the weights after each epoch (for example, with the ModelCheckpoint callback), then you can load those saved weights later.
For example:
Save:
weight_save_callback = ModelCheckpoint('/path/to/weights.{epoch:02d}-{val_loss:.2f}.hdf5', monitor='val_loss', save_best_only=False)  # set save_best_only=True to keep only the best result
model.fit(X_train, y_train, batch_size=batch_size, epochs=epochs, callbacks=[weight_save_callback])
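As an aside, the `{epoch:02d}` and `{val_loss:.2f}` fields in the checkpoint path are ordinary Python `str.format` placeholders, which ModelCheckpoint fills in with the epoch number and the logged metrics, so you can check how a given pattern expands:

```python
# The checkpoint path is an ordinary Python format string; Keras fills it
# in with the epoch number and the logged metric values.
template = "/path/to/weights.{epoch:02d}-{val_loss:.2f}.hdf5"
filename = template.format(epoch=5, val_loss=0.1234)
print(filename)  # -> /path/to/weights.05-0.12.hdf5
```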
Load:
model = Sequential()
model.add(...)
model.load_weights('/path/to/weights.hdf5')
It is important that the model architecture is identical to the one that produced the saved weights.
Since some optimizers set some of their internal values (for example, the learning rate) using the current epoch value, and you may even have (custom) callbacks that depend on the current epoch, the initial_epoch argument lets you specify which epoch value training should start from.

This is mainly needed when you have trained your model for some epochs, saved it, and then want to load it and resume training for several more epochs without disturbing the state of objects that depend on the epoch (for example, the optimizer). You should set initial_epoch to the number of epochs already completed and epochs to the total target: if we trained the model for, say, 20 epochs and then set initial_epoch = 20 and epochs = 40, everything resumes as if you had trained the model in one continuous session.

However, note that with the built-in Keras optimizers you do not need initial_epoch for the optimizer's sake, since they store and update their state internally (without reference to the current epoch value), and when you save the model the optimizer state is saved with it.
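One way to recover the value to pass as initial_epoch is to parse it back out of a saved checkpoint's filename. Below is a hypothetical helper (not part of Keras), assuming the `weights.{epoch:02d}-{val_loss:.2f}.hdf5` pattern used above:

```python
import re

def epoch_from_checkpoint(filename):
    """Extract the epoch number from a checkpoint filename such as
    'weights.20-0.35.hdf5' (hypothetical helper for the pattern above)."""
    m = re.search(r"weights\.(\d+)-[\d.]+\.hdf5$", filename)
    if m is None:
        raise ValueError("unrecognized checkpoint name: " + filename)
    return int(m.group(1))

start = epoch_from_checkpoint("weights.20-0.35.hdf5")
print(start)  # -> 20
# model.fit(X_train, y_train, epochs=40, initial_epoch=start)
```

Whether the epoch in the filename is the last completed epoch or the next one depends on the Keras version's indexing, so it is worth checking against your own checkpoints before relying on it.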
I hope this helps.
Answered By - Tehnorobot