Issue
I'm working with the example notebook that TensorFlow provides on working with time-series data:
https://www.tensorflow.org/tutorials/structured_data/time_series
Everything is going fine; I just have a quick question about saving and loading models. For my current research, I need to train a model, save it, and then reload it for testing at a later time.
The entire code for the notebook can be found at the link above, but essentially the training and compiling process involves the following method, where a model and a window object are passed in:
MAX_EPOCHS = 20

def compile_and_fit(model, window, patience=2):
    early_stopping = tf.keras.callbacks.EarlyStopping(monitor='val_loss',
                                                      patience=patience,
                                                      mode='min')
    model.compile(loss=tf.keras.losses.MeanSquaredError(),
                  optimizer=tf.keras.optimizers.Adam(),
                  metrics=[tf.keras.metrics.MeanAbsoluteError()])
    history = model.fit(window.train, epochs=MAX_EPOCHS,
                        validation_data=window.val,
                        callbacks=[early_stopping])
    return history
The model in question looks like:
conv_model = tf.keras.Sequential([
    tf.keras.layers.Conv1D(filters=32,
                           kernel_size=(CONV_WIDTH,),
                           activation='relu'),
    tf.keras.layers.Dense(units=32, activation='relu'),
    tf.keras.layers.Dense(units=1),
])
In the notebook, this is essentially the code that runs the training/compiling method and evaluates the model:
history = compile_and_fit(conv_model, conv_window)
IPython.display.clear_output()
val_performance['Conv'] = conv_model.evaluate(conv_window.val)
performance['Conv'] = conv_model.evaluate(conv_window.test, verbose=0)
After this, the model is tested on a wider window as follows:
wide_window = WindowGenerator(
    input_width=24, label_width=24, shift=1,
    label_columns=['T (degC)'])
print("Wide window")
print('Input shape:', wide_window.example[0].shape)
print('Labels shape:', wide_window.example[1].shape)
print('Output shape:', conv_model(wide_window.example[0]).shape)
This part works fine, but if I add two lines to save and reload the model, as shown,
history = compile_and_fit(conv_model, conv_window)
conv_model.save('test.keras')
conv_model = tf.keras.models.load_model('test.keras')
IPython.display.clear_output()
val_performance['Conv'] = conv_model.evaluate(conv_window.val)
performance['Conv'] = conv_model.evaluate(conv_window.test, verbose=0)
and then run
print("Wide window")
print('Input shape:', wide_window.example[0].shape)
print('Labels shape:', wide_window.example[1].shape)
print('Output shape:', conv_model(wide_window.example[0]).shape)
I receive the following error.
Wide window
Input shape: (32, 24, 19)
Labels shape: (32, 24, 1)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[58], line 4
2 print('Input shape:', wide_window.example[0].shape)
3 print('Labels shape:', wide_window.example[1].shape)
----> 4 print('Output shape:', conv_model(wide_window.example[0]).shape)
File ~/py38-env/lib/python3.8/site-packages/keras/src/utils/traceback_utils.py:70, in filter_traceback.<locals>.error_handler(*args, **kwargs)
67 filtered_tb = _process_traceback_frames(e.__traceback__)
68 # To get the full stack trace, call:
69 # `tf.debugging.disable_traceback_filtering()`
---> 70 raise e.with_traceback(filtered_tb) from None
71 finally:
72 del filtered_tb
File ~/py38-env/lib/python3.8/site-packages/keras/src/engine/input_spec.py:298, in assert_input_compatibility(input_spec, inputs, layer_name)
296 if spec_dim is not None and dim is not None:
297 if spec_dim != dim:
--> 298 raise ValueError(
299 f'Input {input_index} of layer "{layer_name}" is '
300 "incompatible with the layer: "
301 f"expected shape={spec.shape}, "
302 f"found shape={display_shape(x.shape)}"
303 )
ValueError: Input 0 of layer "sequential_3" is incompatible with the layer: expected shape=(None, 3, 19), found shape=(32, 24, 19)
This also occurs when I save and reload using the .h5 file format. Even if I change the file name and try again, it still throws an error. Note that the window it is trained on is
CONV_WIDTH = 3
conv_window = WindowGenerator(
input_width=CONV_WIDTH,
label_width=1,
shift=1,
label_columns=['T (degC)'])
but it should generalize to wider windows, and indeed does when the model is not saved and reloaded.
Any insight into why this is occurring would be greatly appreciated, thanks!
Solution
It seems that after saving, the input_shape of the model is no longer flexible. You can see the expected input shape with print(conv_model.input_shape).
If we go through the training, we can see how the input_shape changes. Note that this is without saving at the moment:
conv_model = tf.keras.Sequential([...])
After model creation, conv_model has no .input_shape, as there is neither an Input layer nor an input_shape=(...) parameter on the first (Conv1D) layer. The model has not seen any data yet and has no idea what input shape to expect.
print('Output shape:', conv_model(wide_window.example[0]).shape)
Now the model has seen data, and we get conv_model.input_shape == (32, 3, 19). This is a very explicit shape; normally the first dimension, the batch dimension, would be None, indicating a flexible size in that dimension. That is because the last batch is not guaranteed to be of length 32 (with batch_size=32), but could be the remainder of the data.
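To see why the batch dimension needs to stay flexible, consider the arithmetic of splitting a dataset into batches: the last batch holds whatever remains. A quick plain-Python sketch (the dataset length of 100 is a made-up example, not from the tutorial):

```python
# Sizes of the batches produced when splitting 100 samples into batches of 32.
# A fixed batch dimension of 32 would reject the shorter final batch.
dataset_len = 100  # hypothetical number of training windows
batch_size = 32

batch_sizes = [min(batch_size, dataset_len - start)
               for start in range(0, dataset_len, batch_size)]
print(batch_sizes)  # [32, 32, 32, 4]
```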
history = compile_and_fit(conv_model, conv_window)
Now the model has seen the full data, with different batch lengths, and we get conv_model.input_shape == (None, 3, 19). The window length and the features are still fixed, as they are the same for every step, but the first dimension has become flexible.
print('Output shape:', conv_model(wide_window.example[0]).shape)
If we give the model the wide_window as input, the input shape changes again: conv_model.input_shape == (None, None, 19). Now the time axis is flexible too, as the previous value of 3 no longer fits. Note that this only works because the model had no fixed input shape to begin with. If you add a tf.keras.layers.Input((3, 19)) layer to the Sequential model, the same error will occur as in your question.
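The reason the model can generalize at all is that a Conv1D layer only fixes the kernel width, not the sequence length: a kernel of width 3 slides over any input of length >= 3 and produces length - 3 + 1 output steps. A minimal NumPy sketch of a "valid" 1D convolution illustrates this (single feature, single filter, made-up kernel weights, not the tutorial's trained model):

```python
import numpy as np

def conv1d_valid(x, kernel):
    """'Valid' 1D convolution: output length = len(x) - len(kernel) + 1."""
    k = len(kernel)
    return np.array([np.dot(x[i:i + k], kernel) for i in range(len(x) - k + 1)])

kernel = np.array([1.0, 0.0, -1.0])           # hypothetical weights, width 3
short = conv1d_valid(np.arange(3.0), kernel)  # length-3 input  -> 1 output step
wide = conv1d_valid(np.arange(24.0), kernel)  # length-24 input -> 22 output steps
print(short.shape, wide.shape)  # (1,) (22,)
```

The same kernel works for both lengths, which is why the untouched model accepts the wide window without complaint.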
When you save and load the model after training, it seems that the shape (None, 3, 19) is made fixed, just as it would be if you had set the input shape yourself with e.g. an Input layer.
The only (important) difference between the loaded and the original model is the .input_spec attribute:
conv_model.input_spec         # None
loaded_conv_model.input_spec  # [InputSpec(shape=(None, 3, 19), ndim=3)]
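The shape check that raises the ValueError follows a simple rule: a None in the spec matches any size, while a concrete number must match exactly. A small stand-in for that logic (not Keras' actual implementation, just the matching rule):

```python
def shape_matches(spec, shape):
    """True if `shape` satisfies `spec`; None in the spec matches any size."""
    return len(spec) == len(shape) and all(
        s is None or s == d for s, d in zip(spec, shape))

print(shape_matches((None, 3, 19), (32, 3, 19)))     # True:  conv_window batch
print(shape_matches((None, 3, 19), (32, 24, 19)))    # False: wide_window rejected
print(shape_matches((None, None, 19), (32, 24, 19))) # True:  flexible time axis
```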
If you set the attribute back to None (loaded_conv_model.input_spec = None), the model works with flexible input again, but this seems a bit hacky. If you know that you'll work with a flexible time axis (not all sequences have the same length), you can set it directly in the model:
conv_model = tf.keras.Sequential([
    tf.keras.layers.Input((None, 19)),  # the batch dimension is omitted here
    ...  # rest of the model
])
Now the model has the input shape conv_model.input_shape == (None, None, 19) and is fine with differently sized batches and window lengths.
Answered By - mhenning