Issue
Background: My data has shape (batch_size, data_length), and its dimensions seem to be incompatible with the operations inside MultiHeadAttention, especially softmax. Someone kindly suggested that I add a size-1 "ghost" dimension as the last dimension.
Error message I got:
(32, 512, 1)
Epoch 1/200
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-7-870abeaa4b93> in <cell line: 281>()
279 model.compile(loss="mean_squared_error", optimizer="rmsprop", metrics=["accuracy"])
280
--> 281 history = model.fit(dataset, epochs=200, validation_data=val_dataset)
1 frames
/usr/local/lib/python3.10/dist-packages/keras/engine/training.py in tf__train_function(iterator)
13 try:
14 do_return = True
---> 15 retval_ = ag__.converted_call(ag__.ld(step_function), (ag__.ld(self), ag__.ld(iterator)), None, fscope)
16 except:
17 do_return = False
ValueError: in user code:
File "/usr/local/lib/python3.10/dist-packages/keras/engine/training.py", line 1284, in train_function *
return step_function(self, iterator)
File "/usr/local/lib/python3.10/dist-packages/keras/engine/training.py", line 1268, in step_function **
outputs = model.distribute_strategy.run(run_step, args=(data,))
File "/usr/local/lib/python3.10/dist-packages/keras/engine/training.py", line 1249, in run_step **
outputs = model.train_step(data)
File "/usr/local/lib/python3.10/dist-packages/keras/engine/training.py", line 1050, in train_step
y_pred = self(x, training=True)
File "/usr/local/lib/python3.10/dist-packages/keras/utils/traceback_utils.py", line 70, in error_handler
raise e.with_traceback(filtered_tb) from None
ValueError: Exception encountered when calling layer 'query' (type EinsumDense).
Dimensions must be equal, but are 1 and 512 for '{{node model/multi_head_attention/query/einsum/Einsum}} = Einsum[N=2, T=DT_HALF, equation="abc,cde->abde"](model/Cast, model/multi_head_attention/query/einsum/Einsum/Cast)' with input shapes: [32,512,1], [512,2,512].
Call arguments received by layer 'query' (type EinsumDense):
• inputs=tf.Tensor(shape=(32, 512, 1), dtype=float16)
Solution
Your error message mentions an einsum "equation": "abc,cde->abde" (NumPy has a very clear explanation of einsum if you aren't familiar with it). This tells you that this specific error (though possibly not your original error, I can't tell) occurs because this step expects the last axis of the first array to match the first axis of the second. It also tells you it expects 3 axes in both arrays.
Your second array, (512, 2, 512), obviously already has 3 axes. You said you added a unit axis to your other array. This is reasonable, though it has to be put in the right place: as the second axis instead of the third, i.e. x_train = np.expand_dims(x_train, 1). This will make the dimensions match.
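Concretely, here is the effect of the two axis placements on a (batch_size, data_length) array (a sketch assuming batch_size=32 and data_length=512, as in your traceback):

```python
import numpy as np

x_train = np.ones((32, 512))  # (batch_size, data_length)

# Unit axis last: (32, 512, 1) -- the last axis (1) cannot
# contract against the weight's first axis (512), hence the error.
wrong = np.expand_dims(x_train, -1)
print(wrong.shape)  # (32, 512, 1)

# Unit axis second: (32, 1, 512) -- the last axis (512) now
# matches the weight's first axis, so the einsum succeeds.
right = np.expand_dims(x_train, 1)
print(right.shape)  # (32, 1, 512)
```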
PS: You say you are giving a "Brief reproducible code". It may be brief, but it's not reproducible: you've left out the imports, you haven't provided dataset or val_dataset, you get an IndexError at padding_mask_init[rows, 1:, 1] = 1, and you have a syntax error on the line shape = (batch_size, embed_dim, ghost_dim=1): this isn't valid Python. Next time, in order to ensure you are actually attaching a reproducible example, create a new file (example.py), paste your code into the file, and try to run it with python example.py. Until this produces the error you're wanting help with, it's not a reproducible example.
Answered By - David