Issue
I have a generator yielding data and labels (yield data, labels), where the data is a numpy.ndarray with a variable number of rows and 500 columns of dtype=float32, and the labels are integers of type numpy.int64.
I'm trying to pass this data to tf.data.Dataset.from_generator to create a TensorFlow dataset.
The docs say that from_generator takes an output_signature parameter, but I'm having trouble understanding how to build it.
How can I make the output_signature for the generator I described?
Thank you!
Edit:
I used tf.type_spec_from_value to get this:
dataset = tf.data.Dataset.from_generator(
    datagen_row,
    output_signature=(
        tf.TensorSpec(shape=(None, 512), dtype=tf.float32, name=None),
        tf.TensorSpec(shape=(), dtype=tf.int64, name=None)
    )
)
But is it correct to use None for the first dimension of the data when the number of rows varies?
Solution
If your datagen_row() function yields (input_data, label), where input_data has 500 columns and label is a single integer, then your output_signature should be:
output_signature=(
    tf.TensorSpec(shape=(None, 500), dtype=tf.float32, name=None),
    tf.TensorSpec(shape=(), dtype=tf.int64, name=None))
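where the first TensorSpec describes the data and the second one describes the label. Using None for the first dimension is correct when the number of rows varies: None marks a dimension whose size is not known in advance. It would be helpful if you posted the function and perhaps some example data or data shapes here; otherwise it is hard to help further.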
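As a rough check of the signature above, here is a minimal sketch. The datagen_row body is a hypothetical stand-in, since the original generator was not posted; it just yields random float32 arrays with 500 columns and a varying number of rows, plus a numpy.int64 label.

```python
import numpy as np
import tensorflow as tf


def datagen_row():
    # Hypothetical stand-in for the asker's generator (not the original code):
    # yields a (rows, 500) float32 array and a scalar int64 label per step.
    rng = np.random.default_rng(0)
    for n_rows in (3, 7, 5):
        data = rng.random((n_rows, 500), dtype=np.float32)
        label = np.int64(n_rows % 2)
        yield data, label


dataset = tf.data.Dataset.from_generator(
    datagen_row,
    output_signature=(
        # None lets the number of rows differ from element to element.
        tf.TensorSpec(shape=(None, 500), dtype=tf.float32),
        # shape=() describes a scalar label.
        tf.TensorSpec(shape=(), dtype=tf.int64),
    ),
)

# Iterate once and record the concrete shape of each yielded data tensor.
shapes = [tuple(data.shape) for data, label in dataset]
print(dataset.element_spec)
print(shapes)
```

If the generator ever yields an array whose column count or dtype does not match the signature, iterating the dataset should raise an error, so a quick loop like this is a cheap way to validate the spec.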
Answered By - Finn Meyer