Issue
Let's say I am training an autoencoder, so I need to define both the input dataset and the target output, and I need a dataset that's just images (no labels).
I've tried using flow_from_directory(), but it assigns a class to the dataset, and when passed into training, that collides with the target data and produces an error.
So I guess what I need is to convert my local images into a dataset with a structure like tensorflow_datasets.mnist.
Folder structure:
/data
    /low
        -0.png
        -1.png
        -...
    /high
        -0.png
        -1.png
        -...
What I've tried:
low_generator = keras.preprocessing.image.ImageDataGenerator(
    rescale=1/255.0,
    validation_split=0.2
)

# when the path points directly at the image folder - no images are found
# when the path points at the parent folder (specifying which subfolder to use) - it assigns labels too
train_low_iterator = low_generator.flow_from_directory(
    # 'path to parent directory'
    'path to directory',
    target_size=(480, 270),
    batch_size=10,
    class_mode='input',
    subset='training',
    # add this when the path points to the parent directory
    # classes=['low']
)
validation_low_iterator = low_generator.flow_from_directory(
    'same as above',
    target_size=(480, 270),
    batch_size=10,
    class_mode='input',
    subset='validation',
    # same as above
    classes=['low']
)
# analogous to the above, for the /high folder
high_generator
train_high_iterator
validation_high_iterator
class_mode=None
The source code says that if None is used as class_mode, the generator won't yield labels (source). But none of these examples worked (same issue as before: either nothing is found, or labels are yielded again):
iterator = generator.flow_from_directory(
    'parent_path',
    class_mode=None,
    classes=['something']
)

iterator = generator.flow_from_directory(
    'parent_path',
    classes=['something']
)

iterator = generator.flow_from_directory(
    'direct_path',
    class_mode=None
)

iterator = generator.flow_from_directory(
    'direct_path'
)
I've also tried image_dataset_from_directory():
train_low_dataset = keras.utils.image_dataset_from_directory(
    'path/low',
    labels=None,
    label_mode=None,
    color_mode='rgb',
    batch_size=32,
    image_size=(480, 270),
    shuffle=False,
    validation_split=0.2,
    subset='training'
)
This is able to load all the data and return a dataset, but it throws an error when training starts:
ValueError: 'y' argument is not supported when using python generator as input.
I'm not able to resolve this right now, since I need to use both input and target data, for both training and validation.
Training
model.fit(
    train_low_iterator, train_high_iterator,
    epochs=15,
    batch_size=8,
    shuffle=True,
    validation_data=(validation_low_iterator, validation_high_iterator)
)
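One possible way around the 'y' argument limitation is to zip the low and high datasets so each element already comes as an (input, target) pair. This is only a sketch: it assumes train_high_dataset and the two validation datasets are built the same way as train_low_dataset above, with shuffle=False and the same batch_size so the images stay aligned.

import tensorflow as tf

# zip the low/high datasets so each batch is an (input, target) pair
train_pairs = tf.data.Dataset.zip((train_low_dataset, train_high_dataset))
val_pairs = tf.data.Dataset.zip((validation_low_dataset, validation_high_dataset))

# fit() then takes a single dataset, so no separate y argument is needed
model.fit(train_pairs, epochs=15, validation_data=val_pairs)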
Solution
I tried to create a custom iterator function (a fancy for loop with a yield at the end), but without success; I'll retry in the future and will update this answer if I get it working properly.
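For reference, the shape of generator I had in mind is roughly the following. This is an untested sketch; paired_batches is just an illustrative name, and both iterators would need class_mode=None, shuffling turned off and the same batch size so the low/high pairs stay aligned.

# untested sketch: each step yields one (input_batch, target_batch) pair
def paired_batches(low_iter, high_iter):
    while True:
        yield next(low_iter), next(high_iter)

# fit() would then consume it as a single generator, e.g.
# model.fit(paired_batches(train_low_iterator, train_high_iterator),
#           steps_per_epoch=len(train_low_iterator), epochs=15)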
However, what did work was loading the images straight into numpy arrays (a generator-style comprehension wrapped in np.array).
Folder structure as before:
/data
    /low
        -0.png
        -1.png
        -...
    /high
        -0.png
        -1.png
        -...
Creating the arrays
import os
import cv2
import numpy as np

# read every image, scale pixel values to [0, 1]; sort by file name so /low and /high stay paired
low = np.array([cv2.imread(f.path) / 255 for f in sorted(os.scandir("/data/low"), key=lambda e: e.name)])
high = np.array([cv2.imread(f.path) / 255 for f in sorted(os.scandir("/data/high"), key=lambda e: e.name)])
# first 205 images for training, the rest for validation
train_low = low[:205]
validate_low = low[205:]
train_high = high[:205]
validate_high = high[205:]
Training
model.fit(x=train_low, y=train_high,
epochs=10,
batch_size=1,
shuffle=True,
validation_data=(validate_low, validate_high),
)
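Everything is loaded into memory at once here, which is fine for a couple hundred images. A related variation (just a sketch, not something I needed) is to wrap the same arrays in a tf.data pipeline, which makes the shuffling and batching explicit:

import tensorflow as tf

# wrap the numpy arrays in a tf.data pipeline for shuffling and batching
train_ds = tf.data.Dataset.from_tensor_slices((train_low, train_high)).shuffle(256).batch(8)
val_ds = tf.data.Dataset.from_tensor_slices((validate_low, validate_high)).batch(8)

model.fit(train_ds, epochs=10, validation_data=val_ds)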
Answered By - Mahrkeenerh