Issue
I have a classic dataset of images and labels.
Here is a simplified version of the __getitem__ function:
def __getitem__(self, index):
    img_path, label = df.iloc[index].values
    img = Image.open(img_path).convert("RGB")
    y = torch.tensor(label)
    return (img, y)
I have:
dataset = ClassDataset()
train_set, validation_set = random_split(dataset)
train_loader = DataLoader(dataset=train_set, batch_size=32)
The size of one batch from the train loader would be [32, 3, 256, 256], with 32 being the batch size, 3 the number of channels, and 256 the width and height of my images.
I want to change the shape of one batch so that it is sequential: [8, 4, 3, 256, 256], with 8 being the batch size and 4 the length of one sequence.
I know this could easily be done with Tensor.view() or torch.reshape(), since my data are already in the right order (they can be grouped directly into sequences).
But I want to know where the most sensible place to make this change is: in the dataset class, in the data loader, or in the training loop.
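For reference, the reshape described above can be sketched as follows. This is only an illustration with a random stand-in tensor; the names batch, seq_len and sequences are hypothetical, and it assumes the 32 images in the batch really are in consecutive sequence order.

```python
import torch

# Stand-in for one batch from the loader: (batch, channels, height, width).
batch = torch.randn(32, 3, 256, 256)
seq_len = 4  # assumed sequence length, as in the question

# Group every `seq_len` consecutive images into one sequence.
# -1 lets PyTorch infer the new batch size (32 / 4 = 8).
sequences = batch.reshape(-1, seq_len, *batch.shape[1:])

print(sequences.shape)  # torch.Size([8, 4, 3, 256, 256])
```

Images 4 to 7 of the original batch end up as the second sequence, so the grouping is only meaningful if the batch order has not been shuffled.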
I already tried returning sequences from __getitem__:
(img_path, coords) = df.iloc[4*(index-1):4*index].values
(assuming that sequence length is 4), but it didn't work.
Solution
It is more appropriate to do this kind of processing in the dataset layer. Indeed, what you are implementing there is "given a dataset index, return the corresponding input and its label". In your case the input is a sequence, so it makes sense for your __getitem__ to return a sequence of images.
The data loader will then automatically collate the data so that you get (batch_size, seq_len, channel, height, width) for your input, and (batch_size, seq_len) for your labels (or (batch_size,) if there is meant to be a single label per sequence).
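A minimal sketch of that idea, under some assumptions: the class name SequenceDataset and its arguments are hypothetical, and in-memory tensors stand in for the images that the real dataset would load from disk. Sequence index i maps to samples i*seq_len up to (i+1)*seq_len. Note that the slice in the question, 4*(index-1):4*index, is empty for index 0, and unpacking several rows into two names also fails, which is likely why that attempt did not work.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class SequenceDataset(Dataset):
    """Hypothetical dataset that returns one sequence of images per index."""

    def __init__(self, frames, labels, seq_len=4):
        # frames: (N, C, H, W) tensor, a stand-in for images read from disk.
        # labels: (N,) tensor with one label per image.
        self.frames = frames
        self.labels = labels
        self.seq_len = seq_len

    def __len__(self):
        # Number of complete sequences; leftover images are dropped.
        return len(self.frames) // self.seq_len

    def __getitem__(self, index):
        start = index * self.seq_len
        stop = start + self.seq_len
        # Shapes: (seq_len, C, H, W) and (seq_len,)
        return self.frames[start:stop], self.labels[start:stop]


# Small synthetic data: 64 images of size 3x16x16, labelled 0..63.
frames = torch.zeros(64, 3, 16, 16)
labels = torch.arange(64)

dataset = SequenceDataset(frames, labels, seq_len=4)
loader = DataLoader(dataset, batch_size=8)

x, y = next(iter(loader))
print(x.shape)  # torch.Size([8, 4, 3, 16, 16])
print(y.shape)  # torch.Size([8, 4])
```

Because random_split permutes indices, applying it to a sequence-level dataset like this keeps each sequence intact, whereas splitting the per-image dataset first and reshaping afterwards would group unrelated images together.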
Answered By - Ivan