Issue
I was following the DCGAN tutorial (https://pytorch.org/tutorials/beginner/dcgan_faces_tutorial.html) and I ran into a problem with the format of the output they get from their dataloader. In short, whenever they need data from the dataloader, they call it like this:
real_batch = next(iter(dataloader))
real_batch[0].to(device)[:64]
or
for i, data in enumerate(dataloader, 0):
    real_cpu = data[0].to(device)
The data[0] they access has the full batch size (set to 128); in the first example they only need 64 samples, so they slice with [:64] to cut them out.
The problem is that my dataloader doesn't behave this way: it yields a whole batch directly from the next call or in the enumerate loop, and calling data[0] on my dataloader returns just one sample, not the entire batch as in the example. I found this extremely weird, because just by removing the [0] from each data access I can make my code run without any errors, but I'm afraid I'm missing some important step of shaping the data, which could cause errors later.
This is how their dataloader is set up:
dataset = dset.ImageFolder(root=dataroot,
                           transform=transforms.Compose([
                               transforms.Resize(image_size),
                               transforms.CenterCrop(image_size),
                               transforms.ToTensor(),
                               transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
                           ]))
dataloader = torch.utils.data.DataLoader(dataset, batch_size=batch_size,
                                         shuffle=True, num_workers=workers)
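For reference, a quick check of what their loader yields per batch (a sketch assuming dataroot points at a valid image folder and the tutorial's batch_size=128 and image_size=64):

# ImageFolder's __getitem__ returns an (image, class_index) pair, so each
# batch comes out as a two-element sequence of stacked tensors.
images, labels = next(iter(dataloader))
print(images.shape)  # torch.Size([128, 3, 64, 64])
print(labels.shape)  # torch.Size([128])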
My dataloader setup is a bit trickier. I have a simple custom dataset whose data is a list of one-channel PIL images (I build this list myself with an augmentation function applied to images stored on disk, and I'm not sure this is a good way to store such data...), and more or less the same dataloader parameters.
class MyDataset(torch.utils.data.Dataset):
    def __init__(self, dataset, image_size):
        super(MyDataset, self).__init__()
        # dataset is a [list] of PIL images with 1 channel
        self.dataset = dataset
        self.image_size = image_size
        self.transform = transforms.Compose([
            transforms.Resize(self.image_size),
            transforms.ToTensor(),
            transforms.Normalize((0.5,), (0.5,)),  # single-element tuples for 1-channel images
        ])

    def __getitem__(self, idx):
        x = self.dataset[idx]
        return self.transform(x)

    def __len__(self):
        return len(self.dataset)
and the dataloader itself:
train_set = MyDataset(data, image_size=image_size)
data_loader = DataLoader(train_set, batch_size=batch_size, shuffle=True)
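A quick check of this loader (a sketch, using the data, batch_size, and image_size defined above) shows what I mean:

batch = next(iter(data_loader))
print(type(batch))  # <class 'torch.Tensor'> -- the images themselves, not an (images, labels) pair
print(batch.shape)  # torch.Size([batch_size, 1, image_size, image_size])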
Solution
The difference is that typically the implemented datasets return both the image AND the corresponding label, i.e. the implementation of the __getitem__ method looks something like:

def __getitem__(self, idx):
    return self.image[idx], self.target[idx]
The dataloader then returns a tuple, data = (images, targets), both with the same batch size. They access the images by taking data[0].
In your case, your __getitem__ returns only one output, so the dataloader collates it into simply data = images.
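A standalone toy example (hypothetical dataset classes, not from the tutorial) makes the default collation rule concrete:

import torch
from torch.utils.data import DataLoader, Dataset

class WithLabel(Dataset):
    # __getitem__ returns a (sample, label) pair, like ImageFolder does
    def __getitem__(self, idx):
        return torch.zeros(1, 4, 4), 0
    def __len__(self):
        return 8

class SampleOnly(Dataset):
    # __getitem__ returns just the sample, like MyDataset above
    def __getitem__(self, idx):
        return torch.zeros(1, 4, 4)
    def __len__(self):
        return 8

data = next(iter(DataLoader(WithLabel(), batch_size=4)))
print(data[0].shape)  # torch.Size([4, 1, 4, 4]) -- data is (images, labels)

data = next(iter(DataLoader(SampleOnly(), batch_size=4)))
print(data.shape)     # torch.Size([4, 1, 4, 4]) -- data is the images tensor itself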
So removing [0], as you tried, is actually the correct thing to do! For compatibility with other already-implemented datasets, I sometimes return a dummy label together with the sample in __getitem__, e.g. return self.transform(x), 0 (you can try it -- then calling data[0] will work).
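Concretely, the dummy-label variant of the method from the question would be:

def __getitem__(self, idx):
    x = self.dataset[idx]
    # dummy label keeps batches in the (images, labels) format other datasets use
    return self.transform(x), 0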
Answered By - burzan