Issue
I need to get the tensor sizes right, because I get the following error in the line loss = criterion(out, target):

Expected input batch_size (4200) to match target batch_size (64).

How can I solve this?

My output tensor has size ([4200, 2]) and my target tensor ([64, 2]). The use case is image classification with two classes. My batch size is 64 and the images are 180 x 115 px in grayscale. Please don't be confused: there are some 'break' statements to test the code in this early stage of development. I load four batches, so 256 images.
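For reference, the mismatch can be reproduced in isolation (the shapes are taken from my error message; I use plain index targets here just to show the check that fails):

import torch
import torch.nn.functional as F

out = torch.randn(4200, 2)                  # my model's output shape
target = torch.zeros(64, dtype=torch.long)  # one label per image in the batch
loss = F.nll_loss(out, target)              # ValueError: Expected input batch_size (4200) to match target batch_size (64).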
I load my images with this method:
import random
import torch
from PIL import Image
from torchvision import transforms

def dataPrep(list_of_data, data_path, category, quantity):
    global train_data
    target_list = []
    train_data_list = []
    transform = transforms.Compose([
        transforms.ToTensor(),
    ])
    len_data = len(train_data)
    print('Len_data: ', len_data)
    for item in list_of_data:
        f = random.choice(list_of_data)
        list_of_data.remove(f)
        print(data_path + f)
        try:
            img = Image.open(data_path + f)
        except:
            continue
        img_crop = img.crop((310, 60, 425, 240))  # 115 x 180 px region of interest
        img_tensor = transform(img_crop)
        print(img_tensor.size())
        train_data_list.append(img_tensor)
        isPseudo = 0
        isTrue = 1
        if category == True:
            target = [isPseudo, isTrue]
        else:
            isPseudo = 1
            isTrue = 0
            target = [isPseudo, isTrue]
        target_list.append(target)
        if len(train_data_list) >= 64:
            # a full batch of 64: stack it and start a new one
            train_data.append((torch.stack(train_data_list), target_list))
            train_data_list = []
            target_list = []
            if (len_data * 64 + quantity) <= len(train_data) * 64:
                break
    print(len(train_data) * 64)
    return list_of_data
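For context, a hypothetical pair of calls (the directory names here are made up for illustration) that produces the four batches of 64 mentioned above might look like:

import os

train_data = []  # global list of (image_batch, target_list) tuples filled by dataPrep

true_files = dataPrep(os.listdir('./data/true/'), './data/true/', True, 128)
pseudo_files = dataPrep(os.listdir('./data/pseudo/'), './data/pseudo/', False, 128)
# afterwards len(train_data) == 4, i.e. four batches of 64 = 256 images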
After loading the images, I create my model and optimizer:

import net
import torch.optim as optim

model = net.Netz()
optimizer = optim.SGD(model.parameters(), lr=0.1, momentum=0.8)
My class 'Netz' looks like this:
import torch.nn as nn
import torch.nn.functional as F

class Netz(nn.Module):
    def __init__(self):
        super(Netz, self).__init__()
        self.conv1 = nn.Conv2d(1, 10, kernel_size=5)
        self.conv2 = nn.Conv2d(10, 20, kernel_size=5)
        self.conv_dropout = nn.Dropout2d()
        self.fc1 = nn.Linear(320, 60)
        self.fc2 = nn.Linear(60, 2)

    def forward(self, x):
        x = self.conv1(x)
        x = F.max_pool2d(x, 2)
        x = F.relu(x)
        x = self.conv2(x)
        x = self.conv_dropout(x)
        x = F.max_pool2d(x, 2)
        x = F.relu(x)
        x = x.view(-1, 320)
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        return F.log_softmax(x, -1)
Finally, I train my CNN:
import torch
import torch.nn.functional as F
from torch.autograd import Variable

def trainM(epoch):
    model.train()
    batch_id = 0
    for batch_id, (data, target) in enumerate(net.train_data):
        #data = data.cuda()
        #target = target.cuda()
        target = torch.Tensor(target[64 * batch_id:64 * (batch_id + 1)])  # shape ([64, 2])
        data = Variable(data)
        target = Variable(target)
        optimizer.zero_grad()
        out = model(data)
        criterion = F.nll_loss
        print('Size of out:', out.size())
        print('Size of target:', target.size())
        loss = criterion(out, target)
        loss.backward()
        optimizer.step()
        print('Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}'.format(
            epoch, batch_id * len(data), len(net.train_data.dataset),
            100 * batch_id / len(net.train_data), loss.item()))
        batch_id += 1
        break

for item in range(0, 10):
    trainM(item)
    break
Solution
The main problem is in Netz, at x = x.view(-1, 320). You have a batch size of 64, 20 channels, and 42 x 25 height and width; reshaping that with (-1, 320) gives you 4200 by 320. I can suggest 3 possible options to preserve the batch size:

1. (What is generally done) Pad the input to become square and update the convolutional part so that its output before the FC layers has a high number of channels and a small height and width, for instance x.shape = (batch_size, 128, 2, 2). Then make fc1 = nn.Linear(512, 60) and, before applying it, do x = x.reshape(x.shape[0], -1). (Here, before applying fc1, you may do a 1x1 convolution.)

2. Make the number of channels at the end of the convolutions 1, i.e. get something like x.shape = (batch_size, 1, 42, 25), then adapt fc1 accordingly.

3. Do x = x.reshape(*x.shape[:2], -1), in other words preserve both the channel and the batch dimension, and add another FC layer fc_e = nn.Linear(20, 1) to compress your channels.
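To see where the 4200 comes from, here is a minimal runnable shape trace (the 180 x 115 input size is taken from the question):

import torch
import torch.nn as nn
import torch.nn.functional as F

x = torch.randn(64, 1, 180, 115)                      # one batch of grayscale crops
x = F.relu(F.max_pool2d(nn.Conv2d(1, 10, 5)(x), 2))   # -> (64, 10, 88, 55)
x = F.relu(F.max_pool2d(nn.Conv2d(10, 20, 5)(x), 2))  # -> (64, 20, 42, 25)
print(x.shape)                # torch.Size([64, 20, 42, 25])
print(x.view(-1, 320).shape)  # torch.Size([4200, 320]) -- the batch dimension is gone

Option 3 applied to your class looks like this: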
import torch.nn as nn
import torch.nn.functional as F

class Netz(nn.Module):
    def __init__(self):
        super(Netz, self).__init__()
        self.conv1 = nn.Conv2d(1, 10, kernel_size=5)
        self.conv2 = nn.Conv2d(10, 20, kernel_size=5)
        self.conv_dropout = nn.Dropout2d()
        self.fc1 = nn.Linear(1050, 60)  # 1050 = 42 * 25 spatial positions per channel
        self.fc2 = nn.Linear(60, 2)
        self.fce = nn.Linear(20, 1)     # compresses the 20 channels down to 1

    def forward(self, x):
        x = self.conv1(x)
        x = F.max_pool2d(x, 2)
        x = F.relu(x)
        x = self.conv2(x)
        x = self.conv_dropout(x)
        x = F.max_pool2d(x, 2)
        x = F.relu(x)
        x = x.reshape(x.shape[0], x.shape[1], -1)     # (64, 20, 1050): batch and channels preserved
        x = F.relu(self.fc1(x))                       # (64, 20, 60)
        x = self.fc2(x)                               # (64, 20, 2)
        x = self.fce(x.permute(0, 2, 1)).squeeze(-1)  # (64, 2, 20) -> (64, 2, 1) -> (64, 2)
        return F.log_softmax(x, -1)
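A quick sanity check with a dummy batch (shapes taken from the question) confirms that the batch dimension now survives:

import torch

model = Netz()
dummy = torch.randn(64, 1, 180, 115)  # 64 grayscale crops of 180 x 115 px
print(model(dummy).shape)             # torch.Size([64, 2]) -- matches the target batch size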
Keep in mind that there is a trade-off between the amount of information you want represented (should be high) and the number of inputs to the Linear layer (should not be that high). In the end it is a matter of how you choose to tackle the issue. The third option is the closest to your current code, but I recommend working out a model that follows the first approach.
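For what it's worth, here is a minimal sketch of that first approach; the class name NetzSquare and the layer sizes are illustrative assumptions, not part of the original answer. It pads the 115 px width up to 180 so the input is square, then halves the spatial size until the feature map is (batch_size, 128, 2, 2):

import torch
import torch.nn as nn
import torch.nn.functional as F

class NetzSquare(nn.Module):
    """Illustrative sketch of option 1: square input, many channels, tiny spatial map."""
    def __init__(self):
        super(NetzSquare, self).__init__()
        channels = [1, 16, 32, 64, 128, 128, 128]
        self.convs = nn.ModuleList(
            nn.Conv2d(c_in, c_out, kernel_size=3, padding=1)
            for c_in, c_out in zip(channels, channels[1:])
        )
        self.fc1 = nn.Linear(128 * 2 * 2, 60)  # 512 inputs, as suggested above
        self.fc2 = nn.Linear(60, 2)

    def forward(self, x):
        x = F.pad(x, (32, 33, 0, 0))  # pad width 115 -> 180: the input becomes square
        for conv in self.convs:
            x = F.relu(F.max_pool2d(conv(x), 2))  # halve H and W at each block
        x = x.reshape(x.shape[0], -1)             # (batch_size, 512): batch size preserved
        x = F.relu(self.fc1(x))
        return F.log_softmax(self.fc2(x), -1)

# e.g. NetzSquare()(torch.randn(64, 1, 180, 115)).shape -> torch.Size([64, 2])

The six conv + pool blocks take the spatial size from 180 down to 2 (180 -> 90 -> 45 -> 22 -> 11 -> 5 -> 2), so the flattened feature vector has 128 * 2 * 2 = 512 entries regardless of the input batch size.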
Answered By - D. ACAR