Sunday, October 31, 2021

[FIXED] Custom dataset os.path.join() returns TypeError


Issue

I am writing a custom dataset, but it raises a TypeError when I join the root directory path with the image name taken from the CSV file via pandas iloc:

 img_path = os.path.join(self.root_dir, self.annotations.iloc[index,0])

error: TypeError: join() argument must be str or bytes, not 'int64'

I have tried converting the value from annotations.iloc to a string, but I still get the same error.

CSV file with filenames and labels:

Custom dataset class:

class patientdataset(Dataset):

    def __init__(self, csv_file, root_dir, transform=None):
        self.annotations = pd.read_csv(csv_file)
        self.root_dir = root_dir
        self.transform = transform

    def __len__(self):
        return len(self.annotations)

    def __getitem__(self, index):
        img_path = os.path.join(self.root_dir, self.annotations.iloc[index, 0])
        image = np.array(np.load(img_path))
        y_label = torch.tensor(self.annotations.iloc[index, 1]).long()

        if self.transform:
            imagearrays = self.transform(image)
            image = imagearrays[None, :, :, :]
            imaget = np.transpose(image, (0, 2, 1, 3))
            image = imaget

        return (image, y_label)

Solution

Judging by your dataset (the attached CSV file), pd.read_csv(csv_file) produces a DataFrame with three columns: the first holds the index, the second the file name, and the third the label. The line img_path = os.path.join(self.root_dir, self.annotations.iloc[index,0]) fails because iloc[index, 0] selects the first column, so it returns the index value (an int64) rather than a file name. os.path.join expects its arguments to be strings, which is why you get the TypeError.
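The behavior described above can be reproduced in isolation. The following is a minimal sketch, assuming the CSV was saved with its DataFrame index as an unnamed first column; the file names, labels, and the "data" directory are hypothetical stand-ins, since the original CSV is not shown.

```python
import io
import os

import pandas as pd

# Hypothetical CSV contents (assumption): the first, unnamed column is a
# saved DataFrame index, followed by a file name and a label.
csv_text = ",filename,label\n0,patient_000.npy,1\n1,patient_001.npy,0\n"

annotations = pd.read_csv(io.StringIO(csv_text))

# The saved index is read back as an ordinary column.
print(annotations.columns.tolist())

# Position 0 is therefore the index value, a NumPy integer rather than a str.
first_cell = annotations.iloc[0, 0]
print(type(first_cell))

error_message = None
try:
    os.path.join("data", first_cell)
except TypeError as exc:
    error_message = str(exc)
    print(error_message)

# Position 1 holds the file name, so this join succeeds.
img_path = os.path.join("data", annotations.iloc[0, 1])
print(img_path)
```

Printing annotations.columns is a quick way to check which position actually holds the file name before indexing with iloc.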

Based on your CSV file, use column 1 for the file name and column 2 for the label:

import os

import numpy as np
import pandas as pd
import torch
from torch.utils.data import Dataset


class patientdataset(Dataset):

    def __init__(self, csv_file, root_dir, transform=None):
        # The CSV has three columns: saved index, file name, label.
        self.annotations = pd.read_csv(csv_file)
        self.root_dir = root_dir
        self.transform = transform

    def __len__(self):
        return len(self.annotations)

    def __getitem__(self, index):
        # Column 1 (2nd column) holds the file name.
        img_path = os.path.join(self.root_dir, self.annotations.iloc[index, 1])
        image = np.array(np.load(img_path))
        # Column 2 (3rd column) holds the label.
        y_label = torch.tensor(self.annotations.iloc[index, 2]).long()

        if self.transform:
            imagearrays = self.transform(image)
            image = imagearrays[None, :, :, :]  # add a leading axis
            image = np.transpose(image, (0, 2, 1, 3))  # swap axes 1 and 2

        return (image, y_label)
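A possible alternative, offered only as a sketch and not part of the answer above: if the first CSV column really is a saved index, passing index_col=0 to pd.read_csv makes pandas treat it as the index, so the original positions 0 (file name) and 1 (label) would work unchanged. The CSV contents and the column names "filename" and "label" below are hypothetical.

```python
import io

import pandas as pd

# Hypothetical CSV contents (assumption): unnamed index column first.
csv_text = ",filename,label\n0,patient_000.npy,1\n1,patient_001.npy,0\n"

# index_col=0 tells pandas to use the first column as the row index
# instead of keeping it as a data column.
annotations = pd.read_csv(io.StringIO(csv_text), index_col=0)

# Only the two data columns remain, so position 0 is now the file name
# and position 1 is the label.
filename = annotations.iloc[0, 0]
label = annotations.iloc[0, 1]
print(filename, label)

# Selecting by column name avoids depending on positions at all
# (the names used here are assumptions about the CSV header).
filename_by_name = annotations["filename"].iloc[0]
```

Selecting by name is less fragile than selecting by position if the CSV layout may change, at the cost of depending on the header text.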

Answered By - trsvchn

This answer was collected from Stack Overflow, tested by PythonFixing community admins, and is licensed under CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.
