Issue
I am working on an AI-related problem where I need to track several human body parts in videos. I create a DataLoader with my images, and I apply several transforms when calling my Dataset class.
Here is a code sample:
import torch
from torch.utils.data import DataLoader
from torchvision import transforms

transform = transforms.Compose(
    [
        transforms.Resize(img_size),
        transforms.ToTensor(),
        transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
    ]
)
dataset = NamedClassDataset(annotation_folder_path=path, transform=transform, img_size=img_size, normalized=normalize)
# Seeded generator so the split can be reproduced later for visualization
train_set, validation_set = torch.utils.data.random_split(dataset, get_train_test_size(dataset, train_percent), generator=torch.Generator().manual_seed(seed))
train_loader = DataLoader(dataset=train_set, shuffle=shuffle, batch_size=batch_size, num_workers=num_workers, pin_memory=pin_memory)
validation_loader = DataLoader(dataset=validation_set, shuffle=shuffle, batch_size=batch_size, num_workers=num_workers, pin_memory=pin_memory)
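get_train_test_size is a helper from my own codebase; for readers, a minimal sketch of what it is assumed to return (a pair of lengths for random_split; the _print_size flag appears again later in this post):

def get_train_test_size(dataset, train_percent, _print_size=True):
    # Assumed behavior: split len(dataset) into [train_len, val_len]
    train_len = int(len(dataset) * train_percent)
    val_len = len(dataset) - train_len
    if _print_size:
        print(f"Train size: {train_len}, validation size: {val_len}")
    return [train_len, val_len]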
The problem is: after running my model, I display the images with the predicted points in order to check their quality. But since the images are resized and normalized, I cannot recover their original resolution and colors. I would like to display the points on the original images instead of the transformed ones, and I want to know the usual way to do this.
I have already thought of two solutions, each with its own disadvantage:
- Reverting the transformations, but this is impossible once Resize has been applied since we lose information (see the sketch after this list).
- Returning an index as a third value from the __getitem__ method of the NamedClassDataset (along with the image and labels). But PyTorch methods expect only two outputs from __getitem__, namely (image, associated labels).
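To illustrate the first point: undoing Normalize is easy, but undoing Resize only yields an interpolated approximation of the original pixels. A minimal sketch of the reversible part, assuming the (0.5, 0.5, 0.5) mean/std used above:

import torch

def denormalize(img_t, mean=(0.5, 0.5, 0.5), std=(0.5, 0.5, 0.5)):
    # Inverts transforms.Normalize: x = x_norm * std + mean, per channel
    mean = torch.tensor(mean).view(-1, 1, 1)
    std = torch.tensor(std).view(-1, 1, 1)
    return img_t * std + mean

Resizing back up with interpolation is still possible, but it can only approximate the detail that the downscale threw away.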
EDIT: Here is the __getitem__ of my NamedClassDataset class:
def __getitem__(self, index):
    (img_path, coords) = self.annotations.iloc[index].values
    img = Image.open(img_path).convert("RGB")
    w, h = img.size
    # Normalize by img size
    if self.img_size is not None:
        if self.normalized:
            coords = coords / (w, h)  # Normalized
        else:
            n_h, n_w = self.img_size
            coords = coords / (w, h) * (n_w, n_h)  # Not normalized
    y_coords = torch.flatten(torch.tensor(coords)).float()  # Flatten outputs and convert from double to float32
    if self.transform is not None:
        img = self.transform(img)
    return (img, y_coords)
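Since the targets are normalized by the original (w, h), predicted points can in principle be mapped back onto the original image without reverting any pixel transform; a sketch, assuming normalized=True and the flat (x1, y1, x2, y2, ...) layout produced above:

def preds_to_original_space(preds, orig_w, orig_h):
    # preds: flat tensor (x1, y1, x2, y2, ...) in [0, 1] normalized coordinates
    pts = preds.detach().view(-1, 2).clone()
    pts[:, 0] *= orig_w  # x back to original pixels
    pts[:, 1] *= orig_h  # y back to original pixels
    return pts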
Solution
I managed to do the trick by declaring a second dataset holding the untransformed images, and splitting it with the same seeded generator so that its subsets line up with the training ones.
# Create the same dataset with untransformed images for visualization purposes
org_dataset = NamedClassDataset(annotation_folder_path="./12_labels/extracted_swimmers", transform=None, img_size=None, normalized=False)
viz_train_set, viz_validation_set = random_split(org_dataset, get_train_test_size(org_dataset, train_percent, _print_size=False), generator=torch.Generator().manual_seed(seed))
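Both random_split calls must receive the same seeded generator (which is why it also appears in the first split above): random_split draws a random permutation of indices, so identical seeds guarantee that viz_train_set and viz_validation_set contain the same samples, in the same order, as train_set and validation_set.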
And here is what I do in the __getitem__ when transform=None:
if self.transform is not None:
    tr_img = self.transform(org_img)
    return (tr_img, y_coords)
return (org_img, y_coords)
I then have access to the original images by passing the viz sets as parameters. Note that these are Datasets, not DataLoaders, so you need to take your batch size into account to match the predictions, e.g.:
plot_predictions(viz_set[0+i*batch_size][0], preds[0])
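For completeness, a sketch of the loop that indexing is meant for (plot_predictions and model are from my own code; the loader must be created with shuffle=False, otherwise batch order will not match the dataset order):

with torch.no_grad():
    for i, (imgs, targets) in enumerate(validation_loader):
        preds = model(imgs)
        for j in range(imgs.size(0)):
            # viz_validation_set shares the seeded split with validation_set,
            # so sample j of batch i sits at dataset index j + i * batch_size
            org_img, _ = viz_validation_set[j + i * batch_size]
            plot_predictions(org_img, preds[j])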
I am leaving the question open since I strongly believe that a more efficient answer can be provided.
Answered By - Mrofsnart