Issue
Hello everyone.
I have an object classification task and a dataset containing a large number of videos. In every video, only some frames are labeled (about 160 thousand labeled frames in total), and a frame may carry multiple labels since it can contain multiple objects.
I have some confusion about building the dataset. My idea is to first convert the videos to frames, then store only the labeled frames in TFRecord or HDF5 format, and finally write every frame's path into CSV files (training and validation) to use for my task.
My questions are: 1. Is this efficient enough (TFRecord or HDF5)? Should I preprocess every frame, e.g. compress it, to save storage space before creating the TFRecord or HDF5 files? 2. Is there a way to handle the video dataset directly in TensorFlow or PyTorch?
I want to find an efficient and conventional way to handle video datasets. Really looking forward to every answer.
Solution
I am no TensorFlow guy, so my answer won't cover that, sorry.
Video formats generally achieve compression by exploiting temporal correlations in the data, at the cost of longer random-access times. That makes sense because video frames are usually accessed sequentially, but if your access pattern is entirely random I suggest converting to HDF5. Otherwise, if you access sub-sequences of the video, it may make sense to stay with video formats.
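As a rough sketch of that conversion (assuming h5py and imageio are installed; the video path and labeled frame indices below are hypothetical placeholders), one might do something like:

import imageio
import h5py

# Hypothetical inputs: a single video and the indices of its labeled frames.
video_path = 'video.mp4'
labeled_indices = [0, 12, 37]

reader = imageio.get_reader(video_path, 'ffmpeg')

with h5py.File('frames.h5', 'w') as f:
    for ix in labeled_indices:
        # each frame is a numpy ndarray in [h, w, channel] format
        frame = reader.get_data(ix)
        # gzip compression trades some read speed for storage space
        f.create_dataset(str(ix), data=frame, compression='gzip')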
PyTorch does not have any "blessed" approaches to video AFAIK, but I use imageio to read videos and seek particular frames. A short wrapper makes it follow the PyTorch Dataset API. The code is rather simple but has a caveat, which is necessary to allow using it with a multiprocessing DataLoader.
import imageio, torch

class VideoDataset:
    def __init__(self, path):
        self.path = path
        # explained in __getitem__
        self._reader = None
        reader = imageio.get_reader(self.path, 'ffmpeg')
        self._length = reader.get_length()

    def __getitem__(self, ix):
        # Below is a workaround to allow using `VideoDataset` with
        # `torch.utils.data.DataLoader` in multiprocessing mode.
        # `DataLoader` sends copies of the `VideoDataset` object across
        # processes, which sometimes leads to bugs, as `imageio.Reader`
        # does not support being serialized. Since our `__init__` set
        # `self._reader` to None, it is safe to serialize a
        # freshly-initialized `VideoDataset` and then, thanks to the if
        # below, `self._reader` gets initialized independently in each
        # worker process.
        if self._reader is None:
            self._reader = imageio.get_reader(self.path, 'ffmpeg')

        # this is a numpy ndarray in [h, w, channel] format
        frame = self._reader.get_data(ix)

        # convert to the PyTorch standard layout [channel, h, w]
        return torch.from_numpy(frame.transpose(2, 0, 1))

    def __len__(self):
        return self._length
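As a usage sketch (the file path, batch size, and worker count below are arbitrary placeholders), the wrapper can be passed to torch.utils.data.DataLoader like any map-style dataset:

from torch.utils.data import DataLoader

# Hypothetical video path; anything readable by ffmpeg should work.
dataset = VideoDataset('video.mp4')

# num_workers > 0 exercises the multiprocessing workaround above: each
# worker process lazily opens its own imageio reader in __getitem__.
loader = DataLoader(dataset, batch_size=4, num_workers=2)

for batch in loader:
    # batch has shape [batch_size, channel, h, w]
    print(batch.shape)
    break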
This code can be adapted to support multiple video files as well as to output the labels as you would like to have them.
Answered By - Jatentaki