Issue
Program to convert a video file into a NumPy array and vice versa. I searched many search engines but was unable to find an answer.
Solution
There are multiple libraries people use for this (e.g. PyAV, decord, opencv); I personally use Python OpenCV for this a lot (mostly with PyTorch, but it's a similar principle), so I'll speak from my experience there. You can use cv2.VideoCapture to load a video file into a numpy array; in theory, you can also use cv2.VideoWriter to write it back, but in practice, I've had a hard time getting that to work in my own projects.
Video to Numpy Array
tl;dr: Create a cv2.VideoCapture wrapper; iteratively load images (i.e. frames) from the video.
import cv2
import numpy as np

path = "/path/to/my/video/file.mp4"
cap = cv2.VideoCapture(path)

frames = []
ret = True
while ret:
    ret, img = cap.read()  # read one frame from the 'capture' object; img is (H, W, C)
    if ret:
        frames.append(img)
cap.release()

video = np.stack(frames, axis=0)  # dimensions (T, H, W, C)
Do note that the images will be returned in BGR channel order rather than the more common RGB; if you need to convert a frame to the RGB colorspace, img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB) will be sufficient.
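If you'd rather convert the whole stacked array at once, a plain NumPy channel flip achieves the same BGR -> RGB swap without a per-frame cv2 call (a minimal sketch; the zero-filled dummy array here just stands in for a loaded video):

```python
import numpy as np

# dummy "video": 2 frames of 4x4 pixels, 3 channels in BGR order
video_bgr = np.zeros((2, 4, 4, 3), dtype=np.uint8)
video_bgr[..., 0] = 255  # max out the blue channel (index 0 in BGR)

# reverse the channel axis: BGR -> RGB for every frame at once
# (equivalent to applying cv2.COLOR_BGR2RGB frame by frame)
video_rgb = video_bgr[..., ::-1]
```

Note that the slice returns a view; call .copy() if a downstream library needs a contiguous array.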
Numpy Array to Video
In theory, the examples I've seen for using cv2.VideoWriter go something like
import cv2
import numpy as np

# let `video` be an array with dimensionality (T, H, W, C), dtype uint8, BGR channel order
num_frames, height, width, _ = video.shape
filename = "/path/where/video/will/be/saved.mp4"
codec_id = "mp4v"  # four-character code identifying a video codec
fourcc = cv2.VideoWriter_fourcc(*codec_id)
out = cv2.VideoWriter(filename, fourcc, 20, (width, height))  # 20 is the frame rate (fps)
for frame in video:  # each frame is (H, W, C)
    out.write(frame)
out.release()
You can also save the frames as temporary images (there exist many np.ndarray -> image pipelines; I personally use Pillow), then use ffmpeg (a command-line utility) to encode the frames into a video file. This takes up significantly more disk space, though; I use this method when I also need to inspect the individual frames of my video array, but that's a different conversation.
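The frames-to-images half of that approach can be sketched as follows (assumes Pillow is installed; the random array and the frame_%05d.png naming scheme are just illustrative choices):

```python
import os
import tempfile

import numpy as np
from PIL import Image

# dummy RGB video array (T, H, W, C); stands in for a real one
video = np.random.randint(0, 256, size=(5, 32, 32, 3), dtype=np.uint8)

out_dir = tempfile.mkdtemp()
for i, frame in enumerate(video):
    # zero-padded names keep the frames in order for ffmpeg's pattern matching
    Image.fromarray(frame).save(os.path.join(out_dir, f"frame_{i:05d}.png"))

# then, from the command line (assumes ffmpeg is installed), something like:
#   ffmpeg -framerate 20 -i frame_%05d.png -c:v libx264 -pix_fmt yuv420p out.mp4
```

Remember that Pillow expects RGB channel order, so frames grabbed via cv2.VideoCapture should be converted from BGR first.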
And on another note -- you may want to change the codec_id variable depending on how you want to encode the video (if this means nothing to you, don't worry -- it probably won't matter for your application); this is simply a four-byte code used to identify the video codec used to generate the video (see this page; availability may vary by platform). H.264 is the most common codec in use today AFAIK, given by the code "H264" or "X264", but I've had trouble getting it to work with OpenCV (more details here); the array -> images -> video file approach, however, works seamlessly with ffmpeg from the command line.
Answered By - chang_trenton