Support Compressed Audio/Video Containers
Currently a video must be represented as a sequence of frames on disk. This has the advantage that any frame can be accessed at the speed of a disk-seek operation (very fast on SSDs and RAM-disks). The disadvantage is that each image has to be compressed individually (i.e. information shared between frames is not de-duplicated). A compressed video avoids that redundancy, but frames have to be reconstructed at read time. Depending on the underlying hardware, that usually means slower reads, although in some cases there may be speedups when sequential frames are needed.
The task is to extend the kwcoco API so that audio-video streams can be represented by standard mp4 (x265 / AAC) videos. These are modern (2024) and well supported codecs:
- x265 is an open source encoder for the HEVC / H.265 (High Efficiency Video Coding) video compression format.
- AAC (Advanced Audio Coding) is a well supported audio codec. The mp4 file is the container that holds the video and audio streams.
I have code where I've played with parsing information out of these files, so I have a few examples using ffmpeg and cv2 for video reading that can help implement imread in the context of a video.
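As a starting point for discussion, a frame-level imread could be sketched with OpenCV roughly as follows. The function name `imread_video_frame` is hypothetical (it is not an existing kwcoco function), and the sketch assumes `cv2` is installed with a backend that can decode the file. Seeking with `CAP_PROP_POS_FRAMES` is not guaranteed to be frame-accurate for every codec and backend, so that would need to be verified for x265 in mp4.

```python
def imread_video_frame(video_fpath, frame_index):
    """
    Read a single frame from a video file.

    Hypothetical helper for illustration; not part of the kwcoco API.
    Returns an HxWx3 uint8 array in BGR channel order (the OpenCV default).
    """
    if frame_index < 0:
        raise ValueError(f'frame_index must be non-negative, got {frame_index}')
    import cv2  # imported lazily so OpenCV stays an optional dependency
    cap = cv2.VideoCapture(str(video_fpath))
    try:
        if not cap.isOpened():
            raise IOError(f'Unable to open video: {video_fpath}')
        # Ask the backend to seek; accuracy depends on the codec and backend.
        cap.set(cv2.CAP_PROP_POS_FRAMES, frame_index)
        ok, frame = cap.read()
        if not ok:
            raise IndexError(f'Unable to read frame {frame_index}')
        return frame
    finally:
        # Always release the decoder, even if reading failed.
        cap.release()
```

Opening and seeking on every call is the simple random-access path. It is likely too slow when many consecutive frames are needed, which is where a sequential reader would help.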
Things that need to happen:
- Add delayed-image support for reading individual frames from a video file. (It would also be nice to have an optimized path for sequential access).
- Implement a database table structure for Videos.
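To make the first item more concrete, here is a minimal sketch of a delayed frame that defers decoding until it is finalized, backed by a reader that skips the seek when frames are requested in order. All names here (`SequentialFrameReader`, `DelayedVideoFrame`, `Cv2Decoder`) are hypothetical and are not existing kwcoco or delayed_image classes. The decoder is passed in as a factory so the caching logic can be exercised without a real video. The cv2 adapter assumes OpenCV is installed.

```python
class SequentialFrameReader:
    """
    Wraps a decoder and only seeks when the requested frame is not the
    one the decoder would produce next.
    """
    def __init__(self, open_decoder):
        # open_decoder: callable returning an object with seek(i) and read()
        self._open_decoder = open_decoder
        self._decoder = None
        self._next_index = None
        self.num_seeks = 0  # exposed so the sequential path can be measured

    def read(self, frame_index):
        if self._decoder is None:
            self._decoder = self._open_decoder()
        if self._next_index != frame_index:
            # Random access: pay for an explicit seek.
            self._decoder.seek(frame_index)
            self.num_seeks += 1
        frame = self._decoder.read()
        self._next_index = frame_index + 1
        return frame


class DelayedVideoFrame:
    """Records which frame is wanted; nothing is decoded until finalize()."""
    def __init__(self, reader, frame_index):
        self.reader = reader
        self.frame_index = frame_index

    def finalize(self):
        return self.reader.read(self.frame_index)


class Cv2Decoder:
    """Adapter exposing seek/read on top of cv2.VideoCapture (assumes OpenCV)."""
    def __init__(self, video_fpath):
        import cv2  # imported lazily so OpenCV stays an optional dependency
        self._cv2 = cv2
        self._cap = cv2.VideoCapture(str(video_fpath))

    def seek(self, frame_index):
        self._cap.set(self._cv2.CAP_PROP_POS_FRAMES, frame_index)

    def read(self):
        ok, frame = self._cap.read()
        if not ok:
            raise IndexError('Unable to decode the next frame')
        return frame
```

With a real file this might be used as `reader = SequentialFrameReader(lambda: Cv2Decoder('video.mp4'))` followed by `DelayedVideoFrame(reader, 10).finalize()`. Whether the reader should live on the dataset, on the video record, or in a cache keyed by file path is an open design question.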
I'd love to hear others' thoughts here. I would like to have a discussion in the comments on how to build a concrete implementation plan.