|
25 | 25 | # 1. **Batch APIs** - Decode multiple frames at once |
26 | 26 | # 2. **Approximate Mode & Keyframe Mappings** - Trade accuracy for speed |
27 | 27 | # 3. **Multi-threading** - Parallelize decoding across videos or chunks |
28 | | -# 4. **CUDA Acceleration (BETA)** - Use GPU decoding for supported formats |
| 28 | +# 4. **CUDA Acceleration** - Use GPU decoding for supported formats |
29 | 29 | # |
30 | 30 | # We'll explore each technique and when to use it. |
31 | 31 |
|
32 | 32 | # %% |
33 | 33 | # 1. Use Batch APIs When Possible |
34 | 34 | # -------------------------------- |
35 | 35 | # |
36 | | -# If you need to decode multiple frames at once, it is faster when using the batch methods. TorchCodec's batch APIs reduce overhead and can leverage |
37 | | -# internal optimizations. |
| 36 | +# If you need to decode multiple frames at once, the batch methods are faster than calling single-frame decoding methods multiple times. |
| 37 | +# For example, :meth:`~torchcodec.decoders.VideoDecoder.get_frames_at` is faster than calling :meth:`~torchcodec.decoders.VideoDecoder.get_frame_at` multiple times. |
| 38 | +# TorchCodec's batch APIs reduce overhead and can leverage internal optimizations. |
38 | 39 | # |
39 | 40 | # **Key Methods:** |
40 | 41 | # |
|
59 | 60 | # 2. Approximate Mode & Keyframe Mappings |
60 | 61 | # ---------------------------------------- |
61 | 62 | # |
62 | | -# By default, TorchCodec uses ``seek_mode="exact"``, which performs a scan when |
| 63 | +# By default, TorchCodec uses ``seek_mode="exact"``, which performs a :term:`scan` when |
63 | 64 | # the decoder is created to build an accurate internal index of frames. This |
64 | 65 | # ensures frame-accurate seeking but takes longer for decoder initialization, |
65 | 66 | # especially on long videos. |
|
68 | 69 | # **Approximate Mode** |
69 | 70 | # ~~~~~~~~~~~~~~~~~~~~ |
70 | 71 | # |
71 | | -# Setting ``seek_mode="approximate"`` skips the initial scan and relies on the |
| 72 | +# Setting ``seek_mode="approximate"`` skips the initial :term:`scan` and relies on the |
72 | 73 | # video file's metadata headers. This dramatically speeds up |
73 | 74 | # :class:`~torchcodec.decoders.VideoDecoder` creation, particularly for long |
74 | 75 | # videos, but may result in slightly less accurate seeking in some cases. |
|
77 | 78 | # **Which mode should you use:** |
78 | 79 | # |
79 | 80 | # - If you care about exactness of frame seeking, use “exact”. |
80 | | -# - If you can sacrifice exactness of seeking for speed, which is usually the case when doing clip sampling, use “approximate”. |
81 | | -# - If your videos don’t have variable framerate and their metadata is correct, then “approximate” mode is a net win: it will be just as accurate as the “exact” mode while still being significantly faster. |
82 | | -# - If your size is small enough and we’re decoding a lot of frames, there’s a chance exact mode is actually faster. |
| 81 | +# - If the video is long and you're only decoding a small number of frames, approximate mode should be faster.
83 | 82 |
|
84 | 83 | # %% |
85 | 84 | # **Custom Frame Mappings** |
|
113 | 112 | # |
114 | 113 | # When decoding multiple videos or decoding a large number of frames from a single video, there are a few parallelization strategies to speed up the decoding process: |
115 | 114 | # |
116 | | -# - **FFmpeg-based parallelism** - Using FFmpeg's internal threading capabilities |
| 115 | +# - **FFmpeg-based parallelism** - Using FFmpeg's internal threading capabilities for intra-frame parallelism, where parallelization happens within individual frames rather than across frames |
117 | 116 | # - **Multiprocessing** - Distributing work across multiple processes |
118 | 117 | # - **Multithreading** - Using multiple threads within a single process |
| 118 | +# |
| 119 | +# Both multiprocessing and multithreading can be used to decode multiple videos in parallel, or to decode a single long video in parallel by splitting it into chunks. |
119 | 120 |
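The chunk-splitting strategy described in this hunk can be sketched as below. This is a minimal illustration, not code from the PR: `split_into_chunks`, `decode_chunks_in_threads`, and the `decode_chunk` callable are hypothetical names, and the sketch assumes the caller supplies a function that decodes one contiguous range of frame indices (for instance, by creating its own `VideoDecoder` inside the worker).

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(num_frames, num_chunks):
    """Split range(num_frames) into contiguous, near-equal index ranges.

    Hypothetical helper for illustration; not part of TorchCodec.
    """
    if num_chunks <= 0:
        raise ValueError("num_chunks must be positive")
    base, extra = divmod(num_frames, num_chunks)
    chunks = []
    start = 0
    for i in range(num_chunks):
        # The first `extra` chunks get one additional frame each.
        stop = start + base + (1 if i < extra else 0)
        if stop > start:  # skip empty chunks when num_frames < num_chunks
            chunks.append(range(start, stop))
        start = stop
    return chunks


def decode_chunks_in_threads(decode_chunk, num_frames, num_workers):
    """Run `decode_chunk` on each chunk in a thread pool, preserving chunk order.

    `decode_chunk` is a caller-supplied callable (an assumption of this sketch)
    that takes a range of frame indices and returns the decoded result.
    """
    chunks = split_into_chunks(num_frames, num_workers)
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        return list(pool.map(decode_chunk, chunks))


if __name__ == "__main__":
    # Stand-in "decoder" that just echoes the indices it was asked to decode.
    print(decode_chunks_in_threads(lambda chunk: list(chunk), 10, 3))
```

Swapping `ThreadPoolExecutor` for `ProcessPoolExecutor` would give the multiprocessing variant, provided `decode_chunk` is picklable (a module-level function rather than a lambda).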
|
120 | 121 | # %% |
121 | 122 | # .. note:: |
|
126 | 127 | # - :ref:`sphx_glr_generated_examples_decoding_parallel_decoding.py` |
127 | 128 |
|
128 | 129 | # %% |
129 | | -# 4. BETA: CUDA Acceleration |
130 | | -# --------------------------- |
| 130 | +# 4. CUDA Acceleration |
| 131 | +# -------------------- |
131 | 132 | # |
132 | 133 | # TorchCodec supports GPU-accelerated decoding using NVIDIA's hardware decoder |
133 | 134 | # (NVDEC) on supported hardware. This keeps decoded tensors in GPU memory, |
|
150 | 151 | # especially for high-resolution videos and when combined with GPU-based transforms. |
151 | 152 | # Actual speedup varies by hardware, resolution, and codec. |
152 | 153 |
|
| 154 | +# %% |
| 155 | +# **Recommended Usage for Beta Interface** |
| 156 | +# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| 157 | +# |
| 158 | +# .. code-block:: python |
| 159 | +# |
| 160 | +#     with set_cuda_backend("beta"):
| 161 | +#         decoder = VideoDecoder("file.mp4", device="cuda")
| 162 | +# |
| 163 | + |
153 | 164 | # %% |
154 | 165 | # .. note:: |
155 | 166 | # |
|