Commit a74f653

address feedback

Author: Molly Xu
1 parent 7ac0d2f commit a74f653

1 file changed

+22
-11
lines changed

examples/decoding/performance_tips.py

Lines changed: 22 additions & 11 deletions
@@ -25,16 +25,17 @@
 # 1. **Batch APIs** - Decode multiple frames at once
 # 2. **Approximate Mode & Keyframe Mappings** - Trade accuracy for speed
 # 3. **Multi-threading** - Parallelize decoding across videos or chunks
-# 4. **CUDA Acceleration (BETA)** - Use GPU decoding for supported formats
+# 4. **CUDA Acceleration** - Use GPU decoding for supported formats
 #
 # We'll explore each technique and when to use it.

 # %%
 # 1. Use Batch APIs When Possible
 # --------------------------------
 #
-# If you need to decode multiple frames at once, it is faster when using the batch methods. TorchCodec's batch APIs reduce overhead and can leverage
-# internal optimizations.
+# If you need to decode multiple frames at once, the batch methods are faster than calling single-frame decoding methods multiple times.
+# For example, :meth:`~torchcodec.decoders.VideoDecoder.get_frames_at` is faster than calling :meth:`~torchcodec.decoders.VideoDecoder.get_frame_at` multiple times.
+# TorchCodec's batch APIs reduce overhead and can leverage internal optimizations.
 #
 # **Key Methods:**
 #
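Reviewer sketch for the batch-API paragraph in this hunk: the two call patterns it contrasts might look like the following. The helper names `decode_one_by_one` and `decode_batched` are hypothetical; only `get_frame_at` and `get_frames_at` come from the diff. The helpers accept any decoder-like object, so the snippet runs without TorchCodec or a video file.

```python
def decode_one_by_one(decoder, indices):
    """Call the single-frame method once per index (one call per frame)."""
    return [decoder.get_frame_at(i) for i in indices]


def decode_batched(decoder, indices):
    """Request every index in a single batch call."""
    return decoder.get_frames_at(indices=list(indices))


# Assumed usage, with TorchCodec installed and a local file "video.mp4"
# (the file name is a placeholder):
#   from torchcodec.decoders import VideoDecoder
#   decoder = VideoDecoder("video.mp4")
#   batch = decode_batched(decoder, [0, 10, 20])      # one decoder call
#   frames = decode_one_by_one(decoder, [0, 10, 20])  # three decoder calls
```

The batched helper issues one decoder call regardless of how many indices are requested, which is where the reduced overhead described in the added lines would come from.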
@@ -59,7 +60,7 @@
 # 2. Approximate Mode & Keyframe Mappings
 # ----------------------------------------
 #
-# By default, TorchCodec uses ``seek_mode="exact"``, which performs a scan when
+# By default, TorchCodec uses ``seek_mode="exact"``, which performs a :term:`scan` when
 # the decoder is created to build an accurate internal index of frames. This
 # ensures frame-accurate seeking but takes longer for decoder initialization,
 # especially on long videos.
@@ -68,7 +69,7 @@
 # **Approximate Mode**
 # ~~~~~~~~~~~~~~~~~~~~
 #
-# Setting ``seek_mode="approximate"`` skips the initial scan and relies on the
+# Setting ``seek_mode="approximate"`` skips the initial :term:`scan` and relies on the
 # video file's metadata headers. This dramatically speeds up
 # :class:`~torchcodec.decoders.VideoDecoder` creation, particularly for long
 # videos, but may result in slightly less accurate seeking in some cases.
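Reviewer sketch for the seek-mode text in this hunk: one way to check the claimed creation-time difference on your own files is to time decoder construction under each mode. The helper `time_decoder_creation` and its `make_decoder` parameter are hypothetical names introduced here; the only TorchCodec detail assumed is that `VideoDecoder` accepts a `seek_mode` keyword, as the diff states.

```python
import time


def time_decoder_creation(make_decoder, seek_mode, repeats=3):
    """Return the best wall-clock time in seconds to build a decoder.

    make_decoder is any callable that accepts a seek_mode keyword and
    constructs a decoder; taking it as a parameter keeps this helper
    independent of any particular library.
    """
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        make_decoder(seek_mode=seek_mode)
        best = min(best, time.perf_counter() - start)
    return best


# Assumed usage, with TorchCodec installed and a local file "video.mp4"
# (the file name is a placeholder):
#   from torchcodec.decoders import VideoDecoder
#   make = lambda seek_mode: VideoDecoder("video.mp4", seek_mode=seek_mode)
#   exact_s = time_decoder_creation(make, "exact")
#   approx_s = time_decoder_creation(make, "approximate")
```

Taking the minimum over a few repeats reduces noise from file-system caching; the numbers only cover decoder creation, not the decoding that follows.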
@@ -77,9 +78,7 @@
 # **Which mode should you use:**
 #
 # - If you care about exactness of frame seeking, use “exact”.
-# - If you can sacrifice exactness of seeking for speed, which is usually the case when doing clip sampling, use “approximate”.
-# - If your videos don’t have variable framerate and their metadata is correct, then “approximate” mode is a net win: it will be just as accurate as the “exact” mode while still being significantly faster.
-# - If your size is small enough and we’re decoding a lot of frames, there’s a chance exact mode is actually faster.
+# - If the video is long and you're only decoding a small amount of frames, approximate mode should be faster.

 # %%
 # **Custom Frame Mappings**
@@ -113,9 +112,11 @@
 #
 # When decoding multiple videos or decoding a large number of frames from a single video, there are a few parallelization strategies to speed up the decoding process:
 #
-# - **FFmpeg-based parallelism** - Using FFmpeg's internal threading capabilities
+# - **FFmpeg-based parallelism** - Using FFmpeg's internal threading capabilities for intra-frame parallelism, where parallelization happens within individual frames rather than across frames
 # - **Multiprocessing** - Distributing work across multiple processes
 # - **Multithreading** - Using multiple threads within a single process
+#
+# Both multiprocessing and multithreading can be used to decode multiple videos in parallel, or to decode a single long video in parallel by splitting it into chunks.

 # %%
 # .. note::
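Reviewer sketch for the "splitting it into chunks" sentence added in this hunk: a minimal, library-agnostic version of the chunked multithreading idea. The helpers `split_into_chunks` and `decode_in_chunks` are hypothetical names, and the suggestion that each worker builds its own decoder is an assumption, not something the diff states.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(num_frames, num_chunks):
    """Split range(num_frames) into contiguous, near-equal index ranges."""
    base, extra = divmod(num_frames, num_chunks)
    chunks, start = [], 0
    for i in range(num_chunks):
        # The first `extra` chunks get one additional frame each.
        stop = start + base + (1 if i < extra else 0)
        if stop > start:  # skip empty chunks when num_chunks > num_frames
            chunks.append(range(start, stop))
        start = stop
    return chunks


def decode_in_chunks(decode_chunk, num_frames, num_workers=4):
    """Run decode_chunk(index_range) per chunk in a thread pool, in order."""
    chunks = split_into_chunks(num_frames, num_workers)
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        return list(pool.map(decode_chunk, chunks))


# Assumed usage, with TorchCodec installed and a local file "video.mp4"
# (placeholder name); each worker creates its own decoder here:
#   from torchcodec.decoders import VideoDecoder
#   def decode_chunk(index_range):
#       decoder = VideoDecoder("video.mp4")
#       return decoder.get_frames_at(indices=list(index_range))
#   batches = decode_in_chunks(decode_chunk, num_frames=300, num_workers=4)
```

`pool.map` returns results in chunk order, so the decoded batches can be concatenated directly. Swapping `ThreadPoolExecutor` for `ProcessPoolExecutor` would give the multiprocessing variant, provided `decode_chunk` is a picklable top-level function.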
@@ -126,8 +127,8 @@
 # - :ref:`sphx_glr_generated_examples_decoding_parallel_decoding.py`

 # %%
-# 4. BETA: CUDA Acceleration
-# ---------------------------
+# 4. CUDA Acceleration
+# --------------------
 #
 # TorchCodec supports GPU-accelerated decoding using NVIDIA's hardware decoder
 # (NVDEC) on supported hardware. This keeps decoded tensors in GPU memory,
@@ -150,6 +151,16 @@
 # especially for high-resolution videos and when combined with GPU-based transforms.
 # Actual speedup varies by hardware, resolution, and codec.

+# %%
+# **Recommended Usage for Beta Interface**
+# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+#
+# .. code-block:: python
+#
+#     with set_cuda_backend("beta"):
+#         decoder = VideoDecoder("file.mp4", device="cuda")
+#
+
 # %%
 # .. note::
 #
