2 changes: 1 addition & 1 deletion src/torchcodec/decoders/__init__.py
@@ -7,6 +7,6 @@
from .._core import AudioStreamMetadata, VideoStreamMetadata
from ._audio_decoder import AudioDecoder # noqa
from ._decoder_utils import set_cuda_backend # noqa
-from ._video_decoder import VideoDecoder # noqa
+from ._video_decoder import FallbackInfo, VideoDecoder # noqa

SimpleVideoDecoder = VideoDecoder
78 changes: 78 additions & 0 deletions src/torchcodec/decoders/_video_decoder.py
@@ -7,6 +7,7 @@
import io
import json
import numbers
from dataclasses import dataclass
from pathlib import Path
from typing import List, Literal, Optional, Sequence, Tuple, Union

@@ -22,6 +23,47 @@
from torchcodec.transforms import DecoderTransform, Resize


@dataclass
class FallbackInfo:
Contributor @NicolasHug, Dec 4, 2025

We could keep this class private for now, but I was thinking, maybe it could be public and named CpuFallbackStatus? I'm hoping that it might resolve both @scotts's preference and mine:

The CpuFallbackStatus class will be public and visible from the docs, and it will be clear from the docstring that the dec.cpu_fallback attribute is an instance of this class. The attribute name cpu_fallback is fairly generic and allows us to extend the functionality in the future, while the class name itself communicates that it's not just a simple bool or a simple string.

    """Information about decoder fallback status.

    This class tracks whether the decoder fell back to CPU decoding.

    Usage:
        - Use ``str(fallback_info)`` or ``print(fallback_info)`` to see the cpu fallback status
        - Use ``bool(fallback_info)`` to check if any fallback occurred

    Attributes:
        status_known (bool): Whether the fallback status has been determined.
Contributor

I think it's OK to expose publicly. Let's just document that:

  • for the Beta CUDA backend this is always known
  • for the ffmpeg one, it's known after decoding the first frame.

We can link to this concept of CUDA backend by linking to https://meta-pytorch.org/torchcodec/stable/generated/torchcodec.decoders.set_cuda_backend.html#torchcodec.decoders.set_cuda_backend

Contributor

Also, we'll probably want to document this class publicly in docs/source/api_ref_decoders.rst. We should indicate that users should never instantiate this class directly and should only access it via the VideoDecoder.cpu_fallback attribute.
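
A minimal sketch of how the class could look if the suggestions in this thread were applied, assuming the proposed `CpuFallbackStatus` name and single-underscore private fields. The field names and message strings are copied from the diff; this is an illustration, not the merged implementation.

```python
class CpuFallbackStatus:
    """Sketch only. Users would not instantiate this directly; in the PR the
    instance is reached through ``VideoDecoder.cpu_fallback``."""

    def __init__(self):
        self.status_known = False
        # Single leading underscore: private by convention, no name mangling.
        self._nvcuvid_unavailable = False
        self._video_not_supported = False

    def __bool__(self):
        # True only once the status is known and a fallback reason is set.
        return self.status_known and (
            self._nvcuvid_unavailable or self._video_not_supported
        )

    def __str__(self):
        if not self.status_known:
            return "Fallback status: Unknown"
        reasons = []
        if self._nvcuvid_unavailable:
            reasons.append("NVcuvid unavailable")
        if self._video_not_supported:
            reasons.append("Video not supported")
        if reasons:
            return "Fallback status: Falling back due to: " + ", ".join(reasons)
        return "Fallback status: No fallback required"
```

With this shape, code that owns the instance can set `status._video_not_supported = True` directly instead of going through a mangled attribute name.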

    """

    def __init__(self):
        self.status_known = False
        self.__nvcuvid_unavailable = False
        self.__video_not_supported = False
Contributor

Using double underscores works; it's the more "hardcore" way to indicate that something is private (ref). We don't use double underscores in TC; we've mostly just used single underscores up to now. For consistency, I think it's best to stick to our current practice of using single underscores.

EDIT: see my other comment below, the name mangling is a bit surprising, so let's definitely use single underscores
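
For readers unfamiliar with the name mangling mentioned here, a small self-contained illustration follows. The class names `Mangled` and `Plain` are made up for the example and do not appear in the PR.

```python
class Mangled:
    def __init__(self):
        # Inside a class body, a double leading underscore is rewritten
        # by Python to _<ClassName>__<name>.
        self.__flag = False


class Plain:
    def __init__(self):
        # A single leading underscore is stored under exactly this name.
        self._flag = False


m = Mangled()
p = Plain()
mangled_names = sorted(vars(m))
plain_names = sorted(vars(p))
print(mangled_names)  # ['_Mangled__flag']
print(plain_names)    # ['_flag']
```

This is why code outside the class has to spell out the mangled name to reach a double-underscore attribute, whereas a single-underscore attribute keeps the name it was given.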


    def __bool__(self):
        """Returns True if fallback occurred."""
        return self.status_known and (
            self.__nvcuvid_unavailable or self.__video_not_supported
        )

    def __str__(self):
        """Returns a human-readable string representation of the cpu fallback status."""
        if not self.status_known:
            return "Fallback status: Unknown"

        reasons = []
        if self.__nvcuvid_unavailable:
            reasons.append("NVcuvid unavailable")
        if self.__video_not_supported:
            reasons.append("Video not supported")

        if reasons:
            return "Fallback status: Falling back due to: " + ", ".join(reasons)
        return "Fallback status: No fallback required"


class VideoDecoder:
"""A single-stream video decoder.

@@ -180,13 +222,42 @@ def __init__(
            custom_frame_mappings=custom_frame_mappings_data,
        )

        self._fallback_info = FallbackInfo()
        self._has_decoded_frame = False
Contributor Author

Instead of tracking whether a frame has been decoded, we could also just call _update_cpu_fallback inside every method that decodes a frame. The problem with that approach is that we would be returning non-tensors in the compiled methods. As such, we would have to add something like this inside it:

if torch.compiler.is_compiling():
    return

Contributor @NicolasHug, Dec 4, 2025

I think we should implement this _has_decoded_frame logic in C++ for 2 reasons:

  • it's simpler and can be localized to a single place (see suggested implementation below)
  • it's only relevant for the default interface, not for the beta interface.

So maybe all we need is to update the default interface's returned string, and let it indicate whether its status is known or unknown yet. Here:

std::string CudaDeviceInterface::getDetails() {
// Note: for this interface specifically the fallback is only known after a
// frame has been decoded, not before: that's when FFmpeg decides to fallback,
// so we can't know earlier.
return std::string("FFmpeg CUDA Device Interface. Using ") +
(usingCPUFallback_ ? "CPU fallback." : "NVDEC.");
}

To know whether a frame has been decoded yet, I think we can simply set a boolean field to true when

void CudaDeviceInterface::convertAVFrameToFrameOutput(

is called. This would be a new private boolean attribute on the CudaDeviceInterface class.
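
To make the proposed behavior concrete, here is a rough Python model of the C++ change described above. It is only a stand-in: the class `CudaInterfaceModel`, its method names, and the "Fallback status unknown." wording are assumptions for illustration, not code from TorchCodec.

```python
class CudaInterfaceModel:
    """Hypothetical Python stand-in for the C++ CudaDeviceInterface."""

    def __init__(self):
        self._using_cpu_fallback = False
        # The new private boolean suggested in the comment above.
        self._has_decoded_frame = False

    def convert_frame(self, fell_back: bool) -> None:
        # Models convertAVFrameToFrameOutput: the fallback decision only
        # becomes observable once a frame has gone through this step.
        self._using_cpu_fallback = fell_back
        self._has_decoded_frame = True

    def get_details(self) -> str:
        prefix = "FFmpeg CUDA Device Interface. "
        if not self._has_decoded_frame:
            # Assumed wording for the not-yet-known case.
            return prefix + "Fallback status unknown."
        backend = "CPU fallback." if self._using_cpu_fallback else "NVDEC."
        return prefix + "Using " + backend


iface = CudaInterfaceModel()
before = iface.get_details()
iface.convert_frame(fell_back=True)
after = iface.get_details()
```

Keeping the flag next to the details string means the Python side would not need to track decoding on every frame-returning method.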


    def __len__(self) -> int:
        return self._num_frames

    @property
    def cpu_fallback(self) -> FallbackInfo:
        # We can only determine whether fallback to CPU is happening when this
        # property is accessed and requires that at least one frame has been decoded.
Comment on lines +244 to +245
Contributor

I think this comment is slightly misleading because it's only really true for the ffmpeg interface. Can I suggest the following - and also please check me on my understanding here:

We only query the CPU fallback info if status is unknown. That happens either when:

  • this @property has never been called before
  • no frame has been decoded yet on the FFmpeg interface.

Note that for the beta interface, we're able to know the fallback status right when the VideoDecoder
is instantiated, but the status_known attribute is initialized to False.

        self._update_cpu_fallback()
        return self._fallback_info
Contributor

2 nits:

  • No need to define a separate _update_cpu_fallback() function, it can just be inlined here
  • The convention (I think it's a convention??) is to use the same name for the @property and for the underlying cached object. That is, I think self._fallback_info should just be self._cpu_fallback. It makes it more obvious that it relates to the @cpu_fallback property.
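
The two nits could be combined roughly as follows. This is a sketch under assumptions: `DecoderSketch`, `Status`, `decode`, and the injected `get_backend_details` callable are hypothetical stand-ins (the latter for `core._get_backend_details(self._decoder)`), and `num_queries` exists only to make the caching visible.

```python
from dataclasses import dataclass


@dataclass
class Status:
    status_known: bool = False
    fell_back: bool = False


class DecoderSketch:
    def __init__(self, get_backend_details):
        # Hypothetical zero-argument callable returning the backend string.
        self._get_backend_details = get_backend_details
        # Cached object named after the property it backs.
        self._cpu_fallback = Status()
        self._has_decoded_frame = False
        self.num_queries = 0

    def decode(self):
        # Stand-in for any frame-returning method.
        self._has_decoded_frame = True

    @property
    def cpu_fallback(self) -> Status:
        # Update logic inlined in the property; the backend is queried
        # at most once, after the first decoded frame.
        if not self._cpu_fallback.status_known and self._has_decoded_frame:
            self.num_queries += 1
            details = self._get_backend_details()
            self._cpu_fallback.status_known = True
            self._cpu_fallback.fell_back = "CPU fallback" in details
        return self._cpu_fallback


dec = DecoderSketch(lambda: "FFmpeg CUDA Device Interface. Using CPU fallback.")
known_before = dec.cpu_fallback.status_known
dec.decode()
known_after = dec.cpu_fallback.status_known
```

Repeated reads of `dec.cpu_fallback` after the first decoded frame return the same cached object without querying the backend again.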


    def _update_cpu_fallback(self):
        """Update the fallback status if it hasn't been determined yet.

        This method queries the C++ backend to determine if fallback to CPU
        decoding occurred. The query is only performed after at least one frame
        has been decoded.
        """
        if not self._fallback_info.status_known and self._has_decoded_frame:
            backend_details = core._get_backend_details(self._decoder)

            self._fallback_info.status_known = True

            if "CPU fallback" in backend_details:
                if "NVCUVID not available" in backend_details:
                    self._fallback_info._FallbackInfo__nvcuvid_unavailable = True
Contributor

Ah, so this _FallbackInfo__nvcuvid_unavailable is the name-mangling consequence of using double leading underscore. Let's definitely use single underscores :)

                else:
                    self._fallback_info._FallbackInfo__video_not_supported = True

    def _getitem_int(self, key: int) -> Tensor:
        assert isinstance(key, int)

        frame_data, *_ = core.get_frame_at_index(self._decoder, frame_index=key)
        self._has_decoded_frame = True
        return frame_data

    def _getitem_slice(self, key: slice) -> Tensor:
@@ -199,6 +270,7 @@ def _getitem_slice(self, key: slice) -> Tensor:
            stop=stop,
            step=step,
        )
        self._has_decoded_frame = True
        return frame_data

    def __getitem__(self, key: Union[numbers.Integral, slice]) -> Tensor:
@@ -252,6 +324,7 @@ def get_frame_at(self, index: int) -> Frame:
        data, pts_seconds, duration_seconds = core.get_frame_at_index(
            self._decoder, frame_index=index
        )
        self._has_decoded_frame = True
        return Frame(
            data=data,
            pts_seconds=pts_seconds.item(),
@@ -271,6 +344,7 @@ def get_frames_at(self, indices: Union[torch.Tensor, list[int]]) -> FrameBatch:
        data, pts_seconds, duration_seconds = core.get_frames_at_indices(
            self._decoder, frame_indices=indices
        )
        self._has_decoded_frame = True

        return FrameBatch(
            data=data,
@@ -300,6 +374,7 @@ def get_frames_in_range(self, start: int, stop: int, step: int = 1) -> FrameBatc
            stop=stop,
            step=step,
        )
        self._has_decoded_frame = True
        return FrameBatch(*frames)

    def get_frame_played_at(self, seconds: float) -> Frame:
@@ -329,6 +404,7 @@ def get_frame_played_at(self, seconds: float) -> Frame:
        data, pts_seconds, duration_seconds = core.get_frame_at_pts(
            self._decoder, seconds
        )
        self._has_decoded_frame = True
        return Frame(
            data=data,
            pts_seconds=pts_seconds.item(),
@@ -350,6 +426,7 @@ def get_frames_played_at(
        data, pts_seconds, duration_seconds = core.get_frames_by_pts(
            self._decoder, timestamps=seconds
        )
        self._has_decoded_frame = True
        return FrameBatch(
            data=data,
            pts_seconds=pts_seconds,
@@ -394,6 +471,7 @@ def get_frames_played_in_range(
            start_seconds=start_seconds,
            stop_seconds=stop_seconds,
        )
        self._has_decoded_frame = True
        return FrameBatch(*frames)


86 changes: 86 additions & 0 deletions test/test_decoders.py
@@ -1737,6 +1737,92 @@ def test_set_cuda_backend(self):
with set_cuda_backend(backend):
VideoDecoder(H265_VIDEO.path, device=f"cuda:{bad_device_number}")

    def test_cpu_fallback_before_after_decoding(self):
        decoder = VideoDecoder(NASA_VIDEO.path)

        # Before accessing any frames, status should be unknown
        assert not decoder.cpu_fallback.status_known
        assert str(decoder.cpu_fallback) == "Fallback status: Unknown"
        assert not bool(decoder.cpu_fallback)

        # After accessing frames, status should be known
        _ = decoder[0]
        assert decoder.cpu_fallback.status_known
        assert str(decoder.cpu_fallback) != "Fallback status: Unknown"

    def test_cpu_fallback_no_fallback_on_cpu_device(self):
        """Test that CPU device doesn't trigger fallback (it's not a fallback scenario)."""
        decoder = VideoDecoder(NASA_VIDEO.path, device="cpu")

        _ = decoder[0]

        assert decoder.cpu_fallback.status_known
Contributor

In this scenario, we should be able to assert that the status is known before we decode any frame?

        assert not bool(decoder.cpu_fallback)
        assert "No fallback required" in str(decoder.cpu_fallback)

    @needs_cuda
    def test_cpu_fallback_h265_video_ffmpeg_cuda(self):
        """Test that H265 video triggers CPU fallback on FFmpeg CUDA interface."""
        # H265_VIDEO is known to trigger CPU fallback on FFmpeg CUDA
        # because its dimensions are too small
        decoder = VideoDecoder(H265_VIDEO.path, device="cuda")

        _ = decoder.get_frame_at(0)

        assert decoder.cpu_fallback.status_known
        assert bool(decoder.cpu_fallback)
        assert "Fallback status: Falling back due to:" in str(decoder.cpu_fallback)

    @needs_cuda
    def test_cpu_fallback_h265_video_beta_cuda(self):
        """Test that H265 video triggers CPU fallback on Beta CUDA interface."""
        with set_cuda_backend("beta"):
            decoder = VideoDecoder(H265_VIDEO.path, device="cuda")

        _ = decoder.get_frame_at(0)

        assert decoder.cpu_fallback.status_known
Contributor

For the beta interface, we should be able to assert that status_known is true before we even decode any frame. I think you might need some slight modification to the implementation above in order to achieve that.

Contributor

ping on this, we seem to not have tests for the beta interface now?

        assert bool(decoder.cpu_fallback)
        assert "Fallback status: Falling back due to:" in str(decoder.cpu_fallback)
Contributor

Let's assert that it's due to the video being not supported


    @needs_cuda
    def test_cpu_fallback_no_fallback_on_supported_video(self):
Contributor

Let's parametrize this over both interfaces: the beta and FFmpeg ones.

        """Test that supported videos don't trigger fallback on CUDA."""
        decoder = VideoDecoder(NASA_VIDEO.path, device="cuda")

        _ = decoder[0]

        assert not bool(decoder.cpu_fallback)
        assert "No fallback required" in str(decoder.cpu_fallback)

    def test_cpu_fallback_status_cached(self):
Contributor

IIUC this test mostly tests that the output value doesn't change, not that it's cached. I think testing the cache behavior is potentially really difficult, and perhaps not needed after all. I'd suggest to remove it?

        """Test that cpu_fallback status is determined once and then cached."""
        decoder = VideoDecoder(NASA_VIDEO.path)

        _ = decoder[0]
        first_status = str(decoder.cpu_fallback)
        assert decoder.cpu_fallback.status_known

        _ = decoder[1]
        second_status = str(decoder.cpu_fallback)
        assert decoder.cpu_fallback.status_known

        assert first_status == second_status

    def test_cpu_fallback_multiple_access_methods(self):
Contributor

I think this test is technically a subset of the previous one: if we 'cache' the cpu_fallback result, then a consequence is that calling different methods isn't going to change it.

Maybe what you wanted to check is that the status becomes "known" and that it works with multiple decoding methods? If that's the case, I think we'd need to re-create the VideoDecoder object in between method calls. But TBH, I'm not sure it's a critical test to have, so I might suggest removing it too.

        """Test that cpu_fallback works with different frame access methods."""
        decoder = VideoDecoder(NASA_VIDEO.path)

        _ = decoder.get_frame_at(0)
        assert decoder.cpu_fallback.status_known
        status_after_get_frame = str(decoder.cpu_fallback)

        _ = decoder.get_frames_in_range(1, 3)
        assert str(decoder.cpu_fallback) == status_after_get_frame

        _ = decoder.get_frame_played_at(0.5)
        assert str(decoder.cpu_fallback) == status_after_get_frame


class TestAudioDecoder:
@pytest.mark.parametrize("asset", (NASA_AUDIO, NASA_AUDIO_MP3, SINE_MONO_S32))