-
Notifications
You must be signed in to change notification settings - Fork 74
Expose cpu_fallback in VideoDecoder API #1093
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
| ) | ||
|
|
||
| self._fallback_info = FallbackInfo() | ||
| self._has_decoded_frame = False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of tracking whether a frame has been decoded, we could also just call _update_cpu_fallback inside every method that would decode a frame. The problem with that approach would be that we would be returning non tensors in the compiled methods. As such, we would have to do something like inside
if torch.compiler.is_compiling():
return
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should implement this _has_decoded_frame logic in C++ for 2 reasons:
- it's simpler and can be localized to a single place (see suggested implementation below)
- it's only relevant for the default interface, not for the beta interface.
So maybe all we need is to update the default interface's returned string, and let it indicate whether its status is known or unknown yet. Here:
torchcodec/src/torchcodec/_core/CudaDeviceInterface.cpp
Lines 357 to 363 in 38fa96c
| std::string CudaDeviceInterface::getDetails() { | |
| // Note: for this interface specifically the fallback is only known after a | |
| // frame has been decoded, not before: that's when FFmpeg decides to fallback, | |
| // so we can't know earlier. | |
| return std::string("FFmpeg CUDA Device Interface. Using ") + | |
| (usingCPUFallback_ ? "CPU fallback." : "NVDEC."); | |
| } |
To know whether a frame has been decoded yet, I think we can simply set a boolean field to true when
| void CudaDeviceInterface::convertAVFrameToFrameOutput( |
is called. This would be a new private boolean attribute on the CudaDeviceInterface class.
NicolasHug
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for working on this @mollyxu ! Left a few comments and suggestions regarding the implementation, let me know if I can clarify anything.
|
|
||
|
|
||
| @dataclass | ||
| class FallbackInfo: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could keep this class private for now, but I was thinking, maybe it could be public and named CpuFallbackStatus? I'm hoping that it might resolve both @scotts preference and mine:
The CpuFallbackStatus class will be public and visible from the docs, and it will be clear from the docstring that the dec.cpu_fallback attribute is an instance of this class. The attribute name cpu_fallback is fairly generic and allows us to extend the functionality in the future, while the class name itself communicates that it it's not just a simple bool, or a simple string.
| self.__nvcuvid_unavailable = False | ||
| self.__video_not_supported = False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using double underscores works, it's the more "hardcore" version to indicate that something is private. (ref). We don't use double underscores in TC, we've mostly just used single underscore up to now. For consistency, I think it's best to stick to our current practice of using single undercores.
EDIT: see my other comment below, the name mangling is a bit surprising, so let's definitely use single underscores
| self._update_cpu_fallback() | ||
| return self._fallback_info |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2 nits:
- No need to define a separate
_update_cpu_fallback()function, it can just be inlined here - The convention (I think it's a convention??) is to use the same name for the
@propertyand for the underlying cached object. That is, I thinkself._fallback_infoshould just beself._cpu_fallback. It makes it more obvious that it relates to the@cpu_fallbackproperty.
|
|
||
| if "CPU fallback" in backend_details: | ||
| if "NVCUVID not available" in backend_details: | ||
| self._fallback_info._FallbackInfo__nvcuvid_unavailable = True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, so this _FallbackInfo__nvcuvid_unavailable is the name-mangling consequence of using double leading underscore. Let's definitely use single underscores :)
| ) | ||
|
|
||
| self._fallback_info = FallbackInfo() | ||
| self._has_decoded_frame = False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should implement this _has_decoded_frame logic in C++ for 2 reasons:
- it's simpler and can be localized to a single place (see suggested implementation below)
- it's only relevant for the default interface, not for the beta interface.
So maybe all we need is to update the default interface's returned string, and let it indicate whether its status is known or unknown yet. Here:
torchcodec/src/torchcodec/_core/CudaDeviceInterface.cpp
Lines 357 to 363 in 38fa96c
| std::string CudaDeviceInterface::getDetails() { | |
| // Note: for this interface specifically the fallback is only known after a | |
| // frame has been decoded, not before: that's when FFmpeg decides to fallback, | |
| // so we can't know earlier. | |
| return std::string("FFmpeg CUDA Device Interface. Using ") + | |
| (usingCPUFallback_ ? "CPU fallback." : "NVDEC."); | |
| } |
To know whether a frame has been decoded yet, I think we can simply set a boolean field to true when
| void CudaDeviceInterface::convertAVFrameToFrameOutput( |
is called. This would be a new private boolean attribute on the CudaDeviceInterface class.
|
|
||
| _ = decoder.get_frame_at(0) | ||
|
|
||
| assert decoder.cpu_fallback.status_known |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the beta interface, we should be able to assert that status_known is true before we even decode any frame. I think you might need some slight modification to the implementation above in order to achieve that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ping on this, we seem to not have tests for the beta interface now?
| - Use ``bool(cpu_fallback_status)`` to check if any fallback occurred | ||
| Attributes: | ||
| status_known (bool): Whether the fallback status has been determined. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's OK to expose publicly. Let's just document that:
- for the Beta CUDA backend this is always known
- for the ffmpeg one, it's known after decoding the first frame.
We can link to this concept of CUDA backend by linking to https://meta-pytorch.org/torchcodec/stable/generated/torchcodec.decoders.set_cuda_backend.html#torchcodec.decoders.set_cuda_backend
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, we'll probably want to document this class publicly in docs/source/api_ref_decoders.rst. We should indicate that users should never instantiate this class directly, and only accessed via the VideoDecoder.cpu_fallback attribute.
|
|
||
| assert "FFmpeg CUDA" in str(ref_dec.cpu_fallback) | ||
| assert ref_dec.cpu_fallback.status_known | ||
| assert bool(ref_dec.cpu_fallback) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here and everywhere else, we don't need to explicitly call bool(). The Pythonic way is to rely on the fact that bool() is essentially called automatically in contexts where it's needed (it's not necessarily how it's implemented in CPython but the idea is the same). This includes:
assert condif cond- etc.
For example, to check whether a list is empty, the Pythonic way is to just check if l: .... We don't do if bool(l):
| _ = dec.get_frame_at(0) | ||
| assert "FFmpeg CUDA" in str(dec.cpu_fallback) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2 comments:
- Syntax: we don't need to use
_ = .... Just call a rawdec.get_frame_at(0) - We shouldn't need to decode a frame here, because we should already be able to tell that it's the FFmpeg backend. We don't know whether there's a fallback, but that's not needed info here.
|
|
||
| _ = decoder[0] | ||
|
|
||
| assert decoder.cpu_fallback.status_known |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this scenario, we should be able to assert that the status is known before we decode any frame?
|
|
||
| _ = decoder.get_frame_at(0) | ||
|
|
||
| assert decoder.cpu_fallback.status_known |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ping on this, we seem to not have tests for the beta interface now?
| assert "Fallback status: Falling back due to:" in str(decoder.cpu_fallback) | ||
|
|
||
| @needs_cuda | ||
| def test_cpu_fallback_no_fallback_on_supported_video(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's parametrize this over both interfaces - the beta and ffpmeg ones
| assert not bool(decoder.cpu_fallback) | ||
| assert "No fallback required" in str(decoder.cpu_fallback) | ||
|
|
||
| def test_cpu_fallback_status_cached(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIUC this test mostly tests that the output value doesn't change, not that it's cached. I think testing the cache behavior is potentially really difficult, and perhaps not needed after all. I'd suggest to remove it?
|
|
||
| assert first_status == second_status | ||
|
|
||
| def test_cpu_fallback_multiple_access_methods(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this test is technically a subset of the previous one: if we 'cache' the cpu_fallback result, then a consequence is that calling different methods isn't going to change it.
Maybe want you wanted to check is that the status becomes "known" and that it works with multiple decoding method? If that's the case I think we'd need to re-create the VideoDecoder object in-between method calls. But TBH, I'm not sure it's a critical test to have, so I might suggest to remove it too.
| # We can only determine whether fallback to CPU is happening when this | ||
| # property is accessed and requires that at least one frame has been decoded. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this comment is slightly misleading because it's only really true for the ffmpeg interface. Can I suggest the following - and also please check me on my understanding here:
We only query the CPU fallback info if status is unknown. That happens either when:
- this @Property has never been called before
- no frame has been decoded yet on the FFmpeg interface.
Note that for the beta interface, we're able to know the fallback status right when the VideoDecoder
is instantiated, but thestatus_knownattribute is initialized to False.
Expose cpu_fallback in VideoDecoder API
Expose the
cpu_fallbackproperty in the public API to allow users to check whether the video decoder fell back to CPU decoding.The
FallbackInfoclass provides a clean interface for users to check decoder fallback status:bool(fallback_info)returns True if any fallback occurredstr(fallback_info)provides a human-readable explanation of why fallback happenedThe
cpu_fallbackproperty onVideoDecoderreturns aFallbackInfoinstance that can be queried after decoding at least one frame. This helps users understand why GPU decoding may not be used and debug performance issues.