54 changes: 50 additions & 4 deletions src/torchcodec/_core/ops.py
@@ -63,10 +63,56 @@ def load_torchcodec_shared_libraries() -> tuple[int, str]:
pybind_ops_module_name, pybind_ops_library_path
)
return ffmpeg_major_version, core_library_path
except Exception:
# Capture the full traceback for this exception
exc_traceback = traceback.format_exc()
exceptions.append((ffmpeg_major_version, exc_traceback))
except Exception as e:
# Below: all this block is just about trying to provide a more
# condensed and more informative exception message to the user,
# instead of dumping the entire traceback for each FFmpeg version,
# which would be too verbose.
# TODO: we really need a decent log system with different verbosity
# levels instead of this.
if isinstance(e, ImportError) and "No spec found for libtorchcodec" in str(
e
):
# This should only happen when building from source for a single
# target FFmpeg version.
exceptions.append(
(
ffmpeg_major_version,
f"Could not find one of the libtorchcodec* libraries, probably because TorchCodec wasn't built for FFmpeg {ffmpeg_major_version}.\n",
)
)
else:
full_traceback_str = traceback.format_exc()
# If we get something like this:
# OSError: Could not load this library: [...]torchcodec/src/torchcodec/libtorchcodec_core6.so
# Then in the traceback we try to find a line like this:
# OSError: libavcodec.so.60: cannot open shared object file: No such file or directory
# which should robustly indicate that the corresponding FFmpeg
# version is just not installed, or can't be found.
missing_ffmpeg_lib = next(
(
line.strip()
for line in full_traceback_str.splitlines()
if "libav" in line and "No such file or directory" in line
),
None,
)
if (
isinstance(e, OSError)
and ("Could not load this library") in str(e)
and "libtorchcodec" in (str(e))
Contributor

To define missing_ffmpeg_lib above, we search the traceback line by line, but here we search the entire exception string for the two parts of the error message. Is there a reason for the difference?

It seems "Could not load this library" and "libtorchcodec" appear on the same line, so reusing the line-by-line pattern might clarify the conditions we check before adding a condensed error message.

...
    raise OSError(f"Could not load this library: {path}") from e
OSError: Could not load this library: /home/nicolashug/.opt/miniconda3/envs/codec/lib/python3.11/site-packages/torchcodec/libtorchcodec_core6.so
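
One possible reading of the suggestion above is a single line-by-line helper used for both checks. The sketch below is illustrative only: `find_line_containing` is a hypothetical name that does not appear in the PR, and the sample traceback uses a placeholder path.

```python
from typing import Optional


def find_line_containing(text: str, *needles: str) -> Optional[str]:
    """Return the first stripped line of `text` containing every needle, else None.

    Hypothetical helper, not part of TorchCodec.
    """
    return next(
        (
            line.strip()
            for line in text.splitlines()
            if all(needle in line for needle in needles)
        ),
        None,
    )


# Shortened stand-in for a real traceback; the path is a placeholder.
sample_traceback = """\
OSError: libavcodec.so.61: cannot open shared object file: No such file or directory

The above exception was the direct cause of the following exception:

OSError: Could not load this library: /path/to/libtorchcodec_core7.so
"""

# Same pattern for both conditions: each pair of substrings must share a line.
missing_ffmpeg_lib = find_line_containing(
    sample_traceback, "libav", "No such file or directory"
)
failed_library_line = find_line_containing(
    sample_traceback, "Could not load this library", "libtorchcodec"
)
```

With this shape, the condition would reduce to checking that both `missing_ffmpeg_lib` and `failed_library_line` are not `None`, at the cost of searching the traceback text rather than `str(e)`.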

and missing_ffmpeg_lib
):
exceptions.append(
(
ffmpeg_major_version,
f"Got the following exception: {missing_ffmpeg_lib}\n"
f"FFmpeg version {ffmpeg_major_version} is likely not installed or its libraries cannot be found on this system.\n",
)
)
else:
Contributor

Were you able to test any import failures that hit this else case?

I do not recall the particulars of the stack trace, but if a stack trace begins with an OSError and later lists a root cause after "The above exception was the direct cause of the following exception", we will append the condensed OSError message instead of the potentially more informative root cause.

Contributor Author

In the scenario you mention above, where we have an OSError and then another exception stacked on top of it, we will only condense the error message if that other exception is of the form

OSError: libavcodec.so.60: cannot open shared object file: No such file or directory

i.e. only if it directly relates to libav* libraries.

In other words, we're only condensing this

FFmpeg version 7:
Traceback (most recent call last):
  File "/home/nicolashug/.opt/miniconda3/envs/codec/lib/python3.11/site-packages/torch/_ops.py", line 1487, in load_library
    ctypes.CDLL(path)
  File "/home/nicolashug/.opt/miniconda3/envs/codec/lib/python3.11/ctypes/__init__.py", line 376, in __init__
    self._handle = _dlopen(self._name, mode)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^
OSError: libavcodec.so.61: cannot open shared object file: No such file or directory

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/nicolashug/.opt/miniconda3/envs/codec/lib/python3.11/site-packages/torchcodec/_core/ops.py", line 57, in load_torchcodec_shared_libraries
    torch.ops.load_library(core_library_path)
  File "/home/nicolashug/.opt/miniconda3/envs/codec/lib/python3.11/site-packages/torch/_ops.py", line 1489, in load_library
    raise OSError(f"Could not load this library: {path}") from e
OSError: Could not load this library: /home/nicolashug/.opt/miniconda3/envs/codec/lib/python3.11/site-packages/torchcodec/libtorchcodec_core7.so

into this:

FFmpeg version 7:
Got the following exception: OSError: libavcodec.so.61: cannot open shared object file: No such file or directory
FFmpeg version 7 is likely not installed or its libraries cannot be found on this system.

Any other import error will not be condensed and will hit this else branch, showing the full logs.
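
The rule described above can be sketched as a standalone function to make the two outcomes easy to check. This is a minimal illustration under stated assumptions, not the code in the PR: `condense_loading_error` and `_simulate_failed_load` are hypothetical names, the library path is a placeholder, and the simulated failure only imitates the chained `OSError` shown in the traceback above.

```python
import traceback
from typing import Tuple


def condense_loading_error(
    exc: BaseException, full_traceback: str, ffmpeg_major_version: int
) -> str:
    """Return a short message for a missing-FFmpeg failure, else the full traceback.

    Hypothetical helper that mirrors the conditions discussed in this thread.
    """
    # Look for the dynamic loader's root-cause line, e.g.
    # "OSError: libavcodec.so.61: cannot open shared object file: ..."
    missing_ffmpeg_lib = next(
        (
            line.strip()
            for line in full_traceback.splitlines()
            if "libav" in line and "No such file or directory" in line
        ),
        None,
    )
    if (
        isinstance(exc, OSError)
        and "Could not load this library" in str(exc)
        and "libtorchcodec" in str(exc)
        and missing_ffmpeg_lib
    ):
        return (
            f"Got the following exception: {missing_ffmpeg_lib}\n"
            f"FFmpeg version {ffmpeg_major_version} is likely not installed "
            "or its libraries cannot be found on this system.\n"
        )
    # Anything else is left untouched so no information is lost.
    return full_traceback


def _simulate_failed_load(root_cause_message: str) -> Tuple[BaseException, str]:
    """Raise a chained OSError and capture it together with its traceback."""
    try:
        try:
            raise OSError(root_cause_message)
        except OSError as e:
            # Placeholder path, used for illustration only.
            raise OSError(
                "Could not load this library: /path/to/libtorchcodec_core7.so"
            ) from e
    except OSError as outer:
        return outer, traceback.format_exc()


# Case 1: the root cause is a missing libav* library, so the message is condensed.
exc, tb = _simulate_failed_load(
    "libavcodec.so.61: cannot open shared object file: No such file or directory"
)
condensed = condense_loading_error(exc, tb, 7)

# Case 2: the root cause is unrelated to libav*, so the full traceback is kept.
other_exc, other_tb = _simulate_failed_load("undefined symbol: some_symbol")
not_condensed = condense_loading_error(other_exc, other_tb, 7)
```

In the second case the returned string is the unmodified traceback, including the root cause and the "direct cause" separator, which corresponds to the else branch discussed here.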

# We can't identify the issue, so we just return the full traceback.
exceptions.append((ffmpeg_major_version, full_traceback_str))

traceback_info = (
"\n[start of libtorchcodec loading traceback]\n"