Transforms bridge between Python and C++ #948

scotts · 2025-10-08T20:37:14Z

Next step after #902. The design in #885 punted on how the Python layer would communicate the transforms and their parameters to the C++ layer. This PR answers that question: a string. The string format is:

"name1, param1, param2, ...; name2, param1, param2, ...; name3, param1, param2, ..."

In the above, nameX is the name of a transform, and paramX are the parameters that transform accepts. For example, the only transform that we have now is resize, and its spec is currently:

"resize, <height>, <width>"

Here resize is the literal transform name, and <height> and <width> are integers giving the output height and width. In the future we will add a third parameter for the algorithm. Future transforms may take a different number of parameters, with different types; we'll define the exact spec for each transform when we add it.
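To make the format concrete, here is a minimal sketch of how a string in this shape could be split into entries and how a resize entry could be checked. It is not the parser from this PR: ParsedTransform, parseTransformSpecs, and resizeDims are hypothetical names, and the whitespace handling and error behavior are assumptions.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// Hypothetical holder for one parsed entry; not a torchcodec type.
struct ParsedTransform {
  std::string name;
  std::vector<std::string> params;
};

// Strip leading and trailing spaces and tabs.
inline std::string_view trim(std::string_view s) {
  const auto first = s.find_first_not_of(" \t");
  if (first == std::string_view::npos) {
    return {};
  }
  const auto last = s.find_last_not_of(" \t");
  return s.substr(first, last - first + 1);
}

// Split on a single-character delimiter, trimming each piece.
// Always returns at least one (possibly empty) piece.
inline std::vector<std::string_view> split(std::string_view s, char delim) {
  std::vector<std::string_view> parts;
  std::size_t start = 0;
  while (true) {
    const auto pos = s.find(delim, start);
    if (pos == std::string_view::npos) {
      parts.push_back(trim(s.substr(start)));
      break;
    }
    parts.push_back(trim(s.substr(start, pos - start)));
    start = pos + 1;
  }
  return parts;
}

// Parse "name, p1, p2; name, p1, ..." into entries.
// A blank spec yields no entries; an entry without a name is an error.
inline std::vector<ParsedTransform> parseTransformSpecs(std::string_view specs) {
  std::vector<ParsedTransform> result;
  if (trim(specs).empty()) {
    return result;
  }
  for (std::string_view entry : split(specs, ';')) {
    const auto fields = split(entry, ',');
    if (fields[0].empty()) {
      throw std::invalid_argument("transform spec is missing a name");
    }
    ParsedTransform t;
    t.name = std::string(fields[0]);
    for (std::size_t i = 1; i < fields.size(); ++i) {
      t.params.emplace_back(fields[i]);
    }
    result.push_back(std::move(t));
  }
  return result;
}

// Check that an entry is "resize, <height>, <width>" and return {height, width}.
// std::stoll throws if a parameter does not start with an integer.
inline std::pair<int64_t, int64_t> resizeDims(const ParsedTransform& t) {
  if (t.name != "resize" || t.params.size() != 2) {
    throw std::invalid_argument("expected: resize, <height>, <width>");
  }
  return {std::stoll(t.params[0]), std::stoll(t.params[1])};
}
```

With this sketch, parseTransformSpecs("resize, 1024, 768") produces one entry named resize, and resizeDims on it gives a height of 1024 and a width of 768.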

I don't love that we're using strings with our own little specification language, but I'm convinced this is the least bad option:

- It's possible to use tensors, but it would be uglier and more esoteric. Because tensors are limited to number types, we'd have to map numbers to transform kinds. For example, we could say that 0 -> resize, and then if we wanted to specify a resize operation of height 1024 and width 768, we could say torch.tensor([0, 1024, 768]). But both the Python and C++ sides would need to know this mapping from integer to transform. Yes, that's technically true with strings too, but it's rather obvious what "resize" means. The machinery required for this approach is even more than what's required to accept our little string spec language.
- JSON is overkill as we have a constrained input. I'd rather not parse full JSON.
- Users aren't actually exposed to this specification language. It exists only in the core API. The VideoDecoder class will be responsible for translating from torchvision.transforms.v2 to these specification strings. Since it's our own code that will generate these specs, we don't need to worry about making something with sharp edges that will cut users.
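As a rough illustration of the tensor alternative rejected above, the sketch below decodes a flat integer buffer, with a std::vector<int64_t> standing in for torch.tensor([0, 1024, 768]). The TransformKind enum and describeEncoded are hypothetical and exist only to show the extra shared mapping that a numeric encoding would require.

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical numeric codes that both Python and C++ would have to agree on.
enum class TransformKind : int64_t { Resize = 0 };

// Turn an encoded buffer such as {0, 1024, 768} back into a readable spec.
// The first element selects the transform; the rest are its parameters.
inline std::string describeEncoded(const std::vector<int64_t>& encoded) {
  if (encoded.empty()) {
    throw std::invalid_argument("empty encoding");
  }
  switch (static_cast<TransformKind>(encoded[0])) {
    case TransformKind::Resize:
      if (encoded.size() != 3) {
        throw std::invalid_argument("resize expects a height and a width");
      }
      return "resize, " + std::to_string(encoded[1]) + ", " +
          std::to_string(encoded[2]);
  }
  // Any code without an enumerator lands here.
  throw std::invalid_argument("unknown transform code");
}
```

Nothing in {0, 1024, 768} says what the leading 0 means without consulting the enum, whereas the string form carries the name with it.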

scotts · 2025-10-09T03:04:27Z

src/torchcodec/_core/custom_ops.cpp

    std::optional<int64_t> stream_index = std::nullopt,
    std::string_view device = "cpu",
    std::string_view device_variant = "default",
+    std::string_view transform_specs = "",


Note that we're using an empty spec, "", as the default rather than making it optional. I find this makes the code simpler and easier to reason about.

scotts · 2025-10-09T03:05:29Z

src/torchcodec/_core/custom_ops.cpp

  videoStreamOptions.deviceVariant = device_variant;

+  std::vector<Transform*> transforms =
+      makeTransforms(std::string(transform_specs));


An example of how a default empty spec makes things simpler than an optional: we always call this function. If the spec is empty, we just get back an empty vector.
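A small sketch of that point, using a hypothetical stand-in for makeTransforms that only extracts transform names (the real function returns Transform objects and is not reproduced here). Because the loop never runs on an empty string, the caller needs no "was a spec provided?" branch.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical stand-in for makeTransforms: one name per ';'-separated entry.
inline std::vector<std::string> makeTransformNames(std::string_view specs) {
  std::vector<std::string> names;
  std::size_t start = 0;
  while (start < specs.size()) {
    std::size_t end = specs.find(';', start);
    if (end == std::string_view::npos) {
      end = specs.size();
    }
    // The name is the text before the first comma of the entry.
    std::string_view entry = specs.substr(start, end - start);
    entry = entry.substr(0, entry.find(','));
    // Drop surrounding spaces; skip entries that are entirely blank.
    const auto first = entry.find_first_not_of(' ');
    if (first != std::string_view::npos) {
      const auto last = entry.find_last_not_of(' ');
      names.emplace_back(entry.substr(first, last - first + 1));
    }
    start = end + 1;
  }
  return names;
}

// With "" as the default, the call is unconditional and an omitted spec
// simply produces an empty vector.
inline std::size_t countTransforms(std::string_view transform_specs = "") {
  return makeTransformNames(transform_specs).size();
}
```

Here countTransforms() and countTransforms("") both return 0 through the same code path that handles a non-empty spec.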

Transforms bridge between Python and C++

c8d546e

meta-cla bot added the CLA Signed This label is managed by the Meta Open Source bot. label Oct 8, 2025

scotts added 3 commits October 8, 2025 18:51

Better error checking

7143f15

Fidx GPU benchmark

a964447

Better comments

3db7aaa

scotts commented Oct 9, 2025


scotts marked this pull request as ready for review October 9, 2025 03:05

NicolasHug approved these changes Oct 10, 2025


scotts merged commit e5b2eef into meta-pytorch:main Oct 10, 2025
50 checks passed

scotts deleted the transforms_bridge branch October 10, 2025 14:07
