
Conversation

@scotts (Contributor) commented Nov 21, 2025

Implements torchcodec.transforms.RandomCrop and also accepts torchvision.transforms.v2.RandomCrop. The key difference between this capability and Resize is that we need to:

  1. Compute a random location in the image to crop.
  2. Make that computation match exactly what TorchVision does.

Short version of how we accomplish this:

  1. If you give us the TorchVision object, we call make_params() on it to get the computed location.
  2. If you don't, we do the same calculation in TorchCodec. We'll need to use testing and code review to make sure these stay aligned.
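
Roughly, the two paths look like the sketch below. This is illustrative only: the torchcodec-side function names are made up, the make_params() dict keys ("top", "left") are an assumption that may differ across torchvision versions, and the native draw just mirrors the spirit of torchvision's RandomCrop.get_params().

import torch
from torchvision.transforms import v2


def crop_location_from_torchvision(tv_crop: v2.RandomCrop, frame: torch.Tensor):
    # Path 1: let TorchVision compute the crop location. make_params() is the
    # v2 hook mentioned above; the "top"/"left" keys are an assumption.
    params = tv_crop.make_params([frame])
    return params["top"], params["left"]


def crop_location_native(crop_h, crop_w, frame_h, frame_w):
    # Path 2: do the same draw in TorchCodec, in the spirit of
    # torchvision.transforms.RandomCrop.get_params(): top/left are drawn
    # uniformly so the crop stays inside the frame.
    top = int(torch.randint(0, frame_h - crop_h + 1, size=(1,)).item())
    left = int(torch.randint(0, frame_w - crop_w + 1, size=(1,)).item())
    return top, left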

Working on this transform also made me realize that DecoderTransform and its subclasses should not be dataclasses. I initially thought they would just be bags of values, but they're growing to have significant methods and internal state not exposed to users. In a follow-up PR, I'll refactor them into normal classes, much like the TorchVision versions. That felt too disruptive to do in this PR.

@meta-cla meta-cla bot added the CLA Signed This label is managed by the Meta Open Source bot. label Nov 21, 2025
-  int x = checkedToPositiveInt(cropTransformSpec[3]);
-  int y = checkedToPositiveInt(cropTransformSpec[4]);
+  int x = checkedToNonNegativeInt(cropTransformSpec[3]);
+  int y = checkedToNonNegativeInt(cropTransformSpec[4]);
@scotts (Contributor, Author):

The location (0, 0) is a valid image location. 🤦

@scotts marked this pull request as ready for review November 22, 2025
@scotts changed the title from "[WIP] Implement RandomCrop transform" to "Implement RandomCrop transform" Nov 22, 2025
if self._top is None or self._left is None:
# TODO: It would be very strange if only ONE of those is None. But should we
# make it an error? We can continue, but it would probably mean
# something bad happened. Dear reviewer, please register an opinion here:
Contributor (reviewer):

I agree it would appear something bad happened in this case.

But when calling this function, do we expect _top or _left to have any value? My understanding is that these fields are only set when _make_transform_spec is called, which is only called once per DecoderTransform.

@scotts (Contributor, Author):

It depends on whether RandomCrop was created via a TorchVision RandomCrop or instantiated directly. If created by a TorchVision RandomCrop in _from_torchvision(), then _top and _left should have values. If created directly, then neither should have a value, in which case we have to do our random logic.

It has occurred to me that maybe we don't need to call RandomCrop.make_params() in _from_torchvision(). Maybe we should just always set these values in _make_transform_spec().
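
Roughly, the two options look like this. This is a sketch only: the class shape, constructor, and the returned spec are illustrative, and only _from_torchvision(), _make_transform_spec(), _top, and _left come from this thread.

import torch


class RandomCrop:
    # Illustrative-only shape of the transform under discussion.
    def __init__(self, size):
        self._height, self._width = size
        self._top = None   # set either by _from_torchvision() or lazily below
        self._left = None

    @classmethod
    def _from_torchvision(cls, tv_crop):
        crop = cls(tv_crop.size)
        # Option A (current PR): call tv_crop.make_params() here and store the
        # resulting top/left in crop._top / crop._left.
        return crop

    def _make_transform_spec(self, input_dims):
        input_height, input_width = input_dims
        if self._top is None or self._left is None:
            # Option B (floated above): always draw the location here, whether
            # or not the transform came from a TorchVision object.
            self._top = int(torch.randint(0, input_height - self._height + 1, (1,)).item())
            self._left = int(torch.randint(0, input_width - self._width + 1, (1,)).item())
        # Illustrative spec layout; the real spec format lives in the PR.
        return (self._left, self._top, self._height, self._width)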

"v2 transform."
)
else:
input_dims = transform._get_output_dims(input_dims)
Contributor (reviewer):

Is _get_output_dims only used in this function for validation? I believe there are TODOs to move validation to the constructor, which is great, but I do not understand the returned value input_dims here.

@scotts (Contributor, Author) Nov 25, 2025:

It's not actually used for validation. Think of the transforms as a pipeline: A -> B -> C -> D. Each stage may change the dimensions of the frame. We need to track the frame dimensions as we move through the pipeline because some transforms need to know the dimensions of the input frame. RandomCrop is one such transform: in order to randomly determine a location to crop, it needs to know the input frame dimensions to know the bounds to pass to the random number generator.

The dimensions that A receives come from the originally decoded frame, which we can get from the metadata. But the dimensions for B are actually the output of A! That extends to each transform in the pipeline.

This probably deserves a comment. :)
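
Something like this hedged sketch of the dimension threading (the loop, function name, and tuple shape here are illustrative; only _make_transform_spec() and _get_output_dims() are from the actual code):

def resolve_transform_specs(transforms, decoded_height, decoded_width):
    # The first transform sees the decoded frame's dimensions (from metadata);
    # every later transform sees the *output* dimensions of the one before it.
    input_dims = (decoded_height, decoded_width)
    specs = []
    for transform in transforms:
        # e.g. RandomCrop uses input_dims to bound its random top/left draw.
        specs.append(transform._make_transform_spec(input_dims))
        input_dims = transform._get_output_dims(input_dims)
    return specs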
