Add FPS to get_frames_played_in_range()
#1148
base: main
Conversation
|         std::to_string(maxSeconds.value()) + ").");
|   }
|
|   // Resample frames to match the target frame rate
We could add an early break if the requested fps is the same as the current fps. Not sure if it is a necessary use case.
Let's handle that with a follow-up. Because we'll need to do equality on a double, there may be some subtle things to handle.
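For reference, a minimal pure-Python sketch of the tolerance-based comparison such a follow-up would likely need (illustrative only; the real check would live in the C++ decoder, and the tolerance value is arbitrary):

```python
import math

def requested_fps_matches_source(requested_fps: float, source_fps: float) -> bool:
    # Exact == on doubles is brittle: a user-facing 29.97 is not bit-identical
    # to the true NTSC rate 30000/1001, so a relative tolerance is safer.
    return math.isclose(requested_fps, source_fps, rel_tol=1e-5)

print(30000 / 1001 == 29.97)                              # False
print(requested_fps_matches_source(30000 / 1001, 29.97))  # True
```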
|     double startSeconds,
|     double stopSeconds) {
|     double stopSeconds,
|     std::optional<double> fps) {
I have a slight preference for SingleStreamDecoder to not take optionals, and for the bridge code in custom_ops.cpp to take care of the optional logic. But there are other APIs in SingleStreamDecoder which do take optionals, so I'll invite @Dan-Flores and @NicolasHug to make a call here.
It seems natural to me to have an optional here. Letting custom_ops.cpp be the bridge would, I think, mean that we'd have to create more private methods to split the implementation (e.g. the common checks at the beginning, and the common logic at the end).
|       fps.value() > 0,
|       "fps must be positive, got " + std::to_string(fps.value()));
|
|   double fpsVal = static_cast<double>(fps.value());
Nit: no need to do a cast, as fps.value() returns a double.
|   // determine the frame count, which we replicate here for consistency.
|   // Examples:
|   // - 35.23 fps, 1.0s: round(35.23) = 35 frames
|   // - 35.87 fps, 1.0s: round(35.87) = 36 frames
Can you add a link to the resource where you learned the exact technique the fps filter uses? I assume it's the source code, in which case a stable link to that code would be great. We want to make it easy for others to make changes to this code, and they may need to look at it as well.
|       videoStreamOptions.device);
|   frameBatchOutput.data = maybePermuteHWC2CHW(frameBatchOutput.data);
|   return frameBatchOutput;
| }
Is it ever valid for numOutputFrames < 0? That is, we ensure that startSeconds <= stopSeconds at the very top, and we also ensure that fpsVal > 0. So I think that if numOutputFrames < 0, that is an error.
And in the case that numOutputFrames == 0, I don't think we actually need to special case it. I think the remaining logic should be robust to that case.
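To make the reasoning concrete, a tiny standalone sketch (plain Python, not the decoder code) of why the count can't go negative and why zero needs no special case:

```python
def num_output_frames(start: float, stop: float, fps: float) -> int:
    # Mirrors the checks performed earlier in the C++ function: with
    # stop >= start and fps > 0, the rounded product is never negative.
    assert stop >= start and fps > 0
    return round((stop - start) * fps)

print(num_output_frames(5.0, 5.0, 30.0))    # 0 -> the timestamp loop simply runs zero times
print(num_output_frames(5.0, 5.004, 30.0))  # 0 -> likewise, empty output, no special-casing
print(num_output_frames(5.0, 6.0, 30.0))    # 30
```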
|     if (outputIdx >= 0) {
|       sourceFrameIndices[outputIdx] = j;
|     }
|   }
This mapping logic will need to be seek mode aware. In approximate mode, streamInfo.allFrames won't be available and we'll need to do math to calculate the value. Check out secondsToIndexLowerBound() for inspiration.
We'll also need to test exact and approximate seek modes. :)
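For the approximate path, something in the spirit of the following may be needed (hypothetical Python sketch of the math only, loosely modeled on secondsToIndexLowerBound(), not its actual implementation):

```python
import math

def seconds_to_index_approximate(seconds: float, average_fps: float) -> int:
    # Without streamInfo.allFrames there is no per-frame pts table, so the
    # source index has to be estimated from the stream's average frame rate.
    return int(math.floor(seconds * average_fps))

print(seconds_to_index_approximate(0.0, 29.97))  # 0
print(seconds_to_index_approximate(1.0, 29.97))  # 29
```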
| with pytest.raises(ValueError, match="Invalid stop seconds"):
|     frame = decoder.get_frames_played_in_range(0, 23)  # noqa
|
| @pytest.mark.parametrize("device", all_supported_devices())
Also parameterize on seek mode - I think we can skip custom and just do "exact" and "approximate". Based on the code above, I think "approximate" will currently fail.
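A rough sketch of the shape this parametrization could take (hypothetical test; VIDEO_PATH stands in for whatever asset the existing tests use, and the expected count follows the round((stop - start) * fps) rule from the PR description):

```python
import pytest
from torchcodec.decoders import VideoDecoder

# all_supported_devices() comes from the existing test utilities in this file.
@pytest.mark.parametrize("device", all_supported_devices())
@pytest.mark.parametrize("seek_mode", ("exact", "approximate"))
def test_fps_across_seek_modes(device, seek_mode):
    decoder = VideoDecoder(VIDEO_PATH, device=device, seek_mode=seek_mode)
    frames = decoder.get_frames_played_in_range(0.0, 1.0, fps=10)
    # 1 second at 10 fps -> round(1.0 * 10) = 10 output frames.
    assert frames.data.shape[0] == 10
```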
| with pytest.raises(RuntimeError, match="fps must be positive"):
|     decoder.get_frames_played_in_range(start_seconds, stop_seconds, fps=-10)
|
| @needs_ffmpeg_cli
Let's also parametrize this test on seek mode.
| )
|
| source_fps = decoder.metadata.average_fps
| duration_seconds = 1.0
We'll need some tests that extract a longer duration. In particular, we'll want to make sure we have tests that decode the entire video. I suspect we'll hit some edge-cases in our dropping and duping logic when we decode all frames.
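For example, a full-duration variant might look like this (hypothetical sketch; VIDEO_PATH is again a placeholder, and the assertion assumes the rounding rule from the PR description holds over the whole stream):

```python
import pytest
from torchcodec.decoders import VideoDecoder

@pytest.mark.parametrize("target_fps", (5, 15, 60))
def test_fps_over_entire_video(target_fps):
    decoder = VideoDecoder(VIDEO_PATH)
    duration = decoder.metadata.duration_seconds
    frames = decoder.get_frames_played_in_range(0.0, duration, fps=target_fps)
    # Decoding the whole video should exercise the dropping/duplication paths
    # near the stream boundaries.
    assert frames.data.shape[0] == round(duration * target_fps)
```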
|
| def get_frames_played_in_range(
|     self, start_seconds: float, stop_seconds: float
|     self, start_seconds: float, stop_seconds: float, fps: float | None = None
If we expose the fps functionality in this method, then for convenience I think we should make sure to also expose a get_all_frames(fps) method, so that users don't have to manually specify both start and stop. (Exposing this get_all_frames() method is something we should do regardless).
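A rough sketch of what that wrapper could look like on VideoDecoder (hypothetical; the metadata field names are assumptions and the real method would need docs and validation):

```python
def get_all_frames(self, *, fps: float | None = None) -> FrameBatch:
    # Delegate to the range-based method over the full playable range.
    # begin_stream_seconds / end_stream_seconds are assumed to bound that
    # range; adjust if the actual metadata fields differ.
    return self.get_frames_played_in_range(
        self.metadata.begin_stream_seconds,
        self.metadata.end_stream_seconds,
        fps=fps,
    )
```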
|   for (int64_t i = 0; i < numOutputFrames; ++i) {
|     double targetPts = startSeconds + i * frameDuration;
|     targetTimestamps.push_back(targetPts);
|   }
This logic looks a lot like what we already have in the time-based samplers with the seconds_between_frames parameter:
torchcodec/src/torchcodec/samplers/_time_based.py
Lines 94 to 101 in 50686bb
| # The frames of a clip will be sampled at the following pts:
| # clip_timestamps = [
| #     clip_start + 0 * seconds_between_frames,
| #     clip_start + 1 * seconds_between_frames,
| #     clip_start + 2 * seconds_between_frames,
| #     ...
| #     clip_start + (num_frames_per_clip - 1) * seconds_between_frames,
| # ]
I could be wrong, but I suspect there is some overlap between the new fps parameter and the existing seconds_between_frames parameter in terms of functionality. It's been a while since I implemented the latter, so I don't fully remember the details of how exactly it behaves (does it also drop and duplicate frames in the same way? I'm not sure).
Generally, a good design principle is that "there should be one -- and preferably only one -- obvious way to do it."
I think it might be worth resolving this ambiguity - @mollyxu, can I ask you to do a bit of investigation and figure out the main behavior differences between fps and seconds_between_frames? Once we know how they differ in terms of behavior, it'll be easier to figure out a correct, unambiguous and holistic API for fps output.
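To illustrate why the two look related, here is a small pure-Python comparison of the timestamps each knob produces (the correspondence seconds_between_frames ~ 1 / fps is my assumption; whether the drop/duplicate behavior then matches is exactly the open question):

```python
start, stop, fps = 0.0, 1.0, 4.0
seconds_between_frames = 1.0 / fps  # assumed correspondence between the two knobs

fps_timestamps = [start + i / fps for i in range(round((stop - start) * fps))]
sampler_timestamps = [start + i * seconds_between_frames for i in range(4)]

print(fps_timestamps)      # [0.0, 0.25, 0.5, 0.75]
print(sampler_timestamps)  # [0.0, 0.25, 0.5, 0.75]
```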
|   // Calculate the exact number of output frames for the half-open range
|   // [startSeconds, stopSeconds). FFmpeg's fps filter drops the frames until
|   // the next frame would overshoot.
|   // https://github.com/FFmpeg/FFmpeg/blob/n7.1/libavfilter/vf_fps.c#L290-L291
We should make this https://github.com/FFmpeg/FFmpeg/blob/b08d7969c550a804a59511c7b83f2dd8cc0499b8/libavfilter/vf_fps.c#L290-L291 instead. That is, we want a permalink which includes the commit hash in the URL. That will never change, even as the code is changed on main. As the code on main changes, lines 290 and 291 will eventually be different code.
(On GitHub, to get a permalink, click the three dots next to a line number, and it brings up a dialog box you can use to copy the permalink.)
Add FPS to get_frames_played_in_range()

Context: #1133

This PR adds support for specifying a target frame rate when calling get_frames_played_in_range(). When target_fps is provided, the method resamples the video to output frames at evenly-spaced timestamps matching the target FPS, mimicking FFmpeg's fps filter behavior. The way to read this API: if fps is

- None (default): returns decoded frames between start_seconds and stop_seconds at the input frame rate. This is the existing behavior.
- a float: returns decoded frames between start_seconds and stop_seconds where the output frame rate matches the value of fps.

Implementation

The resampling algorithm replicates FFmpeg's fps filter logic:

- round((stop - start) * target_fps) determines the output frame count, matching FFmpeg's AV_ROUND_NEAR_INF rounding behavior.
- Each source frame maps to an output slot via output_index = round((pts - start) * target_fps). Multiple source frames mapping to the same output index results in the last one being used.
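The mapping described above can be illustrated with a small standalone sketch (pure Python mirroring the description, not the C++ implementation; the gap-filling step is my reading of the frame-duplication behavior discussed in review, not code taken from the diff):

```python
def resample_to_fps(source_pts, start, stop, target_fps):
    # Output frame count for the half-open range [start, stop).
    num_out = round((stop - start) * target_fps)
    slots = [None] * num_out
    for src_idx, pts in enumerate(source_pts):
        out_idx = round((pts - start) * target_fps)
        if 0 <= out_idx < num_out:
            slots[out_idx] = src_idx  # last source frame mapping here wins (drop)
    for i in range(1, num_out):
        if slots[i] is None:
            slots[i] = slots[i - 1]   # reuse the previous frame to fill gaps (dup)
    return slots

# 5 source frames at 5 fps resampled to 3 fps over [0, 1): source indices 0, 2, 4.
print(resample_to_fps([0.0, 0.2, 0.4, 0.6, 0.8], 0.0, 1.0, 3.0))  # [0, 2, 4]
```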