Conversation

@mollyxu (Contributor) commented Dec 30, 2025

Add FPS to get_frames_played_in_range()

Context: #1133

This PR adds support for specifying a target frame rate when calling get_frames_played_in_range(). When fps is provided, the method resamples the video so that output frames fall at evenly spaced timestamps matching the requested frame rate, mimicking the behavior of FFmpeg's fps filter.

def get_frames_played_in_range(
    self,
    start_seconds: float,
    stop_seconds: float,
    fps: float | None = None,
) -> FrameBatch:

The way to read this API, depending on fps:

  • None (default): returns decoded frames between start_seconds and stop_seconds at the input frame rate. This is the existing behavior.
  • float: returns decoded frames between start_seconds and stop_seconds, resampled so that the output frame rate matches the value of fps.

Implementation

The resampling algorithm replicates FFmpeg's fps filter logic (a sketch follows the list):

  1. Frame count calculation: Uses round((stop - start) * target_fps) to determine the output frame count, matching FFmpeg's AV_ROUND_NEAR_INF rounding behavior.
  2. Frame mapping: For each source frame, computes output_index = round((pts - start) * target_fps). When multiple source frames map to the same output index, the last one wins.
  3. Gap filling: When upsampling (target FPS > source FPS), empty output slots reuse the previous valid frame (the frame that would still be "displayed" at that timestamp).
  4. Decode optimization: Consecutive output frames that use the same source frame are produced by copying the already-decoded tensor, avoiding redundant decodes.
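
For reference, a minimal Python sketch of the resampling scheme described above. It is illustrative only, not the decoder's C++ implementation; resample_indices and source_pts are hypothetical names.

def resample_indices(source_pts, start_seconds, stop_seconds, target_fps):
    # Number of output slots for the half-open range [start, stop), using
    # round() to mirror FFmpeg's AV_ROUND_NEAR_INF behavior (step 1).
    num_output = round((stop_seconds - start_seconds) * target_fps)

    # Map each source frame onto an output slot; frames landing on the same
    # slot are resolved as "last one wins" (step 2).
    slots = [None] * num_output
    for src_idx, pts in enumerate(source_pts):
        out_idx = round((pts - start_seconds) * target_fps)
        if 0 <= out_idx < num_output:
            slots[out_idx] = src_idx

    # Gap filling: an empty slot reuses the previous valid frame, i.e. the
    # frame that would still be displayed at that timestamp (step 3). Slots
    # before the first mapped frame stay None in this simplified sketch.
    last = None
    for i in range(num_output):
        if slots[i] is None:
            slots[i] = last
        else:
            last = slots[i]
    return slots

# Example: a ~12.5 fps source resampled to 25 fps duplicates each frame.
print(resample_indices([0.0, 0.08, 0.16, 0.24], 0.0, 0.32, 25.0))
# [0, 0, 1, 1, 2, 2, 3, 3]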

std::to_string(maxSeconds.value()) + ").");
}

// Resample frames to match the target frame rate
@mollyxu (Contributor, Author), Dec 30, 2025:

We could add an early break if requested fps is the same as the current fps. Not sure if it is a necessary use case

Contributor:

Let's handle that with a follow-up. Because we'll need to do equality on a double, there may be some subtle things to handle.
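
For what it's worth, a hedged sketch of what that follow-up check might look like (hypothetical helper, shown in Python for brevity; the real check would live in the C++ path):

import math

def fps_matches_source(requested_fps: float, source_fps: float) -> bool:
    # Exact equality on doubles is fragile (e.g. a source fps stored as a
    # rational like 30000/1001 vs a user-supplied float), so use a relative
    # tolerance before skipping the resampling path.
    return math.isclose(requested_fps, source_fps, rel_tol=1e-9)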

@mollyxu mollyxu marked this pull request as ready for review January 5, 2026 17:33
  double startSeconds,
- double stopSeconds) {
+ double stopSeconds,
+ std::optional<double> fps) {
Contributor:

I have a slight preference for SingleStreamDecoder to not take optionals, and for the bridge code in custom_ops.cpp to take care of optional logic. But, there are other APIs in SingleStreamDecoder which do take optionals, I'll invite @Dan-Flores and @NicolasHug to make a call here.

Contributor:

It seems natural to me to have an optional here. Letting custom_ops.cpp be the bridge would, I think, mean that we'd have to create more private methods to split the implementation (e.g. for the common checks at the beginning and the common logic at the end).

fps.value() > 0,
"fps must be positive, got " + std::to_string(fps.value()));

double fpsVal = static_cast<double>(fps.value());
Contributor:

Nit: no need to do a cast, as fps.value() returns a double.

// determine the frame count, which we replicate here for consistency.
// Examples:
// - 35.23 fps, 1.0s: round(35.23) = 35 frames
// - 35.87 fps, 1.0s: round(35.87) = 36 frames
Contributor:

Can you add a link to the resource where you learned the exact technique the fps filter uses? I assume it's the source code, in which case a stable link to that code would be great. We want to make it easy for others to make changes to this code, and they may need to look at that as well.

videoStreamOptions.device);
frameBatchOutput.data = maybePermuteHWC2CHW(frameBatchOutput.data);
return frameBatchOutput;
}
Contributor:

Is it ever valid for numOutputFrames < 0? That is, we ensure that startSeconds <= stopSeconds at the very top, and we also ensure that fpsVal > 0. So I think that if numOutputFrames < 0, that is an error.

And in the case that numOutputFrames == 0, I don't think we actually need to special case it. I think the remaining logic should be robust to that case.
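
To illustrate the point, a hedged Python sketch of that validation (helper name is hypothetical):

def compute_num_output_frames(start_seconds: float, stop_seconds: float, fps: float) -> int:
    # start_seconds <= stop_seconds and fps > 0 are validated upstream, so a
    # negative count would indicate an internal error, not bad user input.
    num_output_frames = round((stop_seconds - start_seconds) * fps)
    assert num_output_frames >= 0, "internal error: negative output frame count"
    # num_output_frames == 0 needs no special casing: the downstream loops
    # simply run zero times and produce an empty batch.
    return num_output_frames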

if (outputIdx >= 0) {
sourceFrameIndices[outputIdx] = j;
}
}
@scotts (Contributor), Jan 5, 2026:

This mapping logic will need to be seek mode aware. In approximate mode, streamInfo.allFrames won't be available and we'll need to do math to calculate the value. Check out secondsToIndexLowerBound() for inspiration.

We'll also need to test exact and approximate seek modes. :)
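
Roughly, the approximate-mode math might look like the sketch below (illustrative only; the names and the begin-stream offset handling are assumptions, not the actual secondsToIndexLowerBound()):

import math

def seconds_to_index_approx(seconds: float, average_fps: float,
                            begin_stream_seconds: float = 0.0) -> int:
    # Without a per-frame pts table (streamInfo.allFrames), estimate the frame
    # index from the stream's average frame rate.
    return int(math.floor((seconds - begin_stream_seconds) * average_fps))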

with pytest.raises(ValueError, match="Invalid stop seconds"):
    frame = decoder.get_frames_played_in_range(0, 23)  # noqa

@pytest.mark.parametrize("device", all_supported_devices())
Contributor:

Also parameterize on seek mode - I think we can skip custom and just do "exact" and "approximate". Based on the code above, I think "approximate" will currently fail.
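
A hedged sketch of that parametrization (the test name is made up; NASA_VIDEO and all_supported_devices are assumed to come from the existing test utilities):

import pytest
from torchcodec.decoders import VideoDecoder

@pytest.mark.parametrize("device", all_supported_devices())
@pytest.mark.parametrize("seek_mode", ("exact", "approximate"))
def test_get_frames_played_in_range_fps(device, seek_mode):
    decoder = VideoDecoder(NASA_VIDEO.path, device=device, seek_mode=seek_mode)
    frames = decoder.get_frames_played_in_range(0.0, 1.0, fps=10)
    # 1 second at 10 fps should yield round(1.0 * 10) = 10 frames.
    assert frames.data.shape[0] == 10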

with pytest.raises(RuntimeError, match="fps must be positive"):
    decoder.get_frames_played_in_range(start_seconds, stop_seconds, fps=-10)

@needs_ffmpeg_cli
Contributor:

Let's also parametrize this test on seek mode.

)

source_fps = decoder.metadata.average_fps
duration_seconds = 1.0
Contributor:

We'll need some tests that extract a longer duration. In particular, we'll want to make sure we have tests that decode the entire video. I suspect we'll hit some edge-cases in our dropping and duping logic when we decode all frames.
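
For example, something along these lines might be a useful addition (again a sketch; the asset, metadata fields, and expected counts are assumptions):

import pytest
from torchcodec.decoders import VideoDecoder

@pytest.mark.parametrize("fps", (1, 10, 30, 60))
@pytest.mark.parametrize("seek_mode", ("exact", "approximate"))
def test_fps_over_entire_video(fps, seek_mode):
    decoder = VideoDecoder(NASA_VIDEO.path, seek_mode=seek_mode)
    start = decoder.metadata.begin_stream_seconds
    stop = decoder.metadata.end_stream_seconds
    frames = decoder.get_frames_played_in_range(start, stop, fps=fps)
    # Decoding the full range should exercise the drop/dup logic at both ends.
    assert frames.data.shape[0] == round((stop - start) * fps)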


  def get_frames_played_in_range(
-     self, start_seconds: float, stop_seconds: float
+     self, start_seconds: float, stop_seconds: float, fps: float | None = None
@NicolasHug (Contributor), Jan 6, 2026:

If we expose the fps functionality in this method, then for convenience I think we should make sure to also expose a get_all_frames(fps) method, so that users don't have to manually specify both start and stop. (Exposing this get_all_frames() method is something we should do regardless).
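
A hedged sketch of what such a convenience wrapper could look like (get_all_frames is the proposed name, not an existing API; shown here as a free function over a VideoDecoder):

from torchcodec import FrameBatch
from torchcodec.decoders import VideoDecoder

def get_all_frames(decoder: VideoDecoder, *, fps: float | None = None) -> FrameBatch:
    # Decode the whole stream, optionally resampled to a target fps, without
    # requiring the caller to supply start/stop explicitly.
    return decoder.get_frames_played_in_range(
        decoder.metadata.begin_stream_seconds,
        decoder.metadata.end_stream_seconds,
        fps=fps,
    )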

Comment on lines +935 to +938
for (int64_t i = 0; i < numOutputFrames; ++i) {
double targetPts = startSeconds + i * frameDuration;
targetTimestamps.push_back(targetPts);
}
Contributor:

This logic looks a lot like what we already have in the time-based samplers with the seconds_between_frames parameter:

# The frames of a clip will be sampled at the following pts:
# clip_timestamps = [
# clip_start + 0 * seconds_between_frames,
# clip_start + 1 * seconds_between_frames,
# clip_start + 2 * seconds_between_frames,
# ...
# clip_start + (num_frames_per_clip - 1) * seconds_between_frames,
# ]

I could be wrong, but I suspect there is some overlap between the new fps parameter and the existing seconds_between_frames parameter in terms of functionality. It's been a while since I implemented the latter, so I don't fully remember the details of how exactly it behaves (does it also drop and duplicate frames in the same way? I'm not sure).

Generally, a good design principle is that "There should be one-- and preferably only one --obvious way to do it."

I think it might be worth resolving this ambiguity - @mollyxu, can I ask you to do a bit of investigation and figure out the main behavior differences between fps and seconds_between_frames? Once we know how they differ in terms of behavior, it'll be easier to figure out a correct, unambiguous and holistic API for fps output.
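
For orientation, the rough relationship between the two knobs (illustrative only; whether the sampler drops and duplicates frames the same way is exactly what needs investigating):

# A target fps and a seconds_between_frames value describe the same sampling
# grid when seconds_between_frames == 1 / fps.
target_fps = 10.0
seconds_between_frames = 1.0 / target_fps  # 0.1 s
clip_start, clip_duration = 0.0, 1.0
num_frames = round(clip_duration * target_fps)  # 10
clip_timestamps = [clip_start + i * seconds_between_frames for i in range(num_frames)]
# approximately [0.0, 0.1, 0.2, ..., 0.9]: the same grid either parameter would produce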

// Calculate the exact number of output frames for the half-open range
// [startSeconds, stopSeconds). FFmpeg's fps filter drops the frames until
// the next frame would overshoot.
// https://github.com/FFmpeg/FFmpeg/blob/n7.1/libavfilter/vf_fps.c#L290-L291
Contributor:

We should make this https://github.com/FFmpeg/FFmpeg/blob/b08d7969c550a804a59511c7b83f2dd8cc0499b8/libavfilter/vf_fps.c#L290-L291 instead. That is, we want a permalink which includes the commit hash in the URL. That will never change, even as the code is changed on main. As the code on main changes, lines 290 and 291 will eventually be different code.

(On GitHub, to get a permalink, click the three dots next to a line number, and it brings up a dialog box you can use to copy the permalink.)
