@Dan-Flores Dan-Flores commented Sep 30, 2025

This PR updates the VideoEncoder to support encoding for common video container formats, nearly identically to the FFmpeg CLI.

Changes:

Some changes are made to align with the design in #907:

Testing

  • test_video_encoder_round_trip: Ensures that a video's decoded frames are the same after encoding then decoding.
    • mov, mp4, mkv, webm
  • test_video_encoder_against_ffmpeg_cli: Ensures that the VideoEncoder frames are the same as the FFmpeg CLI.
    • mov, mp4, avi, mkv, webm, flv, gif

Testing caveats

  • The crf parameter is needed to test lossless encoding in the round trip test. For formats that do not support crf, the round trip test is not available.
  • When lossy encoding occurs due to codec + pixel format selection, assert_close is substituted by assert_tensor_close_on_at_least with a lower percentage match (96-99), and a higher atol (2-15).
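
To illustrate the relaxed comparison described above, here is a minimal sketch of a "close on at least N percent of elements" check. It is not the test suite's assert_tensor_close_on_at_least: the function names are hypothetical, and it works on flat Python sequences instead of tensors so that it runs without torch.

```python
def fraction_within_atol(actual, expected, atol):
    """Return the fraction of element pairs whose absolute difference is <= atol."""
    if len(actual) != len(expected):
        raise ValueError("actual and expected must have the same length")
    if not actual:
        # Assumption for this sketch: empty inputs count as a full match.
        return 1.0
    close = sum(1 for a, e in zip(actual, expected) if abs(a - e) <= atol)
    return close / len(actual)


def assert_close_on_at_least(actual, expected, *, percentage, atol):
    """Raise AssertionError unless at least `percentage` percent of elements are within atol."""
    fraction = fraction_within_atol(actual, expected, atol)
    if fraction * 100 < percentage:
        raise AssertionError(
            f"only {fraction * 100:.1f}% of elements are within atol={atol}, "
            f"expected at least {percentage}%"
        )


# Three of the four values differ by at most 2, so a 75% threshold passes.
assert_close_on_at_least([0, 10, 20, 30], [0, 12, 20, 50], percentage=75, atol=2)
```

A lower percentage tolerates a few badly mismatched pixels, while a higher atol tolerates small differences everywhere; lossy codecs usually need some of both.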

@meta-cla meta-cla bot added the CLA Signed This label is managed by the Meta Open Source bot. label Sep 30, 2025
@Dan-Flores Dan-Flores force-pushed the encoder_accuracy branch 3 times, most recently from f414d0b to c29dee3 Compare October 6, 2025 15:39
@Dan-Flores Dan-Flores marked this pull request as ready for review October 7, 2025 14:51
meta-codesync bot commented Oct 10, 2025

@Dan-Flores has imported this pull request. If you are a Meta employee, you can view this in D84393092.

@NicolasHug NicolasHug left a comment

Nice, thanks for the PR @Dan-Flores! Approving with some minor comments below, nothing really blocking as long as CI is green.

test/test_ops.py Outdated
)

- def decode(self, file_path) -> torch.Tensor:
+ def decode(self, file_path, device="cpu") -> torch.Tensor:

I think "cpu" is always the default, isn't it? We should be able to revert 163c5c2 ?

test/test_ops.py Outdated
# If FFmpeg selects a codec or pixel format that uses qscale (not crf),
# the VideoEncoder outputs *slightly* different frames.
# There may be additional subtle differences in the encoder.
percentage = 94 if ffmpeg_version == 6 or format in ("avi") else 99

The above works but technically,

format in ("avi")

returns True if format is "avi", but also if it is "a", "v", "i", "av", "vi", or the empty string: ("avi") is a plain string, not a tuple, so this is a substring test.

Suggested change
- percentage = 94 if ffmpeg_version == 6 or format in ("avi") else 99
+ percentage = 94 if ffmpeg_version == 6 or format == "avi" else 99
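
For reference, a small standalone sketch of the pitfall behind this suggestion. The helper names are hypothetical and only exist to compare the three spellings side by side.

```python
def matches_parenthesized_string(fmt: str) -> bool:
    # ("avi") is just the string "avi", so `in` checks for a substring.
    return fmt in ("avi")


def matches_one_element_tuple(fmt: str) -> bool:
    # The trailing comma makes a real tuple, so `in` checks membership.
    return fmt in ("avi",)


def matches_equality(fmt: str) -> bool:
    # The form used in the suggested change.
    return fmt == "avi"


for candidate in ("avi", "av", "vi", "a", "", "mp4"):
    print(
        repr(candidate),
        matches_parenthesized_string(candidate),
        matches_one_element_tuple(candidate),
        matches_equality(candidate),
    )
```

Both the one-element tuple and the equality check accept only "avi"; the equality check is the simpler choice when there is a single format to compare against.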

0);
}
int status = avcodec_open2(avCodecContext_.get(), avCodec, &options);
av_dict_free(&options);

I guess this is OK. If we start using AVDictionary in more places, we should consider creating a UniqueAVDictionary smart pointer, like we did for the other objects, e.g.

using UniqueAVFrame =
std::unique_ptr<AVFrame, Deleterp<AVFrame, void, av_frame_free>>;

@Dan-Flores Dan-Flores merged commit 61202b9 into meta-pytorch:main Oct 13, 2025
57 checks passed
@Dan-Flores Dan-Flores deleted the encoder_accuracy branch October 13, 2025 20:58