@@ -87,15 +87,15 @@ class CpuDeviceInterface : public DeviceInterface {
   UniqueSwsContext swsContext_;
   SwsFrameContext prevSwsFrameContext_;
 
-  // We pass the filters to FFmpeg's filtergraph API. It is a simple pipeline
+  // We pass these filters to FFmpeg's filtergraph API. It is a simple pipeline
   // of what FFmpeg calls "filters" to apply to decoded frames before returning
   // them. In the PyTorch ecosystem, we call these "transforms". During
   // initialization, we convert the user-supplied transforms into this string of
   // filters.
   //
-  // Note that we start with the format conversion, and then we ensure that the
-  // user-supplied filters always happen BEFORE the format conversion. We want
-  // the user-supplied filters to operate on frames in their original pixel
+  // Note that we start with just the format conversion, and then we ensure that
+  // the user-supplied filters always happen BEFORE the format conversion. We
+  // want the user-supplied filters to operate on frames in their original pixel
   // format and colorspace.
   //
   // The reason why is not obvious: when users do not need to perform any
@@ -111,6 +111,14 @@ class CpuDeviceInterface : public DeviceInterface {
   // we could achieve that by calling sws_scale() twice: once to do the resize
   // and another time to do the format conversion. But that will be slower,
   // which goes against the whole point of calling sws_scale() directly.
+  //
+  // Further note that we also configure the sink node of the filtergraph to
+  // be AV_PIX_FMT_RGB24. However, the explicit format conversion in the
+  // filters is not redundant. Filtergraph will automatically insert scale
+  // filters that will change the resolution and format of frames to meet the
+  // requirements of downstream filters. If we don't put an explicit format
+  // conversion to rgb24 at the end, filtergraph may automatically insert format
+  // conversions before our filters.
   std::string filters_ = "format=rgb24";
 
   // The flags we supply to swsContext_, if it used. The flags control the
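The ordering rule described in the comment (user-supplied filters first, explicit format conversion last) can be sketched as below. This is only an illustration under stated assumptions: `buildFilters` is a hypothetical helper that does not appear in this diff, and the actual code may assemble `filters_` differently. It relies only on FFmpeg's convention that filters in a linear chain are separated by commas.

```cpp
#include <string>
#include <vector>

// Hypothetical helper, not part of this change: joins the user-supplied
// filter descriptions and appends the explicit format conversion last, so
// that every user filter runs before the conversion to rgb24.
std::string buildFilters(const std::vector<std::string>& userFilters) {
  std::string filters;
  for (const auto& filter : userFilters) {
    // FFmpeg separates filters in a linear chain with commas.
    filters += filter + ",";
  }
  // With no user filters, the result is just the format conversion,
  // matching the default value of filters_ in the diff.
  filters += "format=rgb24";
  return filters;
}
```

With a single user filter such as `scale=640:480`, this sketch yields `scale=640:480,format=rgb24`, so the resize operates on frames in their original pixel format and the conversion happens at the end of the chain.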