Open
Labels
refactor: Improves code itself, but does not fix a bug or add new functionality.
Description
Currently, our DeviceInterface is implicitly a video device interface. Inside SingleStreamDecoder, we dispatch differently based on audio and video:
torchcodec/src/torchcodec/_core/SingleStreamDecoder.cpp, lines 1322 to 1327 in 6d72f11:
if (streamInfo.avMediaType == AVMEDIA_TYPE_AUDIO) {
  convertAudioAVFrameToFrameOutputOnCPU(avFrame, frameOutput);
} else {
  deviceInterface_->convertAVFrameToFrameOutput(
      avFrame, frameOutput, preAllocatedOutputTensor);
}
Ideally, we'd like to turn that branch into just:
deviceInterface_->convertAVFrameToFrameOutput(
avFrame, frameOutput, preAllocatedOutputTensor);
That is, we would handle audio and video the same way at the call site. To do that, we need to somehow move SingleStreamDecoder::convertAudioAVFrameToFrameOutputOnCPU into a device interface. Design questions:
1. Should we extend CpuDeviceInterface to handle audio, in which case we'd dispatch inside of it?
2. Or should we keep CpuDeviceInterface video-only, and create a new kind of device interface that is audio-only?
I think 1 might be the better option, but I'm not sure. I'm confident we'll only ever do audio decoding on the CPU.
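To make option 1 concrete, here is a minimal sketch of what dispatching inside the CPU device interface could look like. It is not the actual torchcodec code: MediaType, Frame, FrameOutput, convertAudio, convertVideo and decodeOne are hypothetical stand-ins (for AVMediaType, AVFrame, the real FrameOutput, and the existing conversion paths), and the preAllocatedOutputTensor argument is left out to keep the example free of FFmpeg and torch dependencies.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for FFmpeg's AVMediaType / AVFrame and
// torchcodec's FrameOutput, so the sketch compiles on its own.
enum class MediaType { Video, Audio };

struct Frame {
  MediaType mediaType;
};

struct FrameOutput {
  std::string producedBy;  // records which conversion path ran
};

class DeviceInterface {
 public:
  virtual ~DeviceInterface() = default;
  virtual void convertAVFrameToFrameOutput(
      const Frame& frame, FrameOutput& frameOutput) = 0;
};

// Option 1: the CPU interface handles both media types and does the
// audio/video dispatch internally.
class CpuDeviceInterface : public DeviceInterface {
 public:
  void convertAVFrameToFrameOutput(
      const Frame& frame, FrameOutput& frameOutput) override {
    if (frame.mediaType == MediaType::Audio) {
      convertAudio(frame, frameOutput);
    } else {
      convertVideo(frame, frameOutput);
    }
  }

 private:
  // Placeholder for the logic that currently lives in
  // SingleStreamDecoder::convertAudioAVFrameToFrameOutputOnCPU.
  void convertAudio(const Frame&, FrameOutput& frameOutput) {
    frameOutput.producedBy = "cpu-audio";
  }

  // Placeholder for the existing CPU video conversion.
  void convertVideo(const Frame&, FrameOutput& frameOutput) {
    frameOutput.producedBy = "cpu-video";
  }
};

// Hypothetical decoder call site: no media-type branch remains here.
FrameOutput decodeOne(DeviceInterface& deviceInterface, const Frame& frame) {
  FrameOutput frameOutput;
  deviceInterface.convertAVFrameToFrameOutput(frame, frameOutput);
  assert(!frameOutput.producedBy.empty());
  return frameOutput;
}
```

With option 2, the two private helpers would instead live in separate classes (one video-only, one audio-only), and the decoder would pick which interface to construct based on the stream's media type when the stream is added, rather than branching per frame.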