Generalize `DeviceInterface` to include audio decoding #926

Currently, our `DeviceInterface` is implicitly a video device interface. Inside `SingleStreamDecoder`, we dispatch differently depending on whether the stream is audio or video: https://github.com/meta-pytorch/torchcodec/blob/6d72f115c4fbbcb8ed62cac43f5f1cfe2bd5f692/src/torchcodec/_core/SingleStreamDecoder.cpp#L1322-L1327
Ideally, we'd like to reduce that dispatch to just:
```c++
deviceInterface_->convertAVFrameToFrameOutput(
        avFrame, frameOutput, preAllocatedOutputTensor);
```
That is, we'd handle audio and video the same way. In order to do that, we need to somehow move [`SingleStreamDecoder::convertAudioAVFrameToFrameOutputOnCPU`](https://github.com/meta-pytorch/torchcodec/blob/6d72f115c4fbbcb8ed62cac43f5f1cfe2bd5f692/src/torchcodec/_core/SingleStreamDecoder.cpp#L1331) into a device interface. Design questions:

1. Should we extend `CpuDeviceInterface` to handle audio, in which case we'd dispatch inside of it?
2. Or should we keep `CpuDeviceInterface` video-only, and create a new kind of device interface that is audio-only?

I think 1 might be the better option, but I'm not sure. I'm confident we'll only ever do audio decoding on the CPU.
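
To make the two options concrete, here is a rough, self-contained sketch of what each shape could look like. Everything in it is a hypothetical stand-in: `MediaType`, `Frame`, `FrameOutput`, `CombinedCpuInterface`, `VideoOnlyCpuInterface`, `AudioOnlyCpuInterface`, `makeInterfaceForStream`, and `decodeOne` are not existing torchcodec names, and the real interface takes an `AVFrame` and a pre-allocated output tensor rather than these toy types.

```cpp
#include <memory>
#include <string>

// Hypothetical stand-ins for the FFmpeg / torchcodec types.
enum class MediaType { Video, Audio };

struct Frame {
  MediaType mediaType;
};

struct FrameOutput {
  std::string producedBy;  // records which conversion path ran
};

// Simplified version of the interface the decoder would call into.
class DeviceInterface {
 public:
  virtual ~DeviceInterface() = default;
  virtual void convertAVFrameToFrameOutput(
      const Frame& frame, FrameOutput& frameOutput) = 0;
};

// Option 1: a single CPU interface that dispatches on media type internally.
class CombinedCpuInterface : public DeviceInterface {
 public:
  void convertAVFrameToFrameOutput(
      const Frame& frame, FrameOutput& frameOutput) override {
    if (frame.mediaType == MediaType::Audio) {
      frameOutput.producedBy = "audio path";
    } else {
      frameOutput.producedBy = "video path";
    }
  }
};

// Option 2: separate video-only and audio-only interfaces.
class VideoOnlyCpuInterface : public DeviceInterface {
 public:
  void convertAVFrameToFrameOutput(
      const Frame&, FrameOutput& frameOutput) override {
    frameOutput.producedBy = "video path";
  }
};

class AudioOnlyCpuInterface : public DeviceInterface {
 public:
  void convertAVFrameToFrameOutput(
      const Frame&, FrameOutput& frameOutput) override {
    frameOutput.producedBy = "audio path";
  }
};

// With option 2, the media-type decision happens once, at stream setup.
std::unique_ptr<DeviceInterface> makeInterfaceForStream(MediaType mediaType) {
  if (mediaType == MediaType::Audio) {
    return std::make_unique<AudioOnlyCpuInterface>();
  }
  return std::make_unique<VideoOnlyCpuInterface>();
}

// In both options the decoder's call site has no audio/video branch.
std::string decodeOne(DeviceInterface& deviceInterface, const Frame& frame) {
  FrameOutput frameOutput;
  deviceInterface.convertAVFrameToFrameOutput(frame, frameOutput);
  return frameOutput.producedBy;
}
```

Either way the call site collapses to a single call. The difference is where the branch lives: on every frame inside the interface (option 1), or once when the interface is created for the stream (option 2).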

For reference, the current dispatch in `SingleStreamDecoder` is:

    if (streamInfo.avMediaType == AVMEDIA_TYPE_AUDIO) {
      convertAudioAVFrameToFrameOutputOnCPU(avFrame, frameOutput);
    } else {
      deviceInterface_->convertAVFrameToFrameOutput(
          avFrame, frameOutput, preAllocatedOutputTensor);
    }
