Merged
Changes from 3 commits
Commits
30 commits
2e25279
Add torchcodec mock with wav loading and saving
samanklesaria Jul 18, 2025
fe375f4
Merge branch 'main' into test_wav_hack
NicolasHug Jul 28, 2025
a300221
Let load and save rely on *_with_torchcodec
NicolasHug Jul 16, 2025
07e3b77
install torchcodec in doc job
NicolasHug Jul 16, 2025
92719d3
Add docstring and arguments for load and save
samanklesaria Aug 12, 2025
4a98ee5
Revise docstring
samanklesaria Aug 13, 2025
7b02754
Add typing imports
samanklesaria Aug 13, 2025
74edc0a
Try ffmpeg>4
samanklesaria Aug 13, 2025
80f5eb7
Install conda deps before pip deps
samanklesaria Aug 13, 2025
7f063a6
Add scipy hack for load and save
samanklesaria Aug 13, 2025
700c6c9
Only import scipy during testing
samanklesaria Aug 13, 2025
6995b21
Revert "Install conda deps before pip deps"
samanklesaria Aug 13, 2025
4ab5993
Revert "Try ffmpeg>4"
samanklesaria Aug 13, 2025
43c4602
Revert torchcodec installation changes
samanklesaria Aug 13, 2025
f74f004
Use existing wav_utils
samanklesaria Aug 13, 2025
953fc65
Support frame_offset and num_frames in load hack
samanklesaria Aug 13, 2025
dd3ff90
Use rand instead of randn for test_save_channels_first
samanklesaria Aug 14, 2025
72539b9
Merge branch 'test_wav_hack' into torchcodec_loading
samanklesaria Aug 14, 2025
c94e011
Remove pytest-aware code in src
samanklesaria Aug 14, 2025
b622d82
Remove torchcodec version check
samanklesaria Aug 14, 2025
93351a2
Fix bugs in torchcodec mock
samanklesaria Aug 14, 2025
5407163
Skip test_load_save_torchcodec
samanklesaria Aug 14, 2025
bd7eb52
Correct call to pytest skip
samanklesaria Aug 14, 2025
c3d0cc2
Remove torchcodec installation
samanklesaria Aug 14, 2025
d10fc19
Add torchcodec to build installation
samanklesaria Aug 15, 2025
92fee51
Remove redundant wav_utils
samanklesaria Aug 15, 2025
cc37073
Merge branch 'main' of github.com:pytorch/audio into torchcodec_loading
NicolasHug Aug 18, 2025
2646e59
remove sys
NicolasHug Aug 18, 2025
6c43c04
Add comments
NicolasHug Aug 18, 2025
498ce49
clarify comment
NicolasHug Aug 18, 2025
2 changes: 1 addition & 1 deletion .github/workflows/build_docs.yml
@@ -68,7 +68,7 @@ jobs:

GPU_ARCH_ID=cu126 # This is hard-coded and must be consistent with gpu-arch-version.
PYTORCH_WHEEL_INDEX="https://download.pytorch.org/whl/${CHANNEL}/${GPU_ARCH_ID}"
- pip install --progress-bar=off --pre torch --index-url="${PYTORCH_WHEEL_INDEX}"
+ pip install --progress-bar=off --pre torch torchcodec --index-url="${PYTORCH_WHEEL_INDEX}"

echo "::endgroup::"
echo "::group::Install TorchAudio"
178 changes: 176 additions & 2 deletions src/torchaudio/__init__.py
@@ -7,8 +7,6 @@
    get_audio_backend as _get_audio_backend,
    info as _info,
    list_audio_backends as _list_audio_backends,
-   load,
-   save,
    set_audio_backend as _set_audio_backend,
)
from ._torchcodec import load_with_torchcodec, save_with_torchcodec
@@ -41,6 +39,182 @@
pass


def load(
    uri: Union[BinaryIO, str, os.PathLike],
    frame_offset: int = 0,
    num_frames: int = -1,
    normalize: bool = True,
    channels_first: bool = True,
    format: Optional[str] = None,
    buffer_size: int = 4096,
    backend: Optional[str] = None,
) -> Tuple[torch.Tensor, int]:
    """Load audio data from source using TorchCodec's AudioDecoder.

    .. note::

        This function supports the same API as :func:`~torchaudio.load`, and
        relies on TorchCodec's decoding capabilities under the hood. It is
        provided for convenience, but we do recommend that you port your code to
        natively use ``torchcodec``'s ``AudioDecoder`` class for better
        performance:
        https://docs.pytorch.org/torchcodec/stable/generated/torchcodec.decoders.AudioDecoder.
        In TorchAudio 2.9, :func:`~torchaudio.load` will be relying on
        :func:`~torchaudio.load_with_torchcodec`. Note that some parameters of
        :func:`~torchaudio.load`, like ``normalize``, ``buffer_size``, and
        ``backend``, are ignored by :func:`~torchaudio.load_with_torchcodec`.


    Args:
        uri (path-like object or file-like object):
            Source of audio data. The following types are accepted:

            * ``path-like``: File path or URL.
            * ``file-like``: Object with ``read(size: int) -> bytes`` method.

        frame_offset (int, optional):
            Number of samples to skip before reading begins.
        num_frames (int, optional):
            Maximum number of samples to read. ``-1`` reads all the remaining samples,
            starting from ``frame_offset``.
        normalize (bool, optional):
            TorchCodec always returns normalized float32 samples. This parameter
            is ignored and a warning is issued if set to False.
            Default: ``True``.
        channels_first (bool, optional):
            When True, the returned Tensor has dimension `[channel, time]`.
            Otherwise, the returned Tensor's dimension is `[time, channel]`.
        format (str or None, optional):
            Format hint for the decoder. May not be supported by all TorchCodec
            decoders. (Default: ``None``)
        buffer_size (int, optional):
            Not used by TorchCodec AudioDecoder. Provided for API compatibility.
        backend (str or None, optional):
            Not used by TorchCodec AudioDecoder. Provided for API compatibility.

    Returns:
        (torch.Tensor, int): Resulting Tensor and sample rate.
        Always returns float32 tensors. If ``channels_first=True``, shape is
        `[channel, time]`, otherwise `[time, channel]`.

    Raises:
        ImportError: If torchcodec is not available.
        ValueError: If unsupported parameters are used.
        RuntimeError: If TorchCodec fails to decode the audio.

    Note:
        - TorchCodec always returns normalized float32 samples, so the ``normalize``
          parameter has no effect.
        - The ``buffer_size`` and ``backend`` parameters are ignored.
        - Not all audio formats supported by torchaudio backends may be supported
          by TorchCodec.
    """
    return load_with_torchcodec(
        uri,
        frame_offset=frame_offset,
        num_frames=num_frames,
        normalize=normalize,
        channels_first=channels_first,
        format=format,
        buffer_size=buffer_size,
        backend=backend,
    )
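
Review note: the docstring above describes how `frame_offset`, `num_frames`, `channels_first` and the ignored `normalize` flag interact. Here is a minimal sketch of those semantics as I read them, using plain Python lists instead of tensors so it runs without torch or torchcodec. `select_frames`, `to_channels_first` and `load_sketch` are hypothetical names for illustration only, not part of torchaudio or TorchCodec, and the exact warning text is an assumption.

```python
import warnings
from typing import List, Tuple

Frames = List[List[float]]  # hypothetical in-memory layout: [time][channel]


def select_frames(frames: Frames, frame_offset: int = 0, num_frames: int = -1) -> Frames:
    """Skip ``frame_offset`` frames, then keep ``num_frames`` (-1 keeps the rest)."""
    end = None if num_frames == -1 else frame_offset + num_frames
    return frames[frame_offset:end]


def to_channels_first(frames: Frames) -> Frames:
    """Transpose [time][channel] into [channel][time]."""
    return [list(channel) for channel in zip(*frames)]


def load_sketch(
    frames: Frames,
    sample_rate: int,
    frame_offset: int = 0,
    num_frames: int = -1,
    normalize: bool = True,
    channels_first: bool = True,
    buffer_size: int = 4096,  # accepted for API compatibility, unused
    backend: str = None,  # accepted for API compatibility, unused
) -> Tuple[Frames, int]:
    """Mimic the documented parameter handling on already-decoded samples."""
    if not normalize:
        # Assumption: the docstring says a warning is issued; the wording is made up.
        warnings.warn("normalize=False is ignored; samples stay normalized floats")
    selected = select_frames(frames, frame_offset, num_frames)
    data = to_channels_first(selected) if channels_first else selected
    return data, sample_rate


decoded = [[0.0, 1.0], [0.1, 1.1], [0.2, 1.2], [0.3, 1.3]]  # 4 frames, 2 channels
clip, rate = load_sketch(decoded, 8000, frame_offset=1, num_frames=2)
print(clip, rate)
```

With `channels_first=False` the selected frames come back in their `[time, channel]` layout, untransposed.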

def save(
    uri: Union[str, os.PathLike],
    src: torch.Tensor,
    sample_rate: int,
    channels_first: bool = True,
    format: Optional[str] = None,
    encoding: Optional[str] = None,
    bits_per_sample: Optional[int] = None,
    buffer_size: int = 4096,
    backend: Optional[str] = None,
    compression: Optional[Union[float, int]] = None,
) -> None:
    """Save audio data to file using TorchCodec's AudioEncoder.

    .. note::

        This function supports the same API as :func:`~torchaudio.save`, and
        relies on TorchCodec's encoding capabilities under the hood. It is
        provided for convenience, but we do recommend that you port your code to
        natively use ``torchcodec``'s ``AudioEncoder`` class for better
        performance:
        https://docs.pytorch.org/torchcodec/stable/generated/torchcodec.encoders.AudioEncoder.
        In TorchAudio 2.9, :func:`~torchaudio.save` will be relying on
        :func:`~torchaudio.save_with_torchcodec`. Note that some parameters of
        :func:`~torchaudio.save`, like ``format``, ``encoding``,
        ``bits_per_sample``, ``buffer_size``, and ``backend``, are ignored by
        :func:`~torchaudio.save_with_torchcodec`.

    This function provides a TorchCodec-based alternative to torchaudio.save
    with the same API. TorchCodec's AudioEncoder provides efficient encoding
    with FFmpeg under the hood.

    Args:
        uri (path-like object):
            Path to save the audio file. The file extension determines the format.

        src (torch.Tensor):
            Audio data to save. Must be a 1D or 2D tensor with float32 values
            in the range [-1, 1]. If 2D, shape should be [channel, time] when
            channels_first=True, or [time, channel] when channels_first=False.

        sample_rate (int):
            Sample rate of the audio data.

        channels_first (bool, optional):
            Indicates whether the input tensor has channels as the first dimension.
            If True, expects [channel, time]. If False, expects [time, channel].
            Default: True.

        format (str or None, optional):
            Audio format hint. Not used by TorchCodec (format is determined by
            file extension). A warning is issued if provided.
            Default: None.

        encoding (str or None, optional):
            Audio encoding. Not fully supported by TorchCodec AudioEncoder.
            A warning is issued if provided. Default: None.

        bits_per_sample (int or None, optional):
            Bits per sample. Not directly supported by TorchCodec AudioEncoder.
            A warning is issued if provided. Default: None.

        buffer_size (int, optional):
            Not used by TorchCodec AudioEncoder. Provided for API compatibility.
            A warning is issued if not default value. Default: 4096.

        backend (str or None, optional):
            Not used by TorchCodec AudioEncoder. Provided for API compatibility.
            A warning is issued if provided. Default: None.

        compression (float, int or None, optional):
            Compression level or bit rate. Maps to bit_rate parameter in
            TorchCodec AudioEncoder. Default: None.

    Raises:
        ImportError: If torchcodec is not available.
        ValueError: If input parameters are invalid.
        RuntimeError: If TorchCodec fails to encode the audio.

    Note:
        - TorchCodec AudioEncoder expects float32 samples in [-1, 1] range.
        - Some parameters (format, encoding, bits_per_sample, buffer_size, backend)
          are not used by TorchCodec but are provided for API compatibility.
        - The output format is determined by the file extension in the uri.
        - TorchCodec uses FFmpeg under the hood for encoding.
    """
    return save_with_torchcodec(
        uri,
        src,
        sample_rate,
        channels_first=channels_first,
        format=format,
        encoding=encoding,
        bits_per_sample=bits_per_sample,
        buffer_size=buffer_size,
        backend=backend,
        compression=compression,
    )
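
Review note: going only by the docstring above, `save` takes the container format from the file extension and warns when a legacy argument that TorchCodec does not use is supplied. A rough stdlib-only sketch of that bookkeeping follows. `ignored_save_args` and `container_from_uri` are hypothetical helpers written for this comment, not functions in torchaudio or TorchCodec, and raising `ValueError` for a missing extension is my assumption rather than documented behavior.

```python
import os
from typing import List, Optional, Union


def ignored_save_args(
    format: Optional[str] = None,
    encoding: Optional[str] = None,
    bits_per_sample: Optional[int] = None,
    buffer_size: int = 4096,
    backend: Optional[str] = None,
) -> List[str]:
    """Return the names of arguments that, per the docstring, would trigger a warning."""
    ignored = []
    if format is not None:
        ignored.append("format")
    if encoding is not None:
        ignored.append("encoding")
    if bits_per_sample is not None:
        ignored.append("bits_per_sample")
    if buffer_size != 4096:  # only a non-default value is flagged
        ignored.append("buffer_size")
    if backend is not None:
        ignored.append("backend")
    return ignored


def container_from_uri(uri: Union[str, os.PathLike]) -> str:
    """Derive the output format from the file extension of ``uri``."""
    suffix = os.path.splitext(os.fspath(uri))[1]
    if not suffix:
        # Assumption: the real function may fail differently on extension-less paths.
        raise ValueError(f"cannot infer audio format from {uri!r}")
    return suffix[1:].lower()


print(container_from_uri("out/clip.wav"))
print(ignored_save_args(format="wav", buffer_size=1024))
```

Callers migrating from the old backends could run a check like this over their existing `save` call sites to see which arguments no longer have any effect.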

__all__ = [
"AudioMetaData",
"load",