Merged
5 changes: 1 addition & 4 deletions README.md
@@ -27,9 +27,6 @@ processing library. The benefits of PyTorch can be seen in torchaudio through
having all the computations be through PyTorch operations which makes it easy
to use and feel like a natural extension.

- [Support audio I/O (Load files, Save files)](http://pytorch.org/audio/main/)
- Load a variety of audio formats, such as `wav`, `mp3`, `ogg`, `flac`, `opus`, `sphere`, into a torch Tensor using SoX
- [Kaldi (ark/scp)](http://pytorch.org/audio/main/kaldi_io.html)
- [Dataloaders for common audio datasets](http://pytorch.org/audio/main/datasets.html)
- Audio and speech processing functions
- [forced_align](https://pytorch.org/audio/main/generated/torchaudio.functional.forced_align.html)
@@ -70,7 +67,7 @@ If you find this package useful, please cite as:

```bibtex
@misc{hwang2023torchaudio,
title={TorchAudio 2.1: Advancing speech recognition, self-supervised learning, and audio processing components for PyTorch},
title={TorchAudio 2.1: Advancing speech recognition, self-supervised learning, and audio processing components for PyTorch},
author={Jeff Hwang and Moto Hira and Caroline Chen and Xiaohui Zhang and Zhaoheng Ni and Guangzhi Sun and Pingchuan Ma and Ruizhe Huang and Vineel Pratap and Yuekai Zhang and Anurag Kumar and Chin-Yun Yu and Chuang Zhu and Chunxi Liu and Jacob Kahn and Mirco Ravanelli and Peng Sun and Shinji Watanabe and Yangyang Shi and Yumeng Tao and Robin Scheibler and Samuele Cornell and Sean Kim and Stavros Petridis},
year={2023},
eprint={2310.17864},
20 changes: 0 additions & 20 deletions docs/source/compliance.kaldi.rst

This file was deleted.

7 changes: 3 additions & 4 deletions docs/source/index.rst
@@ -4,16 +4,16 @@ Torchaudio Documentation
.. image:: _static/img/logo.png

Torchaudio is a library for audio and signal processing with PyTorch.
It provides I/O, signal and data processing functions, datasets,
It provides signal and data processing functions, datasets,
model implementations and application components.

.. note::
Starting with version 2.8, we are refactoring TorchAudio to transition it
into a maintenance phase. As a result:

- Some APIs are deprecated in 2.8 and will be removed in 2.9.
- Some APIs were deprecated in 2.8 and removed as of 2.9.
- The decoding and encoding capabilities of PyTorch for both audio and video
are being consolidated into TorchCodec.
have been consolidated into TorchCodec.

Please see https://github.com/pytorch/audio/issues/3902 for more information.

@@ -67,7 +67,6 @@ model implementations and application components.
models
models.decoder
pipelines
utils

.. toctree::
:maxdepth: 1
81 changes: 3 additions & 78 deletions docs/source/torchaudio.rst
@@ -4,12 +4,11 @@ torchaudio
.. currentmodule:: torchaudio

.. warning::
Starting with version 2.8, we are refactoring TorchAudio to transition it
into a maintenance phase. As a result:
Starting with version 2.9, we have transitioned TorchAudio into a maintenance phase. As a result:

- Most APIs listed below are deprecated in 2.8 and will be removed in 2.9.
- APIs deprecated in version 2.8 have been removed in 2.9.
- The decoding and encoding capabilities of PyTorch for both audio and video
are being consolidated into TorchCodec. For convenience, we provide
have been consolidated into TorchCodec. For convenience, we provide
:func:`~torchaudio.load_with_torchcodec` as a replacement for
:func:`~torchaudio.load` and :func:`~torchaudio.save_with_torchcodec` as a
replacement for :func:`~torchaudio.save`, but we recommend that you port
@@ -28,81 +27,7 @@ it easy to handle audio data.
:nosignatures:
:template: autosummary/io.rst

info
load
load_with_torchcodec
save
save_with_torchcodec
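
The replacement named in the warning above can be sketched as follows. This is a minimal illustration, not the project's recommended migration path: it assumes a torchaudio build that still ships ``load_with_torchcodec`` with TorchCodec installed, and ``load_waveform`` is a hypothetical wrapper name that does not appear in the original documentation. The call is assumed to return a ``(waveform, sample_rate)`` pair in the same way as ``torchaudio.load``.

```python
def load_waveform(path):
    """Hypothetical wrapper around the TorchCodec-backed loader."""
    # Imported lazily so that defining this helper does not itself
    # require torchaudio (or TorchCodec) to be installed.
    import torchaudio

    # Assumption: returns (waveform, sample_rate), like torchaudio.load.
    waveform, sample_rate = torchaudio.load_with_torchcodec(path)
    return waveform, sample_rate
```

Keeping the call behind a single wrapper means that a later port to TorchCodec's own decoding API only has to change one function.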

.. _backend:

Backend and Dispatcher
----------------------

Decoding and encoding media is a highly elaborate process. Therefore, TorchAudio
relies on third-party libraries to perform these operations. Each of these third-party
libraries is called a ``backend``, and TorchAudio currently integrates the
following libraries.

Please refer to `Installation <./installation.html>`__ for how to enable backends.

Conventionally, TorchAudio has had its I/O backend set globally at runtime
based on availability. However, this approach does not allow applications to
use different backends, and it is not well-suited for large codebases.

For these reasons, in v2.0, we introduced a dispatcher, a new mechanism to allow
users to choose a backend for each function call.

When dispatcher mode is enabled, all the I/O functions accept an extra keyword argument,
``backend``, which specifies the desired backend. If the specified
backend is not available, the function call will fail.

If a backend is not explicitly chosen, the functions select one according to the order of precedence and library availability.
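
The selection rule described above can be sketched in plain Python. This is an illustration under stated assumptions, not TorchAudio's actual dispatcher: ``BACKEND_PRIORITY``, ``select_backend``, and the lowercase backend names are hypothetical, and only the two backends listed in this section are modeled.

```python
# Hypothetical precedence order; highest priority first.
BACKEND_PRIORITY = ["ffmpeg", "soundfile"]


def select_backend(available, requested=None):
    """Pick a backend name from the set of available backends."""
    if requested is not None:
        # An explicitly requested backend must be available, otherwise fail.
        if requested not in available:
            raise RuntimeError(f"Backend {requested!r} is not available")
        return requested
    # No explicit choice: take the first available backend by precedence.
    for name in BACKEND_PRIORITY:
        if name in available:
            return name
    raise RuntimeError("No audio backend is available")
```

For example, ``select_backend({"soundfile"})`` falls through to ``"soundfile"`` because the higher-priority entry is not available, while passing ``requested="ffmpeg"`` with the same set raises an error instead of silently falling back.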

The following table summarizes the backends.

.. list-table::
:header-rows: 1
:widths: 8 12 25 60

* - Priority
- Backend
- Supported OS
- Note
* - 1
- FFmpeg
- Linux, macOS, Windows
- Note

This backend supports various protocols and container formats, such as HTTPS and MP4, as well as file-like objects.
* - 3
- SoundFile
- Linux, macOS, Windows
- Please refer to `the official document <https://pysoundfile.readthedocs.io/>`__ for the supported codecs.

This backend supports file-like objects.

.. _dispatcher_migration:

Dispatcher Migration
~~~~~~~~~~~~~~~~~~~~

We are migrating the I/O functions to use the dispatcher mechanism. This
involves multiple changes, some of which break backward compatibility
and require users to change their function calls.

The (planned) changes are as follows. For up-to-date information,
please refer to https://github.com/pytorch/audio/issues/2950

* In 2.0, the audio I/O backend dispatcher was introduced.
Users can opt in to using the dispatcher by setting the environment variable
``TORCHAUDIO_USE_BACKEND_DISPATCHER=1``.
* In 2.1, the dispatcher became the default mechanism for I/O.
* In 2.2, the legacy global backend mechanism was removed.
Utility functions :py:func:`get_audio_backend` and :py:func:`set_audio_backend`
became no-ops.

Furthermore, we removed file-like object support from the libsox backend, as this
is better supported by the FFmpeg backend and makes the build process simpler.
Therefore, beginning with 2.1, FFmpeg and SoundFile are the sole backends that support
file-like objects.
23 changes: 0 additions & 23 deletions docs/source/utils.rst

This file was deleted.

2 changes: 0 additions & 2 deletions packaging/torchaudio/meta.yaml
@@ -50,9 +50,7 @@ build:
test:
imports:
- torchaudio
- torchaudio.io
- torchaudio.datasets
- torchaudio.sox_effects
- torchaudio.transforms

source_files:
9 changes: 4 additions & 5 deletions test/torchaudio_unittest/README.md
@@ -15,10 +15,8 @@ Some useful pytest commands:
pytest test --collect-only
# Run all the test suites
pytest test
# Run tests on sox_effects module
pytest test/torchaudio_unittest/sox_effect
# use -k to apply filter
pytest test/torchaudio_unittest/sox_io_backend -k load # only runs tests where their names contain load
pytest test/torchaudio_unittest/test_load_save_torchcodec.py -k load # only runs tests where their names contain load
# Some other useful options;
# Stop on the first failure -x
# Run failure fast --ff
@@ -61,8 +59,6 @@ The following test modules are defined for corresponding `torchaudio` module/fun
- [`torchaudio.functional`](./functional)
- [`torchaudio.transforms`](./transforms/transforms_test.py)
- [`torchaudio.compliance.kaldi`](./compliance_kaldi_test.py)
- [`torchaudio.kaldi_io`](./kaldi_io_test.py)
- [`torchaudio.sox_effects`](./sox_effect)
- [`torchaudio.backend`](./backend)

### Test modules that do not fall into the above categories
@@ -73,6 +69,9 @@ The following test modules are defined for corresponding `torchaudio` module/fun
- [assets](./assets): Contains sample audio files.
- [assets/kaldi](./assets/kaldi): Contains Kaldi format matrix files used in [./test_compliance_kaldi.py](./test_compliance_kaldi.py).
- [compliance](./compliance): Scripts used to generate above Kaldi matrix files.
- [assets/kaldi_expected_results](./assets/kaldi_expected_results): Contains outputs from Kaldi to compare against torchaudio functionality in [./compliance/kaldi](./compliance/kaldi).
- [assets/librosa_expected_results](./assets/librosa_expected_results): Contains outputs from Librosa to compare against torchaudio functionality.
- [assets/sox_expected_results](./assets/sox_expected_results): Contains outputs from Sox to compare against torchaudio functionality.

### Waveforms for Testing Purposes

88 changes: 0 additions & 88 deletions test/torchaudio_unittest/assets/sox_effect_test_args.jsonl

This file was deleted.