Commit 1c66402

Fix issues in documentation build. (#4070)
* Fix documentation build.
* Add comments about sphinx mocking conflict with detecting type instances.
1 parent 1eba300 commit 1c66402

3 files changed, +254 -1 lines changed

docs/source/index.rst

Lines changed: 240 additions & 0 deletions
@@ -0,0 +1,240 @@
+Torchaudio Documentation
+========================
+
+.. image:: _static/img/logo.png
+
+Torchaudio is a library for audio and signal processing with PyTorch.
+It provides I/O, signal and data processing functions, datasets,
+model implementations and application components.
+
+.. note::
+   Starting with version 2.8, we are refactoring TorchAudio to transition it
+   into a maintenance phase. As a result:
+
+   - Some APIs are deprecated in 2.8 and will be removed in 2.9.
+   - The decoding and encoding capabilities of PyTorch for both audio and video
+     are being consolidated into TorchCodec.
+
+   Please see https://github.com/pytorch/audio/issues/3902 for more information.
+
+
+..
+   Generate Table Of Contents (left navigation bar)
+   NOTE: If you are adding tutorials, add entries to toctree and customcarditem below
+
+.. toctree::
+   :maxdepth: 1
+   :caption: Torchaudio Documentation
+   :hidden:
+
+   Index <self>
+   supported_features
+   feature_classifications
+   logo
+   references
+
+.. toctree::
+   :maxdepth: 2
+   :caption: Installation
+   :hidden:
+
+   installation
+   build
+   build.linux
+   build.windows
+   build.jetson
+
+.. toctree::
+   :maxdepth: 1
+   :caption: Training Recipes
+   :hidden:
+
+   Conformer RNN-T ASR <https://github.com/pytorch/audio/tree/main/examples/asr/librispeech_conformer_rnnt>
+   Emformer RNN-T ASR <https://github.com/pytorch/audio/tree/main/examples/asr/emformer_rnnt>
+   Conv-TasNet Source Separation <https://github.com/pytorch/audio/tree/main/examples/source_separation>
+   HuBERT Pre-training and Fine-tuning (ASR) <https://github.com/pytorch/audio/tree/main/examples/hubert>
+   Real-time AV-ASR <https://github.com/pytorch/audio/tree/main/examples/avsr>
+
+.. toctree::
+   :maxdepth: 1
+   :caption: Python API Reference
+   :hidden:
+
+   torchaudio
+   functional
+   transforms
+   datasets
+   models
+   models.decoder
+   pipelines
+   utils
+
+.. toctree::
+   :maxdepth: 1
+   :caption: PyTorch Libraries
+   :hidden:
+
+   PyTorch <https://pytorch.org/docs>
+   torchaudio <https://pytorch.org/audio>
+   torchtext <https://pytorch.org/text>
+   torchvision <https://pytorch.org/vision>
+   TorchElastic <https://pytorch.org/elastic/>
+   TorchServe <https://pytorch.org/serve>
+   PyTorch on XLA Devices <http://pytorch.org/xla/>
+
+Tutorials
+---------
+
+.. customcardstart::
+
+.. customcarditem::
+   :header: AM inference with CUDA CTC Beam Search Decoder
+   :card_description: Learn how to perform ASR beam search decoding with GPU, using <code>torchaudio.models.decoder.cuda_ctc_decoder</code>.
+   :image: https://download.pytorch.org/torchaudio/tutorial-assets/thumbnails/asr_inference_with_ctc_decoder_tutorial.png
+   :link: tutorials/asr_inference_with_cuda_ctc_decoder_tutorial.html
+   :tags: Pipelines,ASR,CTC-Decoder,CUDA-CTC-Decoder
+
+.. customcarditem::
+   :header: CTC Forced Alignment API
+   :card_description: Learn how to use TorchAudio's CTC forced alignment API (<code>torchaudio.functional.forced_align</code>).
+   :image: https://download.pytorch.org/torchaudio/tutorial-assets/thumbnails/ctc_forced_alignment_api_tutorial.png
+   :link: tutorials/ctc_forced_alignment_api_tutorial.html
+   :tags: CTC,Forced-Alignment
+
+.. customcarditem::
+   :header: Forced alignment for multilingual data
+   :card_description: Learn how to align multilingual data using TorchAudio's CTC forced alignment API (<code>torchaudio.functional.forced_align</code>) and a multilingual Wav2Vec2 model.
+   :image: https://download.pytorch.org/torchaudio/tutorial-assets/thumbnails/forced_alignment_for_multilingual_data_tutorial.png
+   :link: tutorials/forced_alignment_for_multilingual_data_tutorial.html
+   :tags: Forced-Alignment
+
+.. customcarditem::
+   :header: Audio resampling with bandlimited sinc interpolation
+   :card_description: Learn how to resample an audio tensor with <code>torchaudio.functional.resample</code> and <code>torchaudio.transforms.Resample</code>.
+   :image: https://download.pytorch.org/torchaudio/tutorial-assets/thumbnails/audio_resampling_tutorial.png
+   :link: tutorials/audio_resampling_tutorial.html
+   :tags: Preprocessing
+
+.. customcarditem::
+   :header: Audio data augmentation
+   :card_description: Learn how to use <code>torchaudio.functional</code> and <code>torchaudio.transforms</code> modules to perform data augmentation.
+   :image: https://download.pytorch.org/torchaudio/tutorial-assets/thumbnails/audio_data_augmentation_tutorial.png
+   :link: tutorials/audio_data_augmentation_tutorial.html
+   :tags: Preprocessing
+
+.. customcarditem::
+   :header: Audio feature extraction
+   :card_description: Learn how to use <code>torchaudio.functional</code> and <code>torchaudio.transforms</code> modules to extract features from a waveform.
+   :image: https://download.pytorch.org/torchaudio/tutorial-assets/thumbnails/audio_feature_extractions_tutorial.png
+   :link: tutorials/audio_feature_extractions_tutorial.html
+   :tags: Preprocessing
+
+.. customcarditem::
+   :header: Audio feature augmentation
+   :card_description: Learn how to use <code>torchaudio.functional</code> and <code>torchaudio.transforms</code> modules to perform feature augmentation.
+   :image: https://download.pytorch.org/torchaudio/tutorial-assets/thumbnails/audio_feature_augmentation_tutorial.png
+   :link: tutorials/audio_feature_augmentation_tutorial.html
+   :tags: Preprocessing
+
+.. customcarditem::
+   :header: Audio dataset
+   :card_description: Learn how to use the <code>torchaudio.datasets</code> module.
+   :image: https://download.pytorch.org/torchaudio/tutorial-assets/thumbnails/audio_datasets_tutorial.png
+   :link: tutorials/audio_datasets_tutorial.html
+   :tags: Dataset
+
+.. customcarditem::
+   :header: AM inference with Wav2Vec2
+   :card_description: Learn how to perform acoustic model inference with Wav2Vec2 (<code>torchaudio.pipelines.Wav2Vec2ASRBundle</code>).
+   :image: https://download.pytorch.org/torchaudio/tutorial-assets/thumbnails/speech_recognition_pipeline_tutorial.png
+   :link: tutorials/speech_recognition_pipeline_tutorial.html
+   :tags: ASR,wav2vec2
+
+.. customcarditem::
+   :header: LM inference with CTC Beam Search Decoder
+   :card_description: Learn how to perform ASR beam search decoding with lexicon and language model, using <code>torchaudio.models.decoder.ctc_decoder</code>.
+   :image: https://download.pytorch.org/torchaudio/tutorial-assets/thumbnails/asr_inference_with_ctc_decoder_tutorial.png
+   :link: tutorials/asr_inference_with_ctc_decoder_tutorial.html
+   :tags: Pipelines,ASR,wav2vec2,CTC-Decoder
+
+.. customcarditem::
+   :header: Forced Alignment with Wav2Vec2
+   :card_description: Learn how to align text to speech with Wav2Vec 2 (<code>torchaudio.pipelines.Wav2Vec2ASRBundle</code>).
+   :image: https://download.pytorch.org/torchaudio/tutorial-assets/thumbnails/forced_alignment_tutorial.png
+   :link: tutorials/forced_alignment_tutorial.html
+   :tags: Pipelines,Forced-Alignment,wav2vec2
+
+.. customcarditem::
+   :header: Text-to-Speech with Tacotron2
+   :card_description: Learn how to generate speech from text with Tacotron2 (<code>torchaudio.pipelines.Tacotron2TTSBundle</code>).
+   :image: https://download.pytorch.org/torchaudio/tutorial-assets/thumbnails/tacotron2_pipeline_tutorial.png
+   :link: tutorials/tacotron2_pipeline_tutorial.html
+   :tags: Pipelines,TTS-(Text-to-Speech)
+
+.. customcarditem::
+   :header: Speech Enhancement with MVDR Beamforming
+   :card_description: Learn how to improve speech quality with MVDR Beamforming.
+   :image: https://download.pytorch.org/torchaudio/tutorial-assets/thumbnails/mvdr_tutorial.png
+   :link: tutorials/mvdr_tutorial.html
+   :tags: Pipelines,Speech-Enhancement
+
+.. customcarditem::
+   :header: Music Source Separation with Hybrid Demucs
+   :card_description: Learn how to perform music source separation with pre-trained Hybrid Demucs (<code>torchaudio.pipelines.SourceSeparationBundle</code>).
+   :image: https://download.pytorch.org/torchaudio/tutorial-assets/thumbnails/hybrid_demucs_tutorial.png
+   :link: tutorials/hybrid_demucs_tutorial.html
+   :tags: Pipelines,Source-Separation
+
+.. customcarditem::
+   :header: Torchaudio-Squim: Non-intrusive Speech Assessment in TorchAudio
+   :card_description: Learn how to estimate subjective and objective metrics with pre-trained TorchAudio-SQUIM models (<code>torchaudio.pipelines.SQUIMObjective</code>).
+   :image: https://download.pytorch.org/torchaudio/tutorial-assets/thumbnails/squim_tutorial.png
+   :link: tutorials/squim_tutorial.html
+   :tags: Pipelines,Speech Assessment,Speech Enhancement
+.. customcardend::
+
+
+Citing torchaudio
+-----------------
+
+If you find torchaudio useful, please cite the following paper:
+
+- Hwang, J., Hira, M., Chen, C., Zhang, X., Ni, Z., Sun, G., Ma, P., Huang, R., Pratap, V.,
+  Zhang, Y., Kumar, A., Yu, C.-Y., Zhu, C., Liu, C., Kahn, J., Ravanelli, M., Sun, P.,
+  Watanabe, S., Shi, Y., Tao, T., Scheibler, R., Cornell, S., Kim, S., & Petridis, S. (2023).
+  TorchAudio 2.1: Advancing speech recognition, self-supervised learning, and audio processing components for PyTorch. arXiv preprint arXiv:2310.17864.
+
+- Yang, Y.-Y., Hira, M., Ni, Z., Chourdia, A., Astafurov, A., Chen, C., Yeh, C.-F., Puhrsch, C.,
+  Pollack, D., Genzel, D., Greenberg, D., Yang, E. Z., Lian, J., Mahadeokar, J., Hwang, J.,
+  Chen, J., Goldsborough, P., Roy, P., Narenthiran, S., Watanabe, S., Chintala, S.,
+  Quenneville-Bélair, V., & Shi, Y. (2021).
+  TorchAudio: Building Blocks for Audio and Speech Processing. arXiv preprint arXiv:2110.15018.
+
+In BibTeX format:
+
+.. code-block:: bibtex
+
+   @misc{hwang2023torchaudio,
+      title={TorchAudio 2.1: Advancing speech recognition, self-supervised learning, and audio processing components for PyTorch},
+      author={Jeff Hwang and Moto Hira and Caroline Chen and Xiaohui Zhang and Zhaoheng Ni and Guangzhi Sun and Pingchuan Ma and Ruizhe Huang and Vineel Pratap and Yuekai Zhang and Anurag Kumar and Chin-Yun Yu and Chuang Zhu and Chunxi Liu and Jacob Kahn and Mirco Ravanelli and Peng Sun and Shinji Watanabe and Yangyang Shi and Yumeng Tao and Robin Scheibler and Samuele Cornell and Sean Kim and Stavros Petridis},
+      year={2023},
+      eprint={2310.17864},
+      archivePrefix={arXiv},
+      primaryClass={eess.AS}
+   }
+
+.. code-block:: bibtex
+
+   @article{yang2021torchaudio,
+      title={TorchAudio: Building Blocks for Audio and Speech Processing},
+      author={Yao-Yuan Yang and Moto Hira and Zhaoheng Ni and
+              Anjali Chourdia and Artyom Astafurov and Caroline Chen and
+              Ching-Feng Yeh and Christian Puhrsch and David Pollack and
+              Dmitriy Genzel and Donny Greenberg and Edward Z. Yang and
+              Jason Lian and Jay Mahadeokar and Jeff Hwang and Ji Chen and
+              Peter Goldsborough and Prabhat Roy and Sean Narenthiran and
+              Shinji Watanabe and Soumith Chintala and
+              Vincent Quenneville-Bélair and Yangyang Shi},
+      journal={arXiv preprint arXiv:2110.15018},
+      year={2021}
+   }

docs/source/torchaudio.rst

Lines changed: 1 addition & 0 deletions
@@ -72,6 +72,7 @@ The following table summarizes the backends.
    * - 1
      - FFmpeg
      - Linux, macOS, Windows
+     - Note
 
        This backend Supports various protocols, such as HTTPS and MP4, and file-like objects.
    * - 3

src/torchaudio/models/decoder/__init__.py

Lines changed: 13 additions & 1 deletion
@@ -35,11 +35,23 @@ def __getattr__(name: str):
                 "To use CUCTC decoder, please set BUILD_CUDA_CTC_DECODER=1 when building from source."
             ) from err
 
+        # TODO: when all unsupported classes are removed, replace the
+        # following if-else block with
+        #   item = getattr(_cuda_ctc_decoder, name)
         orig_item = getattr(_cuda_ctc_decoder, name)
-        if inspect.isclass(orig_item):
+        if (
+            inspect.isclass(orig_item)
+            or (
+                # workaround a failure to detect type instances
+                # after sphinx autodoc mocking, required for
+                # building docs
+                getattr(orig_item, "__sphinx_mock__", False) and inspect.isclass(orig_item.__class__)
+            )
+        ):
             item = dropping_class_support(orig_item)
         else:
             item = dropping_support(orig_item)
+
         globals()[name] = item
         return item
     raise AttributeError(f"module {__name__} has no attribute {name}")
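
The condition added in this hunk can be illustrated in isolation. The following is a minimal sketch, not the torchaudio implementation: `FakeSphinxMock`, `RealDecoder`, `real_function` and `looks_like_class` are hypothetical names introduced here, and the sketch assumes only that objects produced by Sphinx autodoc mocking are instances carrying a truthy `__sphinx_mock__` attribute, which is what the diff itself relies on.

```python
import inspect


class FakeSphinxMock:
    """Hypothetical stand-in for an object produced by Sphinx autodoc mocking."""

    # Assumption: mocked objects expose this marker, as the diff's check expects.
    __sphinx_mock__ = True


class RealDecoder:
    """Hypothetical ordinary class, as seen outside of a docs build."""


def real_function():
    """Hypothetical ordinary function."""


def looks_like_class(obj) -> bool:
    """Mirror the condition from the hunk: a real class, or a Sphinx mock."""
    return inspect.isclass(obj) or (
        getattr(obj, "__sphinx_mock__", False) and inspect.isclass(obj.__class__)
    )


# A mocked attribute is an instance, so the plain check does not see a class.
mocked_attr = FakeSphinxMock()
plain_check = inspect.isclass(mocked_attr)
patched_check = looks_like_class(mocked_attr)

print("inspect.isclass on mock:", plain_check)
print("patched condition on mock:", patched_check)
print("patched condition on real class:", looks_like_class(RealDecoder))
print("patched condition on real function:", looks_like_class(real_function))
```

Under these assumptions, the plain `inspect.isclass` check would send a mocked class down the `dropping_support` branch, while the extended condition routes it to `dropping_class_support`. Because `obj.__class__` is always a class, the condition accepts any object carrying the marker, so a mocked function would take the class branch too; that only matters during a docs build.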
