Commit c810d53

author
Beat Buesser
committed
Update docs
Signed-off-by: Beat Buesser <[email protected]>
1 parent fd03959 commit c810d53

3 files changed, 28 additions and 10 deletions

art/estimators/speech_recognition/pytorch_deep_speech.py

Lines changed: 10 additions & 10 deletions
@@ -277,11 +277,11 @@ def predict(
  :type transcription_output: `bool`
  :return: Probability (if transcription_output is None or False) or transcription (if transcription_output is
  True) predictions:
- - Probability return is a tuple of (probs, sizes), where:
- - probs is the probability of characters of shape (nb_samples, seq_length, nb_classes).
- - sizes is the real sequence length of shape (nb_samples,).
- - Transcription return is a numpy array of characters. A possible example of a transcription return
- is `np.array(['SIXTY ONE', 'HELLO'])`.
+ - Probability return is a tuple of (probs, sizes), where `probs` is the probability of characters of
+ shape (nb_samples, seq_length, nb_classes) and `sizes` is the real sequence length of shape
+ (nb_samples,).
+ - Transcription return is a numpy array of characters. A possible example of a transcription return
+ is `np.array(['SIXTY ONE', 'HELLO'])`.
  """
  import torch  # lgtm [py/repeated-import]
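The `(probs, sizes)` return described in this docstring can be illustrated with a small NumPy sketch. It does not call ART; the arrays are hypothetical stand-ins with the documented shapes, and the helper name `trim_to_real_length` is an assumption made for illustration only.

```python
import numpy as np

# Hypothetical stand-ins for a probability-mode return value (not produced by ART).
nb_samples, seq_length, nb_classes = 2, 5, 4
rng = np.random.default_rng(0)
probs = rng.random((nb_samples, seq_length, nb_classes))
probs /= probs.sum(axis=-1, keepdims=True)  # normalise each frame over the classes
sizes = np.array([5, 3])  # real (unpadded) sequence length of each sample


def trim_to_real_length(probs, sizes):
    """Return one (size_i, nb_classes) array per sample, dropping padded frames."""
    return [probs[i, : int(sizes[i])] for i in range(probs.shape[0])]


trimmed = trim_to_real_length(probs, sizes)

# Most likely class index per real frame; mapping indices to characters is
# model-specific and is left out of this sketch.
frame_labels = [p.argmax(axis=-1) for p in trimmed]
```

Slicing by `sizes` keeps padded frames out of any later decoding or scoring step.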

@@ -529,11 +529,11 @@ def transform_model_input(
  :param real_lengths: Real lengths of original sequences.
  :return: A tuple of inputs and targets in the model space with the original index
  `(inputs, targets, input_percentages, target_sizes, batch_idx)`, where:
- - inputs: model inputs of shape (nb_samples, nb_frequencies, seq_length).
- - targets: ground truth targets of shape (sum over nb_samples of real seq_lengths).
- - input_percentages: percentages of real inputs in inputs.
- - target_sizes: list of real seq_lengths.
- - batch_idx: original index of inputs.
+ - inputs: model inputs of shape (nb_samples, nb_frequencies, seq_length).
+ - targets: ground truth targets of shape (sum over nb_samples of real seq_lengths).
+ - input_percentages: percentages of real inputs in inputs.
+ - target_sizes: list of real seq_lengths.
+ - batch_idx: original index of inputs.
  """
  import torch  # lgtm [py/repeated-import]
  import torchaudio
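The relationship between `targets`, `target_sizes` and `input_percentages` in this docstring can be sketched in NumPy. This is not the ART implementation: the label values, the helper `split_targets`, and the reading of `input_percentages` as real length divided by the longest length in the batch are all assumptions made for illustration.

```python
import numpy as np

# Hypothetical integer label sequences for two samples.
label_seqs = [[19, 9, 24], [8, 5, 12, 12, 15]]

# target_sizes: the real length of each label sequence.
target_sizes = [len(seq) for seq in label_seqs]

# targets: all label sequences concatenated into one flat array, so its
# length is the sum of the real sequence lengths over the batch.
targets = np.concatenate([np.asarray(seq) for seq in label_seqs])


def split_targets(targets, target_sizes):
    """Recover the per-sample label sequences from the flat targets array."""
    offsets = np.cumsum(target_sizes)[:-1]
    return np.split(targets, offsets)


recovered = split_targets(targets, target_sizes)

# Assumed interpretation: the fraction of each padded input that is real data.
real_lengths = np.array([80, 100])
input_percentages = real_lengths / real_lengths.max()
```

Keeping `target_sizes` alongside the flat `targets` array is what allows the per-sample sequences to be recovered.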

docs/index.rst

Lines changed: 1 addition & 0 deletions
@@ -80,6 +80,7 @@ Supported Machine Learning Libraries
  modules/estimators/generation
  modules/estimators/object_detection
  modules/estimators/regression
+ modules/estimators/speech_recognition
  modules/metrics
  modules/wrappers
  modules/data_generators
Lines changed: 17 additions & 0 deletions
@@ -0,0 +1,17 @@
+ :mod:`art.estimators.speech_recognition`
+ ========================================
+ .. automodule:: art.estimators.speech_recognition
+
+ Mixin Base Class Speech Recognizer
+ ----------------------------------
+ .. autoclass:: SpeechRecognizerMixin
+     :members:
+     :special-members: __init__
+     :inherited-members:
+
+ Speech Recognizer Deep Speech
+ -----------------------------
+ .. autoclass:: PyTorchDeepSpeech
+     :members:
+     :special-members: __init__
+     :inherited-members:
