docs/source/tools/nemo_forced_aligner.rst (2 additions, 2 deletions)
@@ -64,9 +64,9 @@ Optional parameters:
* ``use_local_attention``: boolean flag specifying whether to try to use local attention for the ASR Model (will only work if the ASR Model is a Conformer model). If local attention is used, we will set the local attention context size to [64,64].
- * ``additional_segment_grouping_separator``: an optional string used to separate the text into smaller segments. If this is not specified, then the whole text will be treated as a single segment. (Default: ``None``. Cannot be empty string or space (" "), as NFA will automatically produce word-level timestamps for substrings separated by spaces).
+ * ``additional_segment_grouping_separator``: a list of strings used to separate the text into smaller segments. If set to ``None``, then the whole text will be treated as a single segment. (Default: ``['.', '?', '!', '...']``. Cannot be empty string or space (" "), as NFA will automatically produce word-level timestamps for substrings separated by spaces).
- .. note:: the ``additional_segment_grouping_separator`` will be removed from the reference text and all the output files, ie it is treated as a marker which is not part of the reference text. The separator will essentially be treated as a space, and any additional spaces around it will be amalgamated into one, i.e. if ``additional_segment_grouping_separator="|"``, the following texts will be treated equivalently: ``“abc|def”``, ``“abc |def”``, ``“abc| def”``, ``“abc | def"``.
+ .. note:: Starting in NeMo 2.5.0, separators are preserved in segment text after splitting. If ``additional_segment_grouping_separator="['.', '?', '!', '...']"`` (as is the default), then the text ``"Hi, have you updated your NeMo? Yes. Sure!"`` will result in the following segments: ``["Hi, have you updated your NeMo?", "Yes.", "Sure!"]``.
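The splitting behaviour described in the note can be illustrated with a minimal sketch. This is not NFA's implementation: the function name ``split_keep_separators`` is hypothetical, and trying longer separators first (so that ``...`` is not consumed as three ``.``) is an assumption about how overlapping separators might reasonably be handled.

```python
import re


def split_keep_separators(text, separators):
    """Split text into segments, keeping each separator at the end of its segment.

    Hypothetical helper for illustration only; not part of NeMo.
    """
    # None (or an empty list) means the whole text is a single segment.
    if not separators:
        return [text.strip()]

    # Assumption: try longer separators first so "..." wins over ".".
    ordered = sorted(separators, key=len, reverse=True)
    pattern = "(" + "|".join(re.escape(sep) for sep in ordered) + ")"

    # With a capturing group, re.split returns [body, sep, body, sep, ..., tail].
    pieces = re.split(pattern, text)

    segments = []
    for i in range(0, len(pieces), 2):
        body = pieces[i]
        sep = pieces[i + 1] if i + 1 < len(pieces) else ""
        segment = (body + sep).strip()
        if segment:  # skip the empty tail after a final separator
            segments.append(segment)
    return segments


default_separators = [".", "?", "!", "..."]
print(split_keep_separators("Hi, have you updated your NeMo? Yes. Sure!", default_separators))
```

Under these assumptions the example text from the note yields the three segments listed there, and passing ``None`` returns the whole text as one segment.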
* ``remove_blank_tokens_from_ctm``: a boolean denoting whether to remove <blank> tokens from token-level output CTMs. (Default: False).
This method is used to restore the external timestamp ASR model that will be used for forced alignment in `.transcribe()`.
+ The config and weights are expected to be in the main .nemo file and be named `timestamps_asr_model_config.yaml` and `timestamps_asr_model_weights.ckpt` respectively.
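As a rough illustration of the naming requirement above, the following sketch checks whether both files are present in a checkpoint. It assumes the `.nemo` file is a tar archive readable by Python's `tarfile` module and that entries may carry a directory prefix; the helper name `missing_timestamp_asr_files` is hypothetical and is not part of NeMo.

```python
import tarfile
from pathlib import PurePosixPath

# File names taken from the docstring above.
EXPECTED_FILES = (
    "timestamps_asr_model_config.yaml",
    "timestamps_asr_model_weights.ckpt",
)


def missing_timestamp_asr_files(nemo_path):
    """Return the expected file names that are absent from the archive.

    Hypothetical helper; assumes the .nemo file is a (possibly compressed) tar archive.
    """
    with tarfile.open(nemo_path, "r:*") as archive:
        # Compare base names only, so entries such as "./name" still match.
        present = {PurePosixPath(name).name for name in archive.getnames()}
    return [name for name in EXPECTED_FILES if name not in present]
```

An empty return value would suggest that both the config and the weights of the external timestamp ASR model are packaged under the expected names.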