Formatting and deformatting

Uses rules to split pieces of text into graphemes that are closer to their pronunciation. Example: 1993 -> mille - neuf - cent - quatre - vingt¬ - treize. All the other models and tools operate on deformatted text.

The rules, the order in which they are applied, and their test cases are versioned in YAML files hosted in a GitHub repository.

from asr_toolkit.format import FormatHandler as FH
format_handler = FH('assets_dir')

deformatted = format_handler.deformat('1,23 mm³')
# -> "1 - virgule - vingt - trois millimètres cubes"
reformatted = format_handler.format(deformatted)
# -> "1,23 mm³"
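
To make the idea of ordered rules with test cases more concrete, here is a minimal sketch in plain Python. It does not use asr_toolkit and does not reflect the actual YAML rule format, which is not shown here. The names `RULES`, `TEST_CASES`, `apply_rules` and `failing_cases` are hypothetical, the rules are simplified illustrations, and the output does not reproduce the " - " separators shown above.

```python
import re

# Hypothetical rules, applied in list order (not the toolkit's real rules).
# Each rule is a (regex pattern, replacement) pair.
RULES = [
    (r"(\d+),(\d+)", r"\1 virgule \2"),  # decimal comma between digits
    (r"mm³", "millimètres cubes"),       # specific unit first
    (r"(?<!\w)mm", "millimètres"),       # generic unit: "mm" not preceded by a word character
]

# Hypothetical test cases: (input text, expected deformatted text).
TEST_CASES = [
    ("1,5 mm³", "1 virgule 5 millimètres cubes"),
    ("3 mm", "3 millimètres"),
]


def apply_rules(text, rules=RULES):
    """Apply every rule to the text, one after the other, in the given order."""
    for pattern, replacement in rules:
        text = re.sub(pattern, replacement, text)
    return text


def failing_cases(cases=TEST_CASES, rules=RULES):
    """Return (input, expected, actual) for each test case that does not pass."""
    failures = []
    for source, expected in cases:
        actual = apply_rules(source, rules)
        if actual != expected:
            failures.append((source, expected, actual))
    return failures


if __name__ == "__main__":
    print(apply_rules("1,5 mm³"))
    print(failing_cases())
```

In this sketch the order matters: if the generic "mm" rule ran before the "mm³" rule, the text would become "millimètres³" and the more specific rule would never match. Keeping the test cases next to the rules makes that kind of regression easy to catch.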