Commit bfa8eef
Add Vietnamese measure text normalization support (#307)
* Add Vietnamese measure text normalization support
- Added measure tagger and verbalizer for Vietnamese TN
- Updated money tagger and verbalizer to handle per-unit measurements
- Added test cases for measure normalization
- Updated fraction handling for better integration
- Added data files for measurements, prefixes, and per-unit bases
Signed-off-by: folivoramanh <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
Signed-off-by: folivoramanh <[email protected]>
* add test case for range measure
Signed-off-by: folivoramanh <[email protected]>
* additional support for cardinal and remove duplicate test case
Signed-off-by: folivoramanh <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* refractor cardinal and add test cases
Signed-off-by: folivoramanh <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* remove duplicate lines in run_eval file
Signed-off-by: folivoramanh <[email protected]>
* refractor minor code
Signed-off-by: folivoramanh <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* add measure support for unit per unit cases
Signed-off-by: folivoramanh <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
---------
Signed-off-by: folivoramanh <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>1 parent af2082d commit bfa8eef
File tree
31 files changed
+756
-163
lines changed- nemo_text_processing/text_normalization
- vi
- data
- measure
- money
- numbers
- taggers
- verbalizers
- tests/nemo_text_processing/vi
- data_text_normalization
- tools/text_processing_deployment
31 files changed
+756
-163
lines changedLines changed: 3 additions & 3 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
140 | 140 | | |
141 | 141 | | |
142 | 142 | | |
143 | | - | |
144 | | - | |
145 | | - | |
| 143 | + | |
| 144 | + | |
| 145 | + | |
146 | 146 | | |
147 | 147 | | |
148 | 148 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
176 | 176 | | |
177 | 177 | | |
178 | 178 | | |
| 179 | + | |
179 | 180 | | |
| 181 | + | |
| 182 | + | |
| 183 | + | |
180 | 184 | | |
181 | 185 | | |
182 | 186 | | |
| |||
377 | 381 | | |
378 | 382 | | |
379 | 383 | | |
380 | | - | |
| 384 | + | |
381 | 385 | | |
382 | 386 | | |
383 | 387 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
35 | 35 | | |
36 | 36 | | |
37 | 37 | | |
38 | | - | |
| 38 | + | |
39 | 39 | | |
40 | 40 | | |
41 | 41 | | |
| |||
104 | 104 | | |
105 | 105 | | |
106 | 106 | | |
107 | | - | |
108 | | - | |
109 | 107 | | |
110 | 108 | | |
111 | 109 | | |
| |||
Lines changed: 13 additions & 0 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
| 1 | + | |
| 2 | + | |
| 3 | + | |
| 4 | + | |
| 5 | + | |
| 6 | + | |
| 7 | + | |
| 8 | + | |
| 9 | + | |
| 10 | + | |
| 11 | + | |
| 12 | + | |
| 13 | + | |
Lines changed: 20 additions & 0 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
| 1 | + | |
| 2 | + | |
| 3 | + | |
| 4 | + | |
| 5 | + | |
| 6 | + | |
| 7 | + | |
| 8 | + | |
| 9 | + | |
| 10 | + | |
| 11 | + | |
| 12 | + | |
| 13 | + | |
| 14 | + | |
| 15 | + | |
| 16 | + | |
| 17 | + | |
| 18 | + | |
| 19 | + | |
| 20 | + | |
Lines changed: 25 additions & 0 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
| 1 | + | |
| 2 | + | |
| 3 | + | |
| 4 | + | |
| 5 | + | |
| 6 | + | |
| 7 | + | |
| 8 | + | |
| 9 | + | |
| 10 | + | |
| 11 | + | |
| 12 | + | |
| 13 | + | |
| 14 | + | |
| 15 | + | |
| 16 | + | |
| 17 | + | |
| 18 | + | |
| 19 | + | |
| 20 | + | |
| 21 | + | |
| 22 | + | |
| 23 | + | |
| 24 | + | |
| 25 | + | |
Lines changed: 17 additions & 0 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
| 1 | + | |
| 2 | + | |
| 3 | + | |
| 4 | + | |
| 5 | + | |
| 6 | + | |
| 7 | + | |
| 8 | + | |
| 9 | + | |
| 10 | + | |
| 11 | + | |
| 12 | + | |
| 13 | + | |
| 14 | + | |
| 15 | + | |
| 16 | + | |
| 17 | + | |
Lines changed: 8 additions & 0 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
| 1 | + | |
| 2 | + | |
| 3 | + | |
| 4 | + | |
| 5 | + | |
| 6 | + | |
| 7 | + | |
| 8 | + | |
Lines changed: 2 additions & 19 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
1 | 1 | | |
2 | | - | |
3 | 2 | | |
4 | 3 | | |
5 | 4 | | |
| |||
13 | 12 | | |
14 | 13 | | |
15 | 14 | | |
16 | | - | |
17 | | - | |
18 | | - | |
19 | | - | |
20 | | - | |
21 | | - | |
22 | | - | |
23 | | - | |
24 | | - | |
25 | | - | |
26 | | - | |
27 | | - | |
28 | 15 | | |
29 | 16 | | |
30 | 17 | | |
31 | 18 | | |
32 | 19 | | |
33 | 20 | | |
34 | | - | |
35 | | - | |
36 | | - | |
37 | | - | |
38 | | - | |
39 | 21 | | |
40 | 22 | | |
41 | 23 | | |
42 | 24 | | |
43 | 25 | | |
44 | 26 | | |
45 | | - | |
| 27 | + | |
| 28 | + | |
Lines changed: 6 additions & 0 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
| 1 | + | |
| 2 | + | |
| 3 | + | |
| 4 | + | |
| 5 | + | |
| 6 | + | |
0 commit comments