- [\[English\] LibriSpeech](#english-librispeech)
  - [I. Small + SentencePiece 1k](#i-small--sentencepiece-1k)
  - [II. Small + Streaming + SentencePiece 1k](#ii-small--streaming--sentencepiece-1k)
- [\[Vietnamese\] VietBud500](#vietnamese-vietbud500)
  - [I. Small + Streaming + SentencePiece 1k](#i-small--streaming--sentencepiece-1k)

<!-- ----------------------------------------------------- EN ------------------------------------------------------ -->

# [English] LibriSpeech

## I. Small + SentencePiece 1k

| Category          | Description                                                |
| :---------------- | :--------------------------------------------------------- |
| Global Batch Size | 4 * 4 * 8 = 128 (as 4 TPUs, 8 Gradient Accumulation Steps) |
| Max Epochs        | 300                                                        |

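The global batch size above follows the usual effective-batch arithmetic: per-replica batch size × number of replicas × gradient-accumulation steps. A minimal, framework-free sketch of that arithmetic and of the accumulate-then-update idea (the numbers 4, 4, 8 come from the table; the update loop itself is illustrative, not TensorFlowASR's implementation):

```python
# Effective (global) batch size under data parallelism + gradient accumulation.
# Plain-Python stand-in for what a TF/TPU training loop does; illustrative only.

PER_REPLICA_BATCH = 4   # examples per TPU core per forward/backward pass
NUM_REPLICAS = 4        # "as 4 TPUs" in the table above
ACCUM_STEPS = 8         # micro-batches accumulated before one optimizer update

global_batch = PER_REPLICA_BATCH * NUM_REPLICAS * ACCUM_STEPS
assert global_batch == 128  # matches the "4 * 4 * 8 = 128" row

def accumulate(micro_batch_grads):
    """Average gradients over several micro-batches, then update once.

    micro_batch_grads: list of gradient vectors, one per micro-batch.
    Returns the averaged gradient that a single optimizer step would apply.
    """
    total = [0.0] * len(micro_batch_grads[0])
    for grads in micro_batch_grads:
        total = [t + g for t, g in zip(total, grads)]
    return [t / len(micro_batch_grads) for t in total]
```

Averaging (rather than summing) the accumulated gradients keeps the learning-rate scale independent of the number of accumulation steps.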
**Config:**

```jinja2
{% import "examples/datasets/librispeech/sentencepiece/sp.yml.j2" as decoder_config with context %}
{{decoder_config}}
{% import "examples/models/transducer/conformer/small.yml.j2" as config with context %}
{{config}}
```
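The config above is a Jinja2 template that splices a dataset/tokenizer fragment and a model fragment into one YAML document: `{% import ... with context %}` loads a sub-template as a module, and printing the module with `{{ ... }}` emits its rendered body. A hedged, self-contained sketch of that mechanism using the standard `jinja2` API (the fragment contents here are tiny stand-ins, not the real files under `examples/`):

```python
# Sketch: how a .yml.j2 config composes fragments via {% import ... %} + {{ ... }}.
# The fragment bodies below are hypothetical stand-ins for the repo's real files.
from jinja2 import DictLoader, Environment

templates = {
    "sp.yml.j2": "vocabulary: sentencepiece_1k",      # stand-in dataset/tokenizer fragment
    "small.yml.j2": "model: conformer-small",         # stand-in model fragment
    "config.yml.j2": (
        '{% import "sp.yml.j2" as decoder_config with context %}'
        "{{ decoder_config }}\n"
        '{% import "small.yml.j2" as config with context %}'
        "{{ config }}"
    ),
}

env = Environment(loader=DictLoader(templates))
# Rendering the top-level template concatenates the rendered fragments
# into one YAML string, ready for a YAML parser.
rendered = env.get_template("config.yml.j2").render()
print(rendered)
```

In a real checkout the `DictLoader` would be a `FileSystemLoader` rooted at the repository, so the `examples/...` paths in the config resolve to actual files.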

**Results:**

| Epoch | Dataset    | decoding | wer      | cer      | mer      | wil      | wip      |
| :---- | :--------- | :------- | :------- | :------- | :------- | :------- | :------- |
| 157   | test-clean | greedy   | 0.062918 | 0.025361 | 0.062527 | 0.109992 | 0.890007 |
| 157   | test-other | greedy   | 0.142616 | 0.066839 | 0.140610 | 0.239201 | 0.760798 |

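The reported metrics are the standard edit-distance measures: wer (word error rate), cer (character error rate), mer (match error rate), wil (word information lost), and wip (word information preserved, approximately 1 − wil); the Python `jiwer` package computes all five. A minimal, dependency-free sketch of the two most common ones (illustrative, not the evaluation code behind these tables):

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences (one-row DP)."""
    dp = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        prev_diag, dp[0] = dp[0], i
        for j, h in enumerate(hyp, start=1):
            cur = min(
                dp[j] + 1,               # delete r from the reference
                dp[j - 1] + 1,           # insert h into the reference
                prev_diag + (r != h),    # substitute (free if tokens match)
            )
            prev_diag, dp[j] = dp[j], cur
    return dp[-1]

def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance / reference word count."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)

def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: char-level edit distance / reference length."""
    return edit_distance(list(reference), list(hypothesis)) / len(reference)
```

mer, wil, and wip are derived from the same alignment's hit/substitution/insertion/deletion counts rather than from the raw distance alone.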
## II. Small + Streaming + SentencePiece 1k

| Category          | Description                                                |
| :---------------- | :--------------------------------------------------------- |
| Config            | [small-streaming.yml.j2](../../small-streaming.yml.j2)     |
| Tensorflow        | **2.18.0**                                                 |
| Device            | Google Cloud TPUs v4-8                                     |
| Mixed Precision   | strict                                                     |
| Global Batch Size | 4 * 4 * 8 = 128 (as 4 TPUs, 8 Gradient Accumulation Steps) |
| Max Epochs        | 300                                                        |

**Config:**

```jinja2
{% import "examples/datasets/librispeech/sentencepiece/sp.yml.j2" as decoder_config with context %}
{{decoder_config}}
{% import "examples/models/transducer/conformer/small-streaming.yml.j2" as config with context %}
{{config}}
```

**Results:**

| Epoch | Dataset    | decoding | wer      | cer       | mer      | wil      | wip      |
| :---- | :--------- | :------- | :------- | :-------- | :------- | :------- | :------- |
| 45    | test-clean | greedy   | 0.110564 | 0.0460022 | 0.109064 | 0.186109 | 0.813891 |
| 45    | test-other | greedy   | 0.267772 | 0.139369  | 0.260952 | 0.417361 | 0.582639 |

<!-- ----------------------------------------------------- VN ------------------------------------------------------ -->

# [Vietnamese] VietBud500

## I. Small + Streaming + SentencePiece 1k

| Category          | Description                                                |
| :---------------- | :--------------------------------------------------------- |
| Global Batch Size | 8 * 4 * 8 = 256 (as 4 TPUs, 8 Gradient Accumulation Steps) |
| Max Epochs        | 300                                                        |

**Config:**

```jinja2
{% import "examples/datasets/vietbud500/sentencepiece/sp.yml.j2" as decoder_config with context %}
{{decoder_config}}
{% import "examples/models/transducer/conformer/small-streaming.yml.j2" as config with context %}
{{config}}
```

**Tensorboard:**

<table>
  <tr>
    <td align="center">
      <img src="./figs/vietbud500-small-streaming-epoch-loss.jpg" width="200px"><br>
      <sub><strong>Epoch Loss</strong></sub>
    </td>
    <td align="center">
      <img src="./figs/vietbud500-small-streaming-batch-loss.jpg" width="200px"><br>
      <sub><strong>Batch Loss</strong></sub>
    </td>
    <td align="center">
      <img src="./figs/vietbud500-small-streaming-lr.jpg" width="200px"><br>
      <sub><strong>Learning Rate</strong></sub>
    </td>
  </tr>
</table>

**Results:**

| Epoch | decoding | wer      | cer      | mer     | wil      | wip      |
| :---- | :------- | :------- | :------- | :------ | :------- | :------- |