@@ -1,23 +1,16 @@
 **Table of Contents**
-- [LibriSpeech](#librispeech)
+- [\[English\] LibriSpeech](#english-librispeech)
   - [I. Small + SentencePiece 1k](#i-small--sentencepiece-1k)
-    - [Training](#training)
-      - [1. Epoch Loss](#1-epoch-loss)
-      - [2. Batch Loss](#2-batch-loss)
-      - [3. Learning Rate](#3-learning-rate)
-    - [Pretrained Model](#pretrained-model)
+    - [Config](#config)
     - [Results](#results)
-- [VietBud500](#vietbud500)
-  - [I. Small + SentencePiece 1k](#i-small--sentencepiece-1k-1)
-    - [Training](#training-1)
-      - [1. Epoch Loss](#1-epoch-loss-1)
-      - [2. Batch Loss](#2-batch-loss-1)
-      - [3. Learning Rate](#3-learning-rate-1)
-    - [Pretrained Model](#pretrained-model-1)
+- [\[Vietnamese\] VietBud500](#vietnamese-vietbud500)
+  - [II. Small + Streaming + SentencePiece 1k](#ii-small--streaming--sentencepiece-1k)
+    - [Config](#config-1)
     - [Results](#results-1)
 
+<!-- ----------------------------------------------------- EN ------------------------------------------------------ -->
 
-# LibriSpeech
+# [English] LibriSpeech
 
 ## I. Small + SentencePiece 1k
 
@@ -30,89 +23,56 @@
 | Global Batch Size | 4 * 4 * 8 = 128 (as 4 TPUs, 8 Gradient Accumulation Steps) |
 | Max Epochs        | 300                                                        |
 
+### Config
 
-### Training
-
-#### 1. Epoch Loss
-
-
-
-#### 2. Batch Loss
-
-
-
-#### 3. Learning Rate
-
-
-
-### Pretrained Model
-
-[Link]()
+```jinja2
+{% import "examples/datasets/librispeech/sentencepiece/sp.yml.j2" as decoder_config with context %}
+{{decoder_config}}
+{% import "examples/models/transducer/conformer/small.yml.j2" as config with context %}
+{{config}}
+```
 
 ### Results
 
+| Epoch | Dataset    | decoding | wer      | cer      | mer      | wil      | wip      |
+| :---- | :--------- | :------- | :------- | :------- | :------- | :------- | :------- |
+| 157   | test-clean | greedy   | 0.062918 | 0.025361 | 0.062527 | 0.109992 | 0.890007 |
+| 157   | test-other | greedy   | 0.142616 | 0.066839 | 0.140610 | 0.239201 | 0.760798 |
 
-```json
-[
-  {
-    "epoch": 157,
-    "test-clean": {
-      "greedy": {
-        "wer": 0.0629184418746196,
-        "cer": 0.025361417966113735,
-        "mer": 0.06252717134486344,
-        "wil": 0.10999272148964301,
-        "wip": 0.890007278510357
-      }
-    },
-    "test-other": {
-      "greedy": {
-        "wer": 0.14261696884015054,
-        "cer": 0.06683946941977871,
-        "mer": 0.14061028442267848,
-        "wil": 0.23920137462664237,
-        "wip": 0.7607986253733576
-      }
-    }
-  }
-]
-```
+<!-- ----------------------------------------------------- VN ------------------------------------------------------ -->
 
-# VietBud500
+# [Vietnamese] VietBud500
 
-## I. Small + SentencePiece 1k
+## II. Small + Streaming + SentencePiece 1k
 
 | Category          | Description                                                 |
 | :---------------- | :--------------------------------------------------------- |
-| Config            | [small.yml.j2](../../small.yml.j2)                         |
+| Config            | [small-streaming.yml.j2](../../small-streaming.yml.j2)     |
 | Tensorflow        | **2.18.0**                                                  |
 | Device            | Google Cloud TPUs v4-8                                      |
 | Mixed Precision   | strict                                                      |
 | Global Batch Size | 8 * 4 * 8 = 256 (as 4 TPUs, 8 Gradient Accumulation Steps) |
 | Max Epochs        | 300                                                         |
 
-### Training
-
-#### 1. Epoch Loss
-
-
-
-#### 2. Batch Loss
-
-
+### Config
 
-#### 3. Learning Rate
-
-
-
-### Pretrained Model
-
-[Link]()
+```jinja2
+{% import "examples/datasets/vietbud500/sentencepiece/sp.yml.j2" as decoder_config with context %}
+{{decoder_config}}
+{% import "examples/models/transducer/conformer/small-streaming.yml.j2" as config with context %}
+{{config}}
+```
 
 ### Results
 
-```json
-[
+| Training      | Image |
+| :------------ | :---- |
+| Epoch Loss    |       |
+| Batch Loss    |       |
+| Learning Rate |       |
+
+| Epoch | decoding | wer      | cer      | mer     | wil      | wip      |
+| :---- | :------- | :------- | :------- | :------ | :------- | :------- |
+| 52    | greedy   | 0.053723 | 0.034548 | 0.05362 | 0.086421 | 0.913579 |
 
-]
-```
+**Pretrained Model**: [Link](https://www.kaggle.com/models/lordh9072/tfasr-vietbud500-conformer-transducer/tensorFlow2/small-streaming)
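The added `Config` sections compose a full training config by importing the dataset and model templates as Jinja2 modules (`with context`) and printing them in sequence. A minimal sketch of that composition pattern, using hypothetical in-memory templates in place of the repo's `.yml.j2` files:

```python
from jinja2 import DictLoader, Environment

# Hypothetical stand-ins for the repo's *.yml.j2 template files.
templates = {
    "decoder.yml.j2": "decoder_config:\n  vocabulary_size: {{ vocab_size }}\n",
    "model.yml.j2": "model_config:\n  name: conformer-small\n",
    # Same pattern as the Config sections above: import a template
    # as a module, then print its rendered body with {{ module }}.
    "config.yml.j2": (
        '{% import "decoder.yml.j2" as decoder_config with context %}\n'
        "{{ decoder_config }}\n"
        '{% import "model.yml.j2" as model_config with context %}\n'
        "{{ model_config }}\n"
    ),
}

env = Environment(loader=DictLoader(templates))
# "with context" lets the imported templates see render-time variables
# such as vocab_size; without it they render with an empty context.
rendered = env.get_template("config.yml.j2").render(vocab_size=1000)
print(rendered)
```

Printing an imported module emits its rendered body, so the final output is the concatenated decoder and model YAML fragments.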