
Commit aedb5de
committed: clean up README/notebooks, reduce to 3 epochs, update JSON
1 parent 3f332de

6 files changed (+27, −22 lines)


AI-and-Analytics/End-to-end-Workloads/LanguageIdentification/Inference/lang_id_inference.ipynb

Lines changed: 1 addition & 1 deletion
@@ -132,7 +132,7 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"!python quantize_model.py -p ./lang_id_commonvoice_model -datapath $COMMON_VOICE_PATH/dev"
+"!python quantize_model.py -p ./lang_id_commonvoice_model -datapath $COMMON_VOICE_PATH/processed_data/dev"
 ]
 },
 {
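The quantization step now points at the preprocessed split under `processed_data/dev` rather than the raw `dev` folder. A hypothetical pre-flight check, assuming `COMMON_VOICE_PATH` is set as in sample.json (e.g. `/data/commonVoice`):

```python
import os

# Hypothetical sanity check before running quantize_model.py: the -datapath
# argument now expects the preprocessed split, not the raw CommonVoice dev set.
common_voice_path = os.environ.get("COMMON_VOICE_PATH", "/data/commonVoice")
dev_dir = os.path.join(common_voice_path, "processed_data", "dev")
if not os.path.isdir(dev_dir):
    raise FileNotFoundError(f"{dev_dir} not found - run the data preparation steps first.")
```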

AI-and-Analytics/End-to-end-Workloads/LanguageIdentification/README.md

Lines changed: 10 additions & 11 deletions
@@ -6,7 +6,7 @@ Languages are selected from the CommonVoice dataset for training, validation, an
 
 | Area | Description
 |:--- |:---
-| What you will learn | How to use training and inference with SpeechBrain, Intel® Extension for PyTorch (IPEX) inference, Intel® Neural Compressor (INC) quantization, and a oneapi-aikit container
+| What you will learn | How to use training and inference with SpeechBrain, Intel® Extension for PyTorch* (IPEX) inference, Intel® Neural Compressor (INC) quantization
 | Time to complete | 60 minutes
 
 ## Purpose
@@ -18,8 +18,8 @@ Spoken audio comes in different languages and this sample uses a model to identi
 | Optimized for | Description
 |:--- |:---
 | OS | Ubuntu* 22.04 or newer
-| Hardware | Intel® Xeon® processor family
-| Software | Intel® OneAPI AI Analytics Toolkit <br> Hugging Face SpeechBrain
+| Hardware | Intel® Xeon® and Core® processor families
+| Software | Intel® AI Tools <br> Hugging Face SpeechBrain
 
 ## Key Implementation Details
 
@@ -41,15 +41,14 @@ For both training and inference, you can run the sample and scripts in Jupyter N
 
 1. Create your conda environment by following the instructions on the Intel [AI Tools Selector](https://www.intel.com/content/www/us/en/developer/tools/oneapi/ai-tools-selector.html). You can follow these settings:
 
-* AI Tools
-* Preset: Inference Optimization
-* Distribution Type: conda*
+* Tool: AI Tools
+* Preset or customize: Customize
+* Distribution Type: conda* or pip
 * Python Versions: Python* 3.9 or 3.10
+* PyTorch* Framework Optimizations: Intel® Extension for PyTorch* (CPU)
+* Intel®-Optimized Tools & Libraries: Intel® Neural Compressor
 
-Then activate your environment:
-```bash
-conda activate <your-env-name>
-```
+>**Note**: Be sure to activate your environment before installing the packages. If using pip, install using `python -m pip` instead of just `pip`.
 
 2. Create your dataset folder and set the environment variable `COMMON_VOICE_PATH`. This needs to match with where you downloaded your dataset.
 ```bash
@@ -221,7 +220,7 @@ After training, the output should be inside the `results/epaca/1987` folder. By
 
 cp classifier.ckpt ../../.
 cp embedding_model.ckpt ../../
-cd ../..
+cd ../../../..
 ```
 
 You may need to modify the permissions of these files to be executable i.e. `sudo chmod 755` before you run the inference scripts to consume them.
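If the copied files end up without the permissions the inference scripts need, the `sudo chmod 755` noted above can also be done from Python. A minimal sketch, assuming the files were copied into `Inference/lang_id_commonvoice_model` as described (may require a user with ownership of the files):

```python
import os

# Assumed destination of the copied model files; adjust if your layout differs.
model_dir = "Inference/lang_id_commonvoice_model"
for name in ("classifier.ckpt", "embedding_model.ckpt", "label_encoder.txt"):
    path = os.path.join(model_dir, name)
    if os.path.exists(path):
        os.chmod(path, 0o755)  # rwxr-xr-x, same as chmod 755
```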
Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
 #!/bin/bash
 
 echo "Deleting rir, noise, speechbrain"
-rm -R rir noise speechbrain
+rm -R rir noise

AI-and-Analytics/End-to-end-Workloads/LanguageIdentification/Training/lang_id_training.ipynb

Lines changed: 13 additions & 8 deletions
@@ -29,9 +29,9 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"!cp speechbrain/recipes/VoxLingua107/lang_id/create_wds_shards.py create_wds_shards.py\n",
-"!cp speechbrain/recipes/VoxLingua107/lang_id/train.py train.py\n",
-"!cp speechbrain/recipes/VoxLingua107/lang_id/hparams/train_ecapa.yaml train_ecapa.yaml"
+"!cp ../speechbrain/recipes/VoxLingua107/lang_id/create_wds_shards.py create_wds_shards.py\n",
+"!cp ../speechbrain/recipes/VoxLingua107/lang_id/train.py train.py\n",
+"!cp ../speechbrain/recipes/VoxLingua107/lang_id/hparams/train_ecapa.yaml train_ecapa.yaml"
 ]
 },
 {
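With this change the recipe scripts are copied from `../speechbrain`, i.e. the SpeechBrain clone is expected one level above the notebook's working directory (consistent with the delete script above no longer removing `speechbrain`). A hypothetical pre-copy check under that assumption:

```python
import os

# Assumption: the SpeechBrain repository is cloned one level above the Training
# folder, so the VoxLingua107 lang_id recipe lives at ../speechbrain/... .
recipe_dir = "../speechbrain/recipes/VoxLingua107/lang_id"
if not os.path.isdir(recipe_dir):
    raise FileNotFoundError(
        f"Expected SpeechBrain recipe at {recipe_dir}; adjust the relative path "
        "if the clone lives elsewhere."
    )
```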
@@ -166,21 +166,26 @@
 "metadata": {},
 "outputs": [],
 "source": [
+"import os\n",
+"\n",
 "# 1)\n",
 "!cp -R results/epaca/1987 ../Inference/lang_id_commonvoice_model\n",
 "\n",
 "# 2)\n",
-"!cd ../Inference/lang_id_commonvoice_model/save\n",
+"os.chdir(\"../Inference/lang_id_commonvoice_model/save\")\n",
 "\n",
 "# 3)\n",
 "!cp label_encoder.txt ../.\n",
 "\n",
-"# 4)\n",
-"# Navigate into the CKPT folder\n",
-"!cd CKPT* # Set this to your CKPT folder. By default it will navigate into the one that is present.\n",
+"# 4) \n",
+"folders = os.listdir()\n",
+"for folder in folders:\n",
+"    if \"CKPT\" in folder:\n",
+"        os.chdir(folder)\n",
+"        break\n",
 "!cp classifier.ckpt ../../.\n",
 "!cp embedding_model.ckpt ../../\n",
-"!cd ../.."
+"os.chdir(\"../../../..\")"
 ]
 },
 {
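In a Jupyter notebook each `!` shell command runs in its own subshell, so `!cd ...` never changes the notebook's working directory; `os.chdir` does, which is what this cell now relies on. A minimal sketch of the same checkpoint-folder hop using `glob` and `shutil` (paths assumed, and it assumes a single `CKPT*` directory under `save/`, just like the loop above):

```python
import glob
import os
import shutil

# Sketch only: mirrors the cell above, assuming the training output was already
# copied into ../Inference/lang_id_commonvoice_model.
os.chdir("../Inference/lang_id_commonvoice_model/save")
shutil.copy("label_encoder.txt", "..")

ckpt_dirs = glob.glob("CKPT*")   # checkpoint folders written during training
os.chdir(ckpt_dirs[0])           # assumes exactly one CKPT* folder is present

shutil.copy("classifier.ckpt", "../..")
shutil.copy("embedding_model.ckpt", "../..")
os.chdir("../../../..")          # return four directory levels up, as in the cell above
```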

AI-and-Analytics/End-to-end-Workloads/LanguageIdentification/Training/train_ecapa.patch

Lines changed: 1 addition & 1 deletion
@@ -27,7 +27,7 @@
 
 # Training parameters
 -number_of_epochs: 40
-+number_of_epochs: 10
++number_of_epochs: 3
 lr: 0.001
 lr_final: 0.0001
 sample_rate: 16000

AI-and-Analytics/End-to-end-Workloads/LanguageIdentification/sample.json

Lines changed: 1 addition & 0 deletions
@@ -15,6 +15,7 @@
 "export COMMON_VOICE_PATH=/data/commonVoice"
 ],
 "steps": [
+"mkdir -p /data/commonVoice",
 "apt-get update && apt-get install ffmpeg libgl1 -y",
 "source initialize.sh",
 "cd ./Dataset",
