|
47 | 47 | "metadata": {},
|
48 | 48 | "outputs": [],
|
49 | 49 | "source": [
|
50 | | - "!python inference_commonVoice.py -p /data/commonVoice/test"
| 50 | + "!python inference_commonVoice.py -p ${COMMON_VOICE_PATH}/processed_data/test"
51 | 51 | ]
|
52 | 52 | },
|
53 | 53 | {
|
54 | 54 | "cell_type": "markdown",
|
55 | 55 | "metadata": {},
|
56 | 56 | "source": [
|
57 | 57 | "## inference_custom.py for Custom Data \n",
|
58 | | - "To generate an overall results output summary, the audio_ground_truth_labels.csv file needs to be modified with the name of the audio file and expected audio label (i.e. en for English). By default, this is disabled but if desired, the *--ground_truth_compare* can be used. To run inference on custom data, you must specify a folder with WAV files and pass the path in as an argument. "
| 58 | + "To run inference on custom data, you must specify a folder with .wav files and pass the path in as an argument. You can do so by creating a folder named `data_custom` and then copying one or two .wav files from your test dataset into it. Note that .mp3 files will NOT work. "
59 | 59 | ]
|
60 | 60 | },
|
61 | 61 | {
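A minimal sketch of staging the `data_custom` folder described in the hunk above, assuming the processed test set as the source (adjust the path to wherever your extracted .wav files live):

```python
# Hypothetical staging step: copy a couple of .wav files into data_custom.
# "processed_data/test" is an assumed source path, not fixed by the sample.
import shutil
from pathlib import Path

src = Path("processed_data/test")
dst = Path("data_custom")
dst.mkdir(exist_ok=True)

for wav in sorted(src.glob("*.wav"))[:2]:
    shutil.copy(wav, dst / wav.name)  # only .wav files; .mp3 will not work
```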
|
|
65 | 65 | "### Randomly select audio clips from audio files for prediction\n",
|
66 | 66 | "python inference_custom.py -p DATAPATH -d DURATION -s SIZE\n",
|
67 | 67 | "\n",
|
68 | | - "An output file output_summary.csv will give the summary of the results."
| 68 | + "An output file `output_summary.csv` will give a summary of the results."
69 | 69 | ]
|
70 | 70 | },
|
71 | 71 | {
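A quick way to inspect that summary, assuming `output_summary.csv` lands in the working directory (its exact columns aren't shown in this diff):

```python
# Peek at the results summary produced by inference_custom.py.
# Column names are not documented in this hunk, so just print what's there.
import pandas as pd

summary = pd.read_csv("output_summary.csv")
print(summary.columns.tolist())
print(summary.head())
```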
|
|
104 | 104 | "### Optimizations with Intel® Extension for PyTorch (IPEX) \n",
|
105 | 105 | "python inference_custom.py -p data_custom -d 3 -s 50 --vad --ipex --verbose \n",
|
106 | 106 | "\n",
|
| 107 | + "This will apply ipex.optimize to the model(s) and TorchScript. You can also add the --bf16 option along with --ipex to run in the BF16 data type, supported on 4th Gen Intel® Xeon® Scalable processors and newer.\n", |
| 108 | + "\n", |
107 | 109 | "Note that the *--verbose* option is required to view the latency measurements. "
|
108 | 110 | ]
|
109 | 111 | },
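The *--ipex*/*--bf16* path described above boils down to the standard IPEX flow. A minimal sketch, using a stand-in model and input rather than the sample's actual language-ID network:

```python
import torch
import intel_extension_for_pytorch as ipex

# Stand-in model and example input; the real sample optimizes its trained model.
model = torch.nn.Sequential(torch.nn.Linear(16, 4)).eval()
example_input = torch.randn(1, 16)

# ipex.optimize applies operator/layout optimizations; dtype mirrors --bf16
model = ipex.optimize(model, dtype=torch.bfloat16)

# TorchScript: trace and freeze the optimized model under BF16 autocast
with torch.no_grad(), torch.cpu.amp.autocast(dtype=torch.bfloat16):
    model = torch.jit.trace(model, example_input, strict=False)
    model = torch.jit.freeze(model)
    print(model(example_input))
```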
|
|
121 | 123 | "metadata": {},
|
122 | 124 | "source": [
|
123 | 125 | "## Quantization with Intel® Neural Compressor (INC)\n",
|
124 | | - "To improve inference latency, Intel® Neural Compressor (INC) can be used to quantize the trained model from FP32 to INT8 by running quantize_model.py. The *-datapath* argument can be used to specify a custom evaluation dataset but by default it is set to */data/commonVoice/dev* which was generated from the data preprocessing scripts in the *Training* folder. "
| 126 | + "To improve inference latency, Intel® Neural Compressor (INC) can be used to quantize the trained model from FP32 to INT8 by running quantize_model.py. The *-datapath* argument can be used to specify a custom evaluation dataset, but by default it is set to `$COMMON_VOICE_PATH/processed_data/dev`, which was generated from the data preprocessing scripts in the `Training` folder. "
125 | 127 | ]
|
126 | 128 | },
|
127 | 129 | {
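For reference, a minimal sketch of the FP32-to-INT8 flow using INC's 2.x API, assuming a stand-in model and calibration loader; quantize_model.py's actual configuration and evaluation loop may differ:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
from neural_compressor import PostTrainingQuantConfig, quantization

# Stand-ins for the trained FP32 model and the dev-set calibration data
model = torch.nn.Sequential(torch.nn.Linear(16, 4)).eval()
calib = DataLoader(TensorDataset(torch.randn(8, 16), torch.zeros(8, dtype=torch.long)))

conf = PostTrainingQuantConfig()  # default static INT8 post-training quantization
q_model = quantization.fit(model, conf, calib_dataloader=calib)
q_model.save("./lang_id_commonvoice_model_INT8")
```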
|
|
137 | 139 | "cell_type": "markdown",
|
138 | 140 | "metadata": {},
|
139 | 141 | "source": [
|
140 | | - "After quantization, the model will be stored in *lang_id_commonvoice_model_INT8* and *neural_compressor.utils.pytorch.load* will have to be used to load the quantized model for inference. "
| 142 | + "After quantization, the model will be stored in `lang_id_commonvoice_model_INT8` and `neural_compressor.utils.pytorch.load` will have to be used to load the quantized model for inference. If `self.language_id` is the original model and `data_path` is the path to the audio file:\n",
| 143 | + "\n",
| 144 | + "```\n",
| 145 | + "from neural_compressor.utils.pytorch import load\n",
| 146 | + "model_int8 = load(\"./lang_id_commonvoice_model_INT8\", self.language_id)\n",
| 147 | + "signal = self.language_id.load_audio(data_path)\n",
| 148 | + "prediction = model_int8(signal)\n",
| 149 | + "```"
| 150 | + ]
| 151 | + },
| 152 | + {
| 153 | + "cell_type": "markdown",
| 154 | + "metadata": {},
| 155 | + "source": [
| 156 | + "The code above is integrated into `inference_custom.py`. You can now run inference on your data using this INT8 model:"
| 157 | + ]
| 158 | + },
| 159 | + {
| 160 | + "cell_type": "code",
| 161 | + "execution_count": null,
| 162 | + "metadata": {},
| 163 | + "outputs": [],
| 164 | + "source": [
| 165 | + "!python inference_custom.py -p data_custom -d 3 -s 50 --vad --int8_model --verbose"
| 166 | + ]
| 167 | + },
| 168 | + {
| 169 | + "cell_type": "markdown",
| 170 | + "metadata": {},
| 171 | + "source": [
| 172 | + "### (Optional) Comparing Predictions with Ground Truth\n",
| 173 | + "\n",
| 174 | + "You can choose to modify `audio_ground_truth_labels.csv` to include the name of the audio file and the expected audio label (e.g., en for English), then run `inference_custom.py` with the *--ground_truth_compare* option. By default, this is disabled."
141 | 175 | ]
|
142 | 176 | },
|
143 | 177 | {
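A minimal sketch of producing `audio_ground_truth_labels.csv`, assuming a simple filename/label schema (the exact columns expected by *--ground_truth_compare* are not shown in this diff):

```python
# Write a two-column ground-truth file: audio filename -> expected language label.
# The exact schema expected by --ground_truth_compare is an assumption here.
import csv

rows = [("clip_0001.wav", "en"),
        ("clip_0002.wav", "de")]
with open("audio_ground_truth_labels.csv", "w", newline="") as f:
    csv.writer(f).writerows(rows)
```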
|
|