
Commit 56ae90a

Merge pull request #1960 from NinaARM/dev/audiogen-updates
Small updates to align with public codebase structure
2 parents adaf33b + 32ee034 commit 56ae90a

File tree

6 files changed: +63 -77 lines changed

content/learning-paths/mobile-graphics-and-gaming/run-stable-audio-open-small-with-lite-rt/1-prerequisites.md

Lines changed: 9 additions & 7 deletions
@@ -98,22 +98,24 @@ wget https://dl.google.com/android/repository/android-ndk-r25b-linux.zip
 unzip android-ndk-r25b-linux.zip
 {{< /tab >}}
 {{< tab header="MacOS">}}
-brew install --cask android-studio temurin
+wget https://dl.google.com/android/repository/android-ndk-r25b-darwin.zip
+unzip android-ndk-r25b-darwin
+mv android-ndk-r25b-darwin ~/Library/Android/android-ndk-r25b
 {{< /tab >}}
 {{< /tabpane >}}
 
-For easier access and execution of Android NDK tools, add these to the `PATH` and set the `ANDROID_NDK` variable:
+For easier access and execution of Android NDK tools, add these to the `PATH` and set the `NDK_PATH` variable:
 
 {{< tabpane code=true >}}
 {{< tab header="Linux">}}
-export ANDROID_NDK=$WORKSPACE/android-ndk-r25b/
-export PATH=$ANDROID_NDK/toolchains/llvm/prebuilt/linux-x86_64/bin/:$PATH
+export NDK_PATH=$WORKSPACE/android-ndk-r25b/
+export PATH=$NDK_PATH/toolchains/llvm/prebuilt/linux-x86_64/bin/:$PATH
 {{< /tab >}}
 {{< tab header="MacOS">}}
-export ANDROID_NDK=~/Library/Android/sdk/ndk/27.0.12077973/
-export PATH=$PATH:$ANDROID_NDK/toolchains/llvm/prebuilt/darwin-x86_64/bin
+export NDK_PATH=~/Library/Android/android-ndk-r25b
+export PATH=$PATH:$NDK_PATH/toolchains/llvm/prebuilt/darwin-x86_64/bin
 export PATH=$PATH:~/Library/Android/sdk/cmdline-tools/latest/bin
 {{< /tab >}}
 {{< /tabpane >}}
 
-Now that your development environment is ready and all pre-requisites installed, you can test the Audio Stable Open model.
+Now that your development environment is ready and all pre-requisites installed, you can test the Audio Stable Open Small model.
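
Note: once the `NDK_PATH` exports above are in place, a short host-side check can confirm the toolchain is reachable. This sketch is not part of the commit; the compiler wrapper name assumes NDK r25b targeting the API level 30 used later in this learning path:

```python
# Sanity check for the NDK setup above (assumes NDK_PATH was exported and
# that the r25b toolchain ships an aarch64-linux-android30-clang wrapper).
import os
import shutil

ndk = os.environ.get("NDK_PATH")
if not ndk or not os.path.isdir(ndk):
    raise SystemExit("NDK_PATH is unset or does not point at a directory")

clang = shutil.which("aarch64-linux-android30-clang")
print("NDK:", ndk)
print("clang:", clang or "not found on PATH yet")
```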

content/learning-paths/mobile-graphics-and-gaming/run-stable-audio-open-small-with-lite-rt/2-testing-model.md

Lines changed: 5 additions & 11 deletions
@@ -8,12 +8,12 @@ layout: learningpathall
 
 ## Download the model
 
-Stable Audio Open is an open-source model optimized for generating short audio samples, sound effects, and production elements using text prompts.
+Stable Audio Open Small is an open-source model optimized for generating short audio samples, sound effects, and production elements using text prompts.
 
 [Log in](https://huggingface.co/login) to HuggingFace and navigate to the model landing page:
 
 ```bash
-https://huggingface.co/stabilityai/stable-audio-open-small
+https://huggingface.co/stabilityai/stable-audio-open-small/tree/main
 ```
 
 You may need to fill out a form with your contact information to use the model:
@@ -26,15 +26,9 @@ Download and copy the configuration file `model_config.json` and the model itsel
 ls $WORKSPACE/model_config.json $WORKSPACE/model.ckpt
 ```
 
-## Test the model
+You can see more information about this model [here](https://huggingface.co/stabilityai/stable-audio-open-small).
 
-To test the model, use the Stable Audio demo site, which lets you experiment directly through a web-based interface:
-
-```bash
-https://stableaudio.com/
-```
-
-Use the UI to enter a prompt. A good prompt can include:
+A good prompt for this model can include:
 
 * Music genre and subgenre.
 * Musical elements (texture, rhythm and articulation).
@@ -45,5 +39,5 @@ The order of prompt parameters matters. For more information, see the [Prompt st
 
 You can explore training and inference code for audio generation models in the [Stable Audio Tools repository](https://github.com/Stability-AI/stable-audio-tools).
 
-Now that you've downloaded and tested the model, continue to the next section to convert the model to LiteRT.
+Now that you've downloaded the model, continue to the next section to convert the model to LiteRT.
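
Note: if you prefer scripting the download described above, the `huggingface_hub` package can fetch the same two files. This is a hedged alternative, not part of the commit; it assumes you have accepted the model terms and authenticated with a token:

```python
# Alternative to the manual browser download above
# (requires `pip install huggingface_hub` and prior acceptance of the
# model terms on the Hugging Face page).
from huggingface_hub import hf_hub_download

for filename in ("model_config.json", "model.ckpt"):
    path = hf_hub_download(
        repo_id="stabilityai/stable-audio-open-small",
        filename=filename,
        local_dir=".",  # place files where $WORKSPACE expects them
    )
    print("downloaded:", path)
```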

content/learning-paths/mobile-graphics-and-gaming/run-stable-audio-open-small-with-lite-rt/3-converting-model.md

Lines changed: 16 additions & 15 deletions
@@ -6,7 +6,7 @@ weight: 4
 layout: learningpathall
 ---
 
-## Stable Audio Open Small Model
+## Stable Audio Open Small
 
 |Submodule|Description|
 |------|------|
@@ -17,9 +17,11 @@
 The submodules work together to provide the pipeline as shown below:
 ![Model structure#center](./model.png)
 
-As part of this section, you will covert each of the three submodules into [LiteRT](https://ai.google.dev/edge/litert) format, using two separate conversion routes:
-1. Conditioners submodule - ONNX to LiteRT using [onnx2tf](https://github.com/PINTO0309/onnx2tf) tool.
-2. DiT and AutoEncoder submodules - PyTorch to LiteRT using Google AI Edge Torch tool.
+As part of this section, we will explore two different conversion routes, to convert the submodules to [LiteRT](https://ai.google.dev/edge/litert) format.
+
+1. ONNX --> LiteRT using the onnx2tf tool. This is the traditional two-step approach (PyTorch --> ONNX--> LiteRT). We will use it to convert the Conditioners submodule.
+
+2. PyTorch --> LiteRT using the Google AI Edge Torch tool. We will use this tool to convert the DiT and AutoEncoder submodules.
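
Note: for orientation, route 2 boils down to a couple of calls from the `ai_edge_torch` package installed below. A minimal sketch follows; the toy module is illustrative only, not the actual DiT or AutoEncoder:

```python
# Minimal sketch of the PyTorch --> LiteRT route via Google AI Edge Torch.
# TinyBlock stands in for a real submodule such as the DiT.
import torch
import ai_edge_torch

class TinyBlock(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.net = torch.nn.Linear(64, 64)

    def forward(self, x):
        return self.net(x)

model = TinyBlock().eval()
sample_inputs = (torch.randn(1, 64),)

edge_model = ai_edge_torch.convert(model, sample_inputs)  # trace + convert
edge_model.export("tiny_block.tflite")                    # LiteRT flatbuffer
```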
 
 ### Create virtual environment and install dependencies
 
@@ -37,8 +39,8 @@ Clone the examples repository:
 
 ```bash
 cd $WORKSPACE
-git clone https://github.com/ARM-software/ML-examples/tree/main/kleidiai-examples/audiogen
-cd audio-stale-open-litert
+git clone https://github.com/ARM-software/ML-examples.git
+cd ML-examples/kleidiai-examples/audiogen/
 ```
 
 We now install the needed python packages for this, including *onnx2tf* and *ai_edge_litert*
@@ -61,10 +63,9 @@ ImportError: cannot import name 'AttrsDescriptor' from 'triton.compiler.compiler
 ($WORKSPACE/env/lib/python3.10/site-packages/triton/compiler/compiler.py)
 ```
 
-Install the following dependency and rerun the script:
+Reinstall the following dependency:
 ```bash
 pip install triton==3.2.0
-bash install_requirements.sh
 ```
 
 {{% /notice %}}
@@ -74,7 +75,7 @@ bash install_requirements.sh
 The Conditioners submodule is based on the T5Encoder model. We convert it first to ONNX, then to LiteRT.
 
 For this conversion we include the following steps:
-1. Load the Conditioners submodule from the Stable Audio Open model configuration and checkpoint.
+1. Load the Conditioners submodule from the Stable Audio Open Small model configuration and checkpoint.
 2. Export the Conditioners submodule to ONNX via *torch.onnx.export()*.
 3. Convert the resulting ONNX file to LiteRT using *onnx2tf*.
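
Note: steps 2 and 3 above map onto two library calls. A hedged sketch with a stand-in module (the real script loads the T5-based Conditioners from the checkpoint instead):

```python
# Two-step route sketched with a toy encoder: torch.onnx.export() for
# step 2, then onnx2tf for step 3 (pip install onnx onnx2tf).
import torch
import onnx2tf

class TinyEncoder(torch.nn.Module):
    """Stand-in for the T5-based Conditioners submodule."""
    def __init__(self):
        super().__init__()
        self.embed = torch.nn.Embedding(32128, 64)  # T5-sized vocabulary
        self.proj = torch.nn.Linear(64, 64)

    def forward(self, token_ids):
        return self.proj(self.embed(token_ids))

model = TinyEncoder().eval()
sample = torch.zeros(1, 120, dtype=torch.long)  # (batch, sequence)

# Step 2: PyTorch -> ONNX
torch.onnx.export(model, (sample,), "conditioners.onnx",
                  input_names=["token_ids"], output_names=["embeddings"])

# Step 3: ONNX -> LiteRT flatbuffers written into an output folder
onnx2tf.convert(input_onnx_file_path="conditioners.onnx",
                output_folder_path="tflite_conditioners")
```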
 
@@ -84,7 +85,9 @@ You can use the provided script to convert the Conditioners submodule:
 python3 ./scripts/export_conditioners.py --model_config "$WORKSPACE/model_config.json" --ckpt_path "$WORKSPACE/model.ckpt"
 ```
 
-After successful conversion, you now have a `conditioners.onnx` model in your current directory.
+After successful conversion, you now have a `tflite_conditioners` directory containing models with different precisions (e.g., float16, float32).
+
+We will be using the float32.tflite model for on-device inference.

8992
### Convert DiT and AutoEncoder
9093

@@ -96,14 +99,12 @@ To convert the DiT and AutoEncoder submodules, use the [Generative API](https://
9699

97100
Convert the DiT and AutoEncoder submodules using the provided python script:
98101
```bash
99-
CUDA_VISIBLE_DEVICES="" python3 ./scripts/export_dit_autoencoder.py --model_config "$WORKSPACE/model_config.json" --ckpt_path "$WORKSPACE/model.ckpt"
102+
python3 ./scripts/export_dit_autoencoder.py --model_config "$WORKSPACE/model_config.json" --ckpt_path "$WORKSPACE/model.ckpt"
100103
```
101104

102-
After successful conversion, you now have `dit_model.tflite` and `autoencoder_model.tflite` models in your current directory and can deactivate the virtual environment:
105+
After successful conversion, you now have `dit_model.tflite` and `autoencoder_model.tflite` models in your current directory.
103106

104-
```bash
105-
deactivate
106-
```
107+
More detailed explanation of the above scripts is available [here](https://github.com/ARM-software/ML-examples/blob/main/kleidiai-examples/audiogen/scripts/README.md)
107108

108109
For easier access, we add all needed models to one directory:
109110

content/learning-paths/mobile-graphics-and-gaming/run-stable-audio-open-small-with-lite-rt/4-building-litert.md

Lines changed: 19 additions & 18 deletions
@@ -30,42 +30,35 @@ We can use `bazel` to build LiteRT libraries, first we use configure script to c
 
 You can now create a custom TFLite build for android:
 
-Ensure the `ANDROID_NDK` variable is set to your previously installed Android NDK:
+Ensure the `NDK_PATH` variable is set to your previously installed Android NDK:
 {{< tabpane code=true >}}
 {{< tab header="Linux">}}
-export ANDROID_NDK=$WORKSPACE/android-ndk-r25b/
-export PATH=$ANDROID_NDK/toolchains/llvm/prebuilt/linux-x86_64/bin/:$PATH
+export NDK_PATH=$WORKSPACE/android-ndk-r25b/
+export PATH=$NDK_PATH/toolchains/llvm/prebuilt/linux-x86_64/bin/:$PATH
 {{< /tab >}}
 {{< tab header="MacOS">}}
-export TF_CXX_FLAGS="-DTF_MAJOR_VERSION=0 -DTF_MINOR_VERSION=0 -DTF_PATCH_VERSION=0 -DTF_VERSION_SUFFIX=''"
-export ANDROID_NDK=~/Library/Android/sdk/ndk/27.0.12077973/
-export PATH=$PATH:$ANDROID_NDK/toolchains/llvm/prebuilt/darwin-x86_64/bin
-export PATH=$PATH:~/Library/Android/sdk/cmdline-tools/latest/bin
+export NDK_PATH=~/Library/Android/android-ndk-r25b
+export PATH=$PATH:$NDK_PATH/toolchains/llvm/prebuilt/darwin-x86_64/bin
 {{< /tab >}}
 {{< /tabpane >}}
-
-Set the TensorFlow version
-
-```bash
-export TF_CXX_FLAGS="-DTF_MAJOR_VERSION=0 -DTF_MINOR_VERSION=0 -DTF_PATCH_VERSION=0 -DTF_VERSION_SUFFIX=''"
-```
-
-
 Now you can configure TensorFlow. Here you can set the custom build parameters needed as follows:
 
-```bash { output_lines = "2-14" }
+```bash { output_lines = "2-17" }
 python3 ./configure.py
 Please specify the location of python. [Default is $WORKSPACE/bin/python3]:
 Please input the desired Python library path to use. Default is [$WORKSPACE/lib/python3.10/site-packages]
 Do you wish to build TensorFlow with ROCm support? [y/N]: n
 Do you wish to build TensorFlow with CUDA support? [y/N]: n
 Do you want to use Clang to build TensorFlow? [Y/n]: n
 Would you like to interactively configure ./WORKSPACE for Android builds? [y/N]: y
-Please specify the home path of the Android NDK to use. [Default is /home/user/Android/Sdk/ndk-bundle]: /home/user/Workspace/tools/ndk/android-ndk-r25b
+Please specify the home path of the Android NDK to use. [Default is /home/user/Android/Sdk/ndk-bundle]:
 Please specify the (min) Android NDK API level to use. [Available levels: [16, 17, 18, 19, 21, 22, 23, 24, 26, 27, 28, 29, 30, 31, 32, 33]] [Default is 21]: 30
 Please specify the home path of the Android SDK to use. [Default is /home/user/Android/Sdk]:
 Please specify the Android SDK API level to use. [Available levels: ['31', '33', '34', '35']] [Default is 35]:
 Please specify an Android build tools version to use. [Available versions: ['30.0.3', '34.0.0', '35.0.0']] [Default is 35.0.0]:
+Do you wish to build TensorFlow with iOS support? [y/N]: n
+
+Configuration finished
 ```
 
 Once the bazel configuration is complete, you can build TFLite as follows:
@@ -77,7 +70,15 @@ bazel build -c opt --config android_arm64 //tensorflow/lite:libtensorflowlite.so
 --define tflite_with_xnnpack_qu8=true
 ```
 
-This will produce a `libtensorflowlite.so` shared library for android with XNNPack enabled, which we will use to build the example next.
+We also build flatbuffers used by the application in the next steps:
+```
+cd $WORKSPACE/tensorflow_src
+mkdir flatc-native-build && cd flatc-native-build
+cmake ../tensorflow/lite/tools/cmake/native_tools/flatbuffers
+cmake --build .
+```
+
+With flatbuffers and LiteRT built, we can now build our application for Android device.
 
 
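
Note: one quick, hedged way to confirm the bazel output above is the expected AArch64 shared object; the path assumes bazel's default bazel-bin layout for that target:

```python
# Check that bazel produced an AArch64 ELF shared object; EM_AARCH64 == 183
# per the ELF specification. The path is an assumption (default bazel-bin).
from pathlib import Path

header = Path("bazel-bin/tensorflow/lite/libtensorflowlite.so").read_bytes()[:20]
assert header[:4] == b"\x7fELF", "not an ELF file"

e_machine = int.from_bytes(header[18:20], "little")
print("AArch64" if e_machine == 183 else f"unexpected e_machine={e_machine}")
```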

content/learning-paths/mobile-graphics-and-gaming/run-stable-audio-open-small-with-lite-rt/5-creating-simple-program.md

Lines changed: 9 additions & 20 deletions
@@ -12,7 +12,7 @@ You'll now build a simple program that runs inference on all three submodules di
 
 The program takes a text prompt as input and generates an audio file as output.
 ```bash
-cd $WORKSPACE/audio-stale-open-litert/app
+cd $WORKSPACE/ML-examples/kleidiai-examples/audiogen/app
 mkdir build && cd build
 ```
 
@@ -26,34 +26,22 @@ cmake -DCMAKE_TOOLCHAIN_FILE=$NDK_PATH/build/cmake/android.toolchain.cmake \
 -DFLATBUFFER_INCLUDE_PATH=$TF_SRC_PATH/flatc-native-build/flatbuffers/include \
 ..
 
-cmake --build . -j1
+make -j
 ```
+After the example application builds successfully, a binary file named `audiogen` is created.
 
-Since the tokenizer used in the audiogen application is based on SentencePiece, you’ll need to download the spiece.model file from:
+A SentencePiece model is a type of subword tokenizer which is used by the audiogen application, you’ll need to download the *spiece.model* file from:
 ```bash
 https://huggingface.co/google-t5/t5-base/tree/main
 ```
-we will save this model in `WORKSPACE` for ease of access.
+we will save this model in `WORKSPACE` for ease of access
 ```text
-cp spiece.moel $WORKSPACE
+cp spiece.model $WORKSPACE
 ```
-After the SAO example builds successfully, a binary file named `audiogen_main` is created.
 
-Now use adb (Android Debug Bridge) to push the necessary files to the device:
-
-```bash
-adb shell
-```
-
-Create a directory for all the required resources:
+Now use adb (Android Debug Bridge) to push all necessary files into the `audiogen` folder on Android device:
 ```bash
-cd /data/local/tmp
-mkdir audiogen
-exit
-```
-Push all necessary files into the `audiogen` folder on Android:
-```bash
-cd $WORKSPACE/audio-stale-open-litert/app/build
+cd $WORKSPACE/ML-examples/kleidiai-examples/audiogen/app/build
 adb shell mkdir -p /data/local/tmp/app
 adb push audiogen /data/local/tmp/app
 adb push $LITERT_MODELS_PATH/conditioners_float32.tflite /data/local/tmp/app
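
Note: before pushing spiece.model, you can verify on the host that it tokenizes a prompt as expected. A small illustrative check (requires `pip install sentencepiece`):

```python
# Host-side check of the SentencePiece tokenizer model downloaded above.
import sentencepiece as spm

sp = spm.SentencePieceProcessor(model_file="spiece.model")
ids = sp.encode("warm arpeggios on house beats 120BPM with drums effect")
print(len(ids), "tokens:", ids[:8], "...")
```
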
@@ -68,6 +56,7 @@ Finally, run the program on your Android device:
 adb shell
 cd /data/local/tmp/app
 LD_LIBRARY_PATH=. ./audiogen . "warm arpeggios on house beats 120BPM with drums effect" 4
+exit
 ```
 
 The successful execution of the app will create `output.wav` of your chosen audio defined by the prompt, you can pull it back to your host machine and enjoy!
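
Note: one hedged way to script that last pull-back step from the host; it assumes adb is on PATH and that the run above wrote output.wav in the app directory:

```python
# Pull the generated audio back to the host and report its duration.
import subprocess
import wave

subprocess.run(
    ["adb", "pull", "/data/local/tmp/app/output.wav", "output.wav"],
    check=True,
)

with wave.open("output.wav", "rb") as w:
    seconds = w.getnframes() / w.getframerate()
    print(f"{seconds:.2f}s at {w.getframerate()} Hz, {w.getnchannels()} channel(s)")
```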

content/learning-paths/mobile-graphics-and-gaming/run-stable-audio-open-small-with-lite-rt/_index.md

Lines changed: 5 additions & 6 deletions
@@ -7,11 +7,10 @@ who_is_this_for: This is an introductory topic for developers looking to deploy
 
 learning_objectives:
 - Deploy the Stable Audio Open Small model on Android using LiteRT.
-- Build a simple program to generate audio.
-- Compile the application and for an Arm CPU.
+- Create a simple application to generate audio.
+- Compile the application for an Arm CPU.
 - Run the application on an Android smartphone and generate an audio snippet.
 
-
 prerequisites:
 - A Linux-based x86 or macOS development machine with at least 8 GB of RAM (tested on Ubuntu 20.04.4 LTS with x86_64).
 - A [HuggingFace](https://huggingface.co/) account.
@@ -38,9 +37,9 @@ operatingsystems:
 
 further_reading:
 - resource:
-    title: Introducing Stable Audio 2.0
-    link: https://stability.ai/news/stable-audio-2-0
-    type: documentation
+    title: Stability AI and Arm Collaborate to Release Stable Audio Open Small, Enabling Real-World Deployment for On-Device Audio Generation
+    link: https://stability.ai/news/stability-ai-and-arm-release-stable-audio-open-small-enabling-real-world-deployment-for-on-device-audio-control
+    type: blog
 - resource:
     title: Stability AI optimized its audio generation model to run on Arm chips
     link: https://techcrunch.com/2025/03/03/stability-ai-optimized-its-audio-generation-model-to-run-on-arm-chips/
