content/learning-paths/mobile-graphics-and-gaming/run-stable-audio-open-small-with-lite-rt/1-prerequisites.md (11 additions, 9 deletions)
@@ -19,7 +19,7 @@ Your first task is to prepare a development environment with the required softwa
### Create workspace directory
- Create a separate directory for all dependencies and repositories that this Learning Path uses.
+ Create a separate directory for all the dependencies and repositories that this Learning Path uses.
Export the `WORKSPACE` variable to point to this directory, which you will use in the following steps:
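As a concrete sketch of the workspace setup above (the directory name and location are illustrative, not taken from this diff):

```bash
# Create a dedicated workspace directory and export WORKSPACE so the
# later steps can reference it. The path is an example; any writable
# location works.
mkdir -p $HOME/audiogen-workspace
export WORKSPACE=$HOME/audiogen-workspace
```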
@@ -74,7 +74,7 @@ See the [CMake install guide](/install-guides/cmake/) for troubleshooting instru
### Install Bazel
- Bazel is an open-source build tool which we will use to build LiteRT libraries.
+ Bazel is an open-source build tool which you will use to build LiteRT libraries.
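For readers following along, one common way to install Bazel is through Bazelisk, which fetches the Bazel version a project expects. This is a hedged sketch, not the Learning Path's prescribed method; the release version in the URL is an assumption:

```bash
# Install Bazelisk (a Bazel launcher/version manager) as `bazel`.
# v1.19.0 is illustrative; substitute a current release.
wget https://github.com/bazelbuild/bazelisk/releases/download/v1.19.0/bazelisk-linux-amd64
chmod +x bazelisk-linux-amd64
sudo mv bazelisk-linux-amd64 /usr/local/bin/bazel
bazel --version   # sanity check
```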
content/learning-paths/mobile-graphics-and-gaming/run-stable-audio-open-small-with-lite-rt/2-testing-model.md (6 additions, 10 deletions)
@@ -8,12 +8,12 @@ layout: learningpathall
## Download the model
- Stable Audio Open is an open-source model optimized for generating short audio samples, sound effects, and production elements using text prompts.
+ Stable Audio Open Small is an open-source model optimized for generating short audio samples, sound effects, and production elements using text prompts.
[Log in](https://huggingface.co/login) to HuggingFace and navigate to the model landing page:
You may need to fill out a form with your contact information to use the model:
@@ -26,15 +26,11 @@ Download and copy the configuration file `model_config.json` and the model itsel
ls $WORKSPACE/model_config.json $WORKSPACE/model.ckpt
```
- ## Test the model
+ You can learn more about this model [here](https://huggingface.co/stabilityai/stable-audio-open-small).
- To test the model, use the Stable Audio demo site, which lets you experiment directly through a web-based interface:
+ ### Good prompting practices
- ```bash
- https://stableaudio.com/
- ```
-
- Use the UI to enter a prompt. A good prompt can include:
+ A good prompt for the Stable Audio Open Small model can include the following elements:
* Music genre and subgenre.
* Musical elements (texture, rhythm and articulation).
@@ -45,5 +41,5 @@ The order of prompt parameters matters. For more information, see the [Prompt st
You can explore training and inference code for audio generation models in the [Stable Audio Tools repository](https://github.com/Stability-AI/stable-audio-tools).
- Now that you've downloaded and tested the model, continue to the next section to convert the model to LiteRT.
+ Now that you've downloaded the model, you're ready to convert it to LiteRT format in the next step.
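As a sketch of the download step discussed in this file's hunks: once you have accepted the model's terms on Hugging Face, the two files can be fetched from the command line. The diff does not specify the method, so treat this as one possible approach; the repo id comes from the model page linked above:

```bash
# Authenticate with a Hugging Face access token, then pull the config
# and checkpoint into the workspace.
pip install "huggingface_hub[cli]"
huggingface-cli login
huggingface-cli download stabilityai/stable-audio-open-small \
  model_config.json model.ckpt --local-dir $WORKSPACE
```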
content/learning-paths/mobile-graphics-and-gaming/run-stable-audio-open-small-with-lite-rt/3-converting-model.md (32 additions, 26 deletions)
@@ -1,31 +1,37 @@
---
- title: Convert Open Stable Audio Small model to LiteRT
+ title: Convert Stable Audio Open Small model to LiteRT
weight: 4
### FIXED, DO NOT MODIFY
layout: learningpathall
---
+ In this section, you will learn about the audio generation model. You will then clone a repository that contains the scripts required to convert the model submodules into LiteRT format and generate the inference application.
- ## Stable Audio Open Small Model
+ ## Stable Audio Open Small
+
+ The open-source model consists of three main submodules. They are described in the table below, and come together through the pipeline shown in the image.
|Submodule|Description|
|------|------|
|Conditioners| Includes a T5-based text encoder for the input prompt and a numerical duration encoder. These components convert the inputs into embeddings passed to the DiT model. |
|Diffusion Transformer (DiT)| Denoises random noise over multiple steps to produce structured latent audio, guided by conditioner embeddings. |
|AutoEncoder| Compresses audio waveforms into a latent representation for processing by the DiT model, and decompresses the output back into audio. |
- The submodules work together to provide the pipeline as shown below:
+

- As part of this section, you will covert each of the three submodules into [LiteRT](https://ai.google.dev/edge/litert) format, using two separate conversion routes:
- 1. Conditioners submodule - ONNX to LiteRT using [onnx2tf](https://github.com/PINTO0309/onnx2tf) tool.
- 2. DiT and AutoEncoder submodules - PyTorch to LiteRT using Google AI Edge Torch tool.
+ In this section, you will explore two different conversion routes to convert the submodules to [LiteRT](https://ai.google.dev/edge/litert) format. Both methods are run using Python wrapper scripts from the examples repository.
+
+ 1. **ONNX to LiteRT**: using the `onnx2tf` tool. This is the traditional two-step approach (PyTorch -> ONNX -> LiteRT). You will use it to convert the Conditioners submodule.
+
+ 2. **PyTorch to LiteRT**: using the Google AI Edge Torch tool. You will use this tool to convert the DiT and AutoEncoder submodules.
+
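As background for route 1 above: `onnx2tf`'s basic CLI takes an ONNX graph and emits TensorFlow/TFLite artifacts, including float16 and float32 variants, which matches the `tflite_conditioners` output described later in this diff. A minimal sketch, with an assumed input file name (the wrapper scripts in the examples repository drive this for you):

```bash
# Two-step route, sketched: ONNX in, a directory of TFLite models out.
# `conditioners.onnx` is an illustrative file name.
pip install onnx2tf   # also requires the tensorflow and onnx packages
onnx2tf -i conditioners.onnx -o tflite_conditioners
```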
- ### Create virtual environment and install dependencies
+ ## Download the sample code
The Conditioners submodule is made of the T5Encoder model. You will use the ONNX to TFLite conversion for this submodule.
- To avoid dependency issues, create a virtual environment. In this guide, we will use `virtualenv`:
+ To avoid dependency issues, create a virtual environment. For example, you can use the following command:
```bash
cd $WORKSPACE
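# Editor's sketch: the hunk is truncated after `cd $WORKSPACE`. A typical
# continuation for the virtual environment (tool and directory names are
# assumptions, not confirmed by this diff):
pip install virtualenv
virtualenv .venv
source .venv/bin/activate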
@@ -37,11 +43,11 @@ Clone the examples repository:
- After successful conversion, you now have a `conditioners.onnx` model in your current directory.
+ After successful conversion, you now have a `tflite_conditioners` directory containing models with different precisions (e.g., float16, float32).
+
+ You will be using the float32.tflite model for on-device inference.
### Convert DiT and AutoEncoder
- To convert the DiT and AutoEncoder submodules, use the [Generative API](https://github.com/google-ai-edge/ai-edge-torch/tree/main/ai_edge_torch/generative/) provided by the ai-edge-torch tools. This enables you to export a generative PyTorch model directly to tflite using three main steps:
+ To convert the DiT and AutoEncoder submodules, use the [Generative API](https://github.com/google-ai-edge/ai-edge-torch/tree/main/ai_edge_torch/generative/) provided by the ai-edge-torch tools. This enables you to export a generative PyTorch model directly to `.tflite` using three main steps:
1. Model re-authoring.
2. Quantization.
3. Conversion.
- Convert the DiT and AutoEncoder submodules using the provided python script:
+ Convert the DiT and AutoEncoder submodules using the provided Python script:
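The actual command is outside this hunk, so the script name below is hypothetical; the real entry points are documented in the scripts README linked later in this diff:

```bash
# Hypothetical invocation of the provided wrapper script (name assumed).
python3 convert_dit_autoencoder.py \
  --ckpt $WORKSPACE/model.ckpt \
  --config $WORKSPACE/model_config.json
```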
- After successful conversion, you now have `dit_model.tflite` and `autoencoder_model.tflite` models in your current directory and can deactivate the virtual environment:
+ After successful conversion, you now have `dit_model.tflite` and `autoencoder_model.tflite` models in your current directory.
- ```bash
- deactivate
- ```
+ A more detailed explanation of the above scripts is available [here](https://github.com/ARM-software/ML-examples/blob/main/kleidiai-examples/audiogen/scripts/README.md).
- For easier access, we add all needed models to one directory:
+ For easy access, add all the required models to one directory:
- With all three submodules converted to LiteRT format, you're ready to build LiteRT and run the model on a mobile device in the next step.
+ With all three submodules now converted to LiteRT format, you're ready to build the runtime and run Stable Audio Open Small directly on an Android device in the next step.
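As a sketch of the "one directory" step above: the DiT and AutoEncoder file names appear in this diff, while the destination directory and the exact conditioners file name are assumptions:

```bash
# Collect the three converted submodules in one place for deployment.
mkdir -p $WORKSPACE/audiogen_models
cp tflite_conditioners/*float32.tflite $WORKSPACE/audiogen_models/
cp dit_model.tflite autoencoder_model.tflite $WORKSPACE/audiogen_models/
```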
content/learning-paths/mobile-graphics-and-gaming/run-stable-audio-open-small-with-lite-rt/4-building-litert.md (37 additions, 33 deletions)
@@ -8,7 +8,7 @@ layout: learningpathall
## LiteRT
- LiteRT (short for Lite Runtime), formerly known as TensorFlow Lite, is Google's high-performance runtime for on-device AI.
+ LiteRT (short for Lite Runtime), formerly known as TensorFlow Lite, is Google's high-performance runtime for on-device AI. Designed for low-latency, resource-efficient execution, LiteRT is optimized for mobile and embedded environments — making it a natural fit for Arm CPUs running models like Stable Audio Open Small. You’ll build the runtime using the Bazel build tool.
Now you can configure TensorFlow. Here you can set the custom build parameters needed as follows:
-
- ```bash { output_lines = "2-14" }
python3 ./configure.py
- Please specify the location of python. [Default is $WORKSPACE/bin/python3]:
- Please input the desired Python library path to use. Default is [$WORKSPACE/lib/python3.10/site-packages]
- Do you wish to build TensorFlow with ROCm support? [y/N]: n
- Do you wish to build TensorFlow with CUDA support? [y/N]: n
- Do you want to use Clang to build TensorFlow? [Y/n]: n
- Would you like to interactively configure ./WORKSPACE for Android builds? [y/N]: y
- Please specify the home path of the Android NDK to use. [Default is /home/user/Android/Sdk/ndk-bundle]: /home/user/Workspace/tools/ndk/android-ndk-r25b
- Please specify the (min) Android NDK API level to use. [Available levels: [16, 17, 18, 19, 21, 22, 23, 24, 26, 27, 28, 29, 30, 31, 32, 33]] [Default is 21]: 30
- Please specify the home path of the Android SDK to use. [Default is /home/user/Android/Sdk]:
- Please specify the Android SDK API level to use. [Available levels: ['31', '33', '34', '35']] [Default is 35]:
- Please specify an Android build tools version to use. [Available versions: ['30.0.3', '34.0.0', '35.0.0']] [Default is 35.0.0]:
```
- Once the bazel configuration is complete, you can build TFLite as follows:
+ |Question|Input|
+ |---|---|
+ |Please specify the location of python. [Default is $WORKSPACE/bin/python3]:| Enter (default) |
+ |Please input the desired Python library path to use [$WORKSPACE/lib/python3.10/site-packages]| Enter |
+ |Do you wish to build TensorFlow with ROCm support? [y/N]|N (No)|
+ |Do you wish to build TensorFlow with CUDA support?|N|
+ |Do you want to use Clang to build TensorFlow? [Y/n]|N|
+ |Would you like to interactively configure ./WORKSPACE for Android builds? [y/N]|y (Yes) |
+ |Please specify the home path of the Android NDK to use. [Default is /home/user/Android/Sdk/ndk-bundle]| Enter |
+ |Please specify the (min) Android NDK API level to use. [Default is 21]| 27 |
+ |Please specify the home path of the Android SDK to use. [Default is /home/user/Android/Sdk]| Enter |
+ |Please specify the Android SDK API level to use. [Default is 35]| Enter |
+ |Please specify an Android build tools version to use. [Default is 35.0.0]| Enter |
+ |Do you wish to build TensorFlow with iOS support? [y/N]:| n |
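An aside for readers who want to script this step: TensorFlow's `configure.py` also reads answers from environment variables, so the prompts in the table above can be pre-answered non-interactively. The variable names below come from TensorFlow's configure script and can change between releases, so treat this as a sketch:

```bash
# Pre-answer the configure prompts (variable names may vary by TF release).
export TF_NEED_ROCM=0
export TF_NEED_CUDA=0
export TF_NEED_CLANG=0
export TF_SET_ANDROID_WORKSPACE=1
export ANDROID_NDK_HOME=$HOME/Android/Sdk/ndk-bundle   # assumed path
export ANDROID_NDK_API_LEVEL=27
python3 ./configure.py
```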
+
+ Once the Bazel configuration is complete, you can build TFLite as follows:
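The build command itself falls outside this hunk. A typical TFLite cross-compile for 64-bit Android looks like the following; the target exists in the TensorFlow repository, but the exact flags used by this Learning Path are not shown in the diff:

```bash
# Cross-compile the TFLite shared library for Android arm64.
bazel build -c opt --config=android_arm64 //tensorflow/lite:libtensorflowlite.so
```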
Now that LiteRT and FlatBuffers are built, you're ready to compile and deploy the Stable Audio Open Small inference application on your Android device.