content/learning-paths/mobile-graphics-and-gaming/run-stable-audio-open-small-with-lite-rt/1-prerequisites.md (1 addition, 1 deletion)
@@ -74,7 +74,7 @@ See the [CMake install guide](/install-guides/cmake/) for troubleshooting instru
 
 ### Install Bazel
 
-Bazel is an open-source build tool which we will use to build LiteRT libraries.
+Bazel is an open-source build tool which you will use to build LiteRT libraries.
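As a quick sanity check after installing, a version query confirms Bazel is on your PATH (a minimal sketch; any recent Bazel release suitable for the build will do):

```bash
# Confirm the Bazel installation is visible on PATH.
bazel --version
```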
content/learning-paths/mobile-graphics-and-gaming/run-stable-audio-open-small-with-lite-rt/2-testing-model.md (4 additions, 2 deletions)
@@ -26,9 +26,11 @@ Download and copy the configuration file `model_config.json` and the model itsel
 ls $WORKSPACE/model_config.json $WORKSPACE/model.ckpt
 ```
 
-You can see more information about this model [here](https://huggingface.co/stabilityai/stable-audio-open-small).
+You can learn more about this model [here](https://huggingface.co/stabilityai/stable-audio-open-small).
 
-A good prompt for this model can include:
+### Good prompting practices
+
+A good prompt for this audio generation model can include:
 
 * Music genre and subgenre.
 * Musical elements (texture, rhythm and articulation).
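For example, the prompt used later in this learning path combines several of these elements (genre, tempo, and instrumentation) in one line:

```text
warm arpeggios on house beats 120BPM with drums effect
```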
content/learning-paths/mobile-graphics-and-gaming/run-stable-audio-open-small-with-lite-rt/3-converting-model.md (19 additions, 14 deletions)
@@ -5,29 +5,33 @@ weight: 4
 ### FIXED, DO NOT MODIFY
 layout: learningpathall
 ---
+In this section, you will learn about the audio generation model. You will then clone a repository to run the conversion steps needed to generate the inference application.
 
 ## Stable Audio Open Small
 
+The open-sourced model includes three main parts. They are described in the table below, and come together through the pipeline shown in the image.
+
 |Submodule|Description|
 |------|------|
 |Conditioners| Includes a T5-based text encoder for the input prompt and a numerical duration encoder. These components convert the inputs into embeddings passed to the DiT model. |
 |Diffusion Transformer (DiT)| Denoises random noise over multiple steps to produce structured latent audio, guided by conditioner embeddings. |
 |AutoEncoder| Compresses audio waveforms into a latent representation for processing by the DiT model, and decompresses the output back into audio. |
 
-The submodules work together to provide the pipeline as shown below:
+
 
-As part of this section, we will explore two different conversion routes, to convert the submodules to [LiteRT](https://ai.google.dev/edge/litert) format.
+In this section, you will explore two different conversion routes to convert the submodules to [LiteRT](https://ai.google.dev/edge/litert) format. Both methods will be run using Python wrapper scripts from the examples repository.
 
-1. ONNX --> LiteRT using the onnx2tf tool. This is the traditional two-step approach (PyTorch --> ONNX--> LiteRT). We will use it to convert the Conditioners submodule.
+1. **ONNX to LiteRT**: using the `onnx2tf` tool. This is the traditional two-step approach (PyTorch -> ONNX -> LiteRT). You will use it to convert the Conditioners submodule.
 
-2. PyTorch --> LiteRT using the Google AI Edge Torch tool. We will use this tool to convert the DiT and AutoEncoder submodules.
+2. **PyTorch to LiteRT**: using the Google AI Edge Torch tool. You will use this tool to convert the DiT and AutoEncoder submodules.
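As a rough sketch of route 1, a direct `onnx2tf` invocation is shown below; the file names here are hypothetical, and in practice the repository's Python wrapper scripts drive the actual export:

```bash
# Hypothetical file names; the repository's wrapper scripts perform the
# real ONNX export and TFLite conversion.
onnx2tf -i conditioners.onnx -o tflite_conditioners
```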
 
-### Create virtual environment and install dependencies
+
+## Download the sample code
 
 The Conditioners submodule is made of the T5Encoder model. You will use the ONNX to TFLite conversion for this submodule.
 
-To avoid dependency issues, create a virtual environment. In this guide, we will use `virtualenv`:
+To avoid dependency issues, create a virtual environment. For example, you can use the following command:
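A minimal sketch of such a setup, assuming `virtualenv` is installed and using an environment name of your choosing:

```bash
# Create and activate an isolated Python environment (illustrative name).
virtualenv audiogen-env
source audiogen-env/bin/activate
```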
 After successful conversion, you now have a `tflite_conditioners` directory containing models with different precisions (e.g., float16, float32).
 
-We will be using the float32.tflite model for on-device inference.
+You will be using the float32.tflite model for on-device inference.
 
 ### Convert DiT and AutoEncoder
 
-To convert the DiT and AutoEncoder submodules, use the [Generative API](https://github.com/google-ai-edge/ai-edge-torch/tree/main/ai_edge_torch/generative/) provided by the ai-edge-torch tools. This enables you to export a generative PyTorch model directly to tflite using three main steps:
+To convert the DiT and AutoEncoder submodules, use the [Generative API](https://github.com/google-ai-edge/ai-edge-torch/tree/main/ai_edge_torch/generative/) provided by the ai-edge-torch tools. This enables you to export a generative PyTorch model directly to `.tflite` using three main steps:
 
 1. Model re-authoring.
 2. Quantization.
 3. Conversion.
 
-Convert the DiT and AutoEncoder submodules using the provided python script:
+Convert the DiT and AutoEncoder submodules using the provided Python script:
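The exact script name and flags live in the examples repository; the invocation below is a hypothetical placeholder showing the shape of the step:

```bash
# Hypothetical script name and flag; see the scripts README linked below
# for the real entry points and arguments.
python convert_dit_autoencoder.py --checkpoint $WORKSPACE/model.ckpt
```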
 After successful conversion, you now have `dit_model.tflite` and `autoencoder_model.tflite` models in your current directory.
 
-More detailed explanation of the above scripts is available [here](https://github.com/ARM-software/ML-examples/blob/main/kleidiai-examples/audiogen/scripts/README.md)
+A more detailed explanation of the above scripts is available [here](https://github.com/ARM-software/ML-examples/blob/main/kleidiai-examples/audiogen/scripts/README.md).
 
-For easier access, we add all needed models to one directory:
+For easy access, add all needed models to one directory:
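A sketch of this step, assuming the output file names above and a `models` directory of your choosing (the learning path gives the exact commands):

```bash
# Collect the converted models in one place (illustrative paths).
mkdir -p $WORKSPACE/models
cp tflite_conditioners/*float32*.tflite $WORKSPACE/models/
cp dit_model.tflite autoencoder_model.tflite $WORKSPACE/models/
```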
content/learning-paths/mobile-graphics-and-gaming/run-stable-audio-open-small-with-lite-rt/4-building-litert.md (27 additions, 24 deletions)
@@ -8,7 +8,7 @@ layout: learningpathall
 
 ## LiteRT
 
-LiteRT (short for Lite Runtime), formerly known as TensorFlow Lite, is Google's high-performance runtime for on-device AI.
+LiteRT (short for Lite Runtime), formerly known as TensorFlow Lite, is Google's high-performance runtime for on-device AI. Designed for low-latency, resource-efficient execution, LiteRT is optimized for mobile and embedded environments, making it a natural fit for Arm CPUs running models like Stable Audio Open Small. You will build the LiteRT runtime libraries using the Bazel build tool.
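For orientation, a TensorFlow Lite/LiteRT Bazel build for Android arm64 often looks like the sketch below; the exact targets and configuration flags used in this learning path are specified in its build steps, so treat this as illustrative only:

```bash
# Illustrative Bazel invocation from a TensorFlow source checkout;
# the learning path defines the exact targets and config flags.
bazel build -c opt --config=android_arm64 //tensorflow/lite:libtensorflowlite.so
```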
content/learning-paths/mobile-graphics-and-gaming/run-stable-audio-open-small-with-lite-rt/5-creating-simple-program.md (34 additions, 8 deletions)
@@ -8,15 +8,17 @@ layout: learningpathall
 
 ## Create and build a simple program
 
-You'll now build a simple program that runs inference on all three submodules directly on an Android device.
+As a final step, you'll now build a simple program that runs inference on all three submodules directly on an Android device.
 
 The program takes a text prompt as input and generates an audio file as output.
 
 we will save this model in `WORKSPACE` for ease of access
+
+Verify this model was downloaded to your `WORKSPACE`.
+
 ```text
-cp spiece.model $WORKSPACE
+ls $WORKSPACE/spiece.model
 ```
 
-Now use adb (Android Debug Bridge) to push all necessary files into the `audiogen` folder on Android device:
+Connect your Android device to your development machine using a cable. adb (Android Debug Bridge) is available as part of the Android SDK. You should see your device listed when you run the following command.
+
+```bash
+adb devices
+```
+
+```output
+<DEVICE ID> device
+```
+
+Note that you may have to approve the connection on your phone for this to work. Now, use `adb` to push all necessary files into the `audiogen` folder on the Android device:
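The exact file list depends on the models and libraries you built; as a sketch, assuming the `models` directory gathered earlier and the on-device path used below:

```bash
# Illustrative push; the learning path lists the exact files to copy.
adb shell mkdir -p /data/local/tmp/app
adb push $WORKSPACE/models/. /data/local/tmp/app/
```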
 Start a new shell to access the device's system from your development machine:
+
+```bash
 adb shell
+```
+
+Finally, run the program on your Android device. Experiment with the prompting advice from the [Download the model](../2-testing-model) section.
+
+```bash
 cd /data/local/tmp/app
 LD_LIBRARY_PATH=. ./audiogen . "warm arpeggios on house beats 120BPM with drums effect" 4
 exit
 ```
 
 The successful execution of the app will create `output.wav` of your chosen audio defined by the prompt; you can pull it back to your host machine and enjoy!
+
 ```bash
 adb pull /data/local/tmp/app/output.wav
 ```
+
+You should now have gained hands-on experience running the Stable Audio Open Small model with LiteRT on Arm-based devices. This includes setting up the environment, optimizing the model for on-device inference, and understanding how efficient runtimes like LiteRT make low-latency generative AI possible at the edge. You're now better equipped to explore and deploy AI-powered audio applications on mobile and embedded platforms.
content/learning-paths/mobile-graphics-and-gaming/run-stable-audio-open-small-with-lite-rt/_index.md (1 addition, 1 deletion)
@@ -6,7 +6,7 @@ minutes_to_complete: 30
 who_is_this_for: This is an introductory topic for developers looking to deploy the Stable Audio Open Small text-to-audio model using LiteRT on an Android device.
 
 learning_objectives:
-  - Deploy the Stable Audio Open Small model on Android using LiteRT.
+  - Download and learn about the Stable Audio Open Small model.
   - Create a simple application to generate audio.
   - Compile the application for an Arm CPU.
   - Run the application on an Android smartphone and generate an audio snippet.