Commit 06b35ed

authored
Merge pull request #1961 from HenryDen/main
Update the Vision llm
2 parents 2f12f8a + 4cbb367 commit 06b35ed

3 files changed

+54
-96
lines changed

content/learning-paths/mobile-graphics-and-gaming/vision-llm-inference-on-android-with-kleidiai-and-mnn/1-devenv-and-model.md

Lines changed: 20 additions & 9 deletions
Original file line number | Diff line number | Diff line change
@@ -1,5 +1,5 @@
11
---
2-
title: Build the MNN Android Demo with GUI
2+
title: Set up the environment and prepare the model
33
weight: 3
44

55
### FIXED, DO NOT MODIFY
@@ -9,7 +9,7 @@ layout: learningpathall
99

1010
In this section, you'll set up your development environment by installing dependencies and preparing the Qwen vision model.
1111

12-
Install the Android NDK (Native Development Kit) and git-lfs. This Learning Path was tested with NDK version `28.0.12916984` and CMake version `3.31.6`.
12+
Install the Android NDK (Native Development Kit) and git-lfs. This Learning Path was tested with NDK version `28.0.12916984` and CMake version `4.0.0-rc1`.
1313

1414
For Ubuntu or Debian systems, install CMake and git-lfs with the following commands:
1515

@@ -18,9 +18,9 @@ sudo apt update
1818
sudo apt install cmake git-lfs -y
1919
```
2020

21-
You can use Android Studio to obtain the NDK.
21+
You can use Android Studio to obtain the NDK.
2222

23-
Click **Tools > SDK Manager** and navigate to the **SDK Tools** tab.
23+
Click **Tools > SDK Manager** and navigate to the **SDK Tools** tab.
2424

2525
Select the **NDK (Side by side)** and **CMake** checkboxes, as shown below:
2626

@@ -48,7 +48,7 @@ If Python 3.x is not the default version, try running `python3 --version` and `p
4848

4949
## Set up Phone Connection
5050

51-
You need to set up an authorized connection with your phone. The Android SDK Platform Tools package, included with Android Studio, provides Android Debug Bridge (ADB) for transferring files.
51+
You need to set up an authorized connection with your phone. The Android SDK Platform Tools package, included with Android Studio, provides Android Debug Bridge (ADB) for transferring files.
5252

5353
Connect your phone to your computer using a USB cable, and enable USB debugging on your phone. To do this, tap the **Build Number** in your **Settings** app 7 times, then enable **USB debugging** in **Developer Options**.
5454

@@ -65,9 +65,18 @@ List of devices attached
6565
<DEVICE ID> device
6666
```
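The device listing above can also be checked from a script. The following is a minimal sketch, not part of the Learning Path: it assumes the usual `adb devices` layout of a serial number followed by a state, and the helper name `count_ready_devices` and the serial numbers are made up for illustration.

```bash
# Count devices that `adb devices` reports in the authorized "device" state.
# The listing is read from stdin, so the helper can be tried without a phone.
count_ready_devices() {
    # The second field of each device line is its state; the header line
    # ("List of devices attached") never has "device" as its second field.
    awk '$2 == "device" { n++ } END { print n + 0 }'
}

# Example with captured output (hypothetical serial numbers):
printf 'List of devices attached\nABC123\tdevice\nXYZ789\tunauthorized\n' | count_ready_devices
```

With a phone attached, `adb devices | count_ready_devices` should print at least `1`. A device shown as `unauthorized` usually means the USB debugging prompt on the phone has not been accepted yet.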
6767

68-
## Download and Convert the Model
68+
## Download the Quantized Model
6969

70-
The following commands download the model from Hugging Face, and clone a tool for exporting the LLM model to the MNN framework.
70+
The pre-quantized model is available on Hugging Face. Download it with the following commands:
71+
72+
```bash
73+
git lfs install
74+
git clone https://huggingface.co/taobao-mnn/Qwen2.5-VL-3B-Instruct-MNN
75+
git -C Qwen2.5-VL-3B-Instruct-MNN checkout 9057334b3f85a7f106826c2fa8e57c1aee727b53
76+
```
77+
78+
## (Optional) Download and Convert the Model
79+
If you need to quantize the model with custom parameters, the following commands download the model from Hugging Face and clone a tool for exporting the LLM to the MNN framework.
7180

7281
```bash
7382
cd $HOME
@@ -95,11 +104,13 @@ To learn more about the parameters, see the [transformers README.md](https://git
95104

96105
Verify that the model was built correctly by checking that the `Qwen2-VL-2B-Instruct-convert-4bit-per_channel` directory is at least 1 GB in size.
97106

107+
## Push the Model to the Android Device
108+
98109
Push the model onto the device:
99110

100111
```shell
101112
adb shell mkdir /data/local/tmp/models/
102-
adb push Qwen2-VL-2B-Instruct-convert-4bit-per_channel /data/local/tmp/models
113+
adb push Qwen2.5-VL-3B-Instruct-MNN /data/local/tmp/models
103114
```
104115

105-
With the model set up, you're ready to use Android Studio to build and run an example application.
116+
With the model set up, you're ready to build and run an example application.
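Before pushing, it can help to confirm that the model directory actually contains the weights, because a clone without git-lfs leaves only small pointer files. The sketch below is one possible check, not part of the Learning Path. The helper name `check_model_dir` is hypothetical, and applying the 1 GB threshold (stated above for the converted model) to the pre-quantized download is an assumption.

```bash
# Return success when DIR exists and holds at least MIN_MB mebibytes.
check_model_dir() {
    dir=$1
    min_mb=$2
    [ -d "$dir" ] || { echo "missing directory: $dir" >&2; return 1; }
    # du -sk prints the total size in 1 KiB blocks, followed by the path.
    size_kb=$(du -sk "$dir" | awk '{ print $1 }')
    [ "$size_kb" -ge $((min_mb * 1024)) ]
}
```

A possible use is `check_model_dir Qwen2.5-VL-3B-Instruct-MNN 1024 && adb push Qwen2.5-VL-3B-Instruct-MNN /data/local/tmp/models`, which skips the push when the directory is missing or suspiciously small.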
Lines changed: 34 additions & 34 deletions
Original file line number | Diff line number | Diff line change
@@ -1,15 +1,15 @@
11
---
22
title: Build the MNN Command-line ViT Demo
3-
weight: 5
3+
weight: 4
44

55
### FIXED, DO NOT MODIFY
66
layout: learningpathall
77
---
88
## Prepare an Example Image
99

10-
In this section, you'll benchmark model performance with and without KleidiAI kernels. To run optimized inference, you'll first need to compile the required library files. You'll also need an example image to run command-line prompts.
10+
In this section, you'll benchmark model performance with and without KleidiAI kernels. To run optimized inference, you'll first need to compile the required library files. You'll also need an example image to run command-line prompts.
1111

12-
You can use the provided image of the tiger below that this Learning Path uses, or choose your own.
12+
You can use the provided image of the tiger below that this Learning Path uses, or choose your own.
1313

1414
Whichever you select, rename the image to `example.png` to use the commands in the following sections.
1515

@@ -23,24 +23,30 @@ adb push example.png /data/local/tmp/
2323

2424
## Build Binaries for Command-line Inference
2525

26-
Navigate to the Vision Language Models project that you cloned in the previous section.
26+
Run the following commands to clone the MNN repository and check out the tested commit:
27+
28+
```bash
29+
cd $HOME
30+
git clone https://github.com/alibaba/MNN.git
31+
cd MNN
32+
git checkout 282cebeb785118865b9c903decc4b5cd98d5025e
33+
```
34+
35+
Create a build directory and run the build script.
2736

2837
The first time that you do this, build the binaries with the `-DMNN_KLEIDIAI` flag set to `FALSE`.
2938

3039
```bash
31-
cmake ./vit/ -B build \
32-
-DCMAKE_TOOLCHAIN_FILE=$ANDROID_NDK/build/cmake/android.toolchain.cmake \
33-
-DCMAKE_BUILD_TYPE=Release \
34-
-DANDROID_ABI="arm64-v8a" \
35-
-DANDROID_STL=c++_static \
36-
-DANDROID_NATIVE_API_LEVEL=android-21 \
37-
-DMNN_BUILD_OPENCV=true \
38-
-DMNN_IMGCODECS=true \
39-
-DMNN_KLEIDIAI=false
40-
cmake --build ./build
40+
cd $HOME/MNN/project/android
41+
mkdir build_64 && cd build_64
42+
43+
../build_64.sh "-DMNN_LOW_MEMORY=true -DLLM_SUPPORT_VISION=true -DMNN_KLEIDIAI=FALSE \
44+
-DMNN_CPU_WEIGHT_DEQUANT_GEMM=true -DMNN_BUILD_LLM=true \
45+
-DMNN_SUPPORT_TRANSFORMER_FUSE=true -DMNN_ARM82=true -DMNN_OPENCL=true \
46+
-DMNN_USE_LOGCAT=true -DMNN_IMGCODECS=true -DMNN_BUILD_OPENCV=true"
4147
```
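If the build script fails early, a quick pre-flight check of the NDK location can narrow down the cause. The sketch below assumes the script finds the CMake toolchain through `$ANDROID_NDK`, at the path the NDK normally ships it (`build/cmake/android.toolchain.cmake`). The helper name `check_ndk` is hypothetical.

```bash
# Check that ANDROID_NDK points at an NDK that ships the CMake toolchain file.
check_ndk() {
    if [ -z "${ANDROID_NDK:-}" ]; then
        echo "ANDROID_NDK is not set" >&2
        return 1
    fi
    # Assumed location of the toolchain file inside the NDK directory.
    toolchain="$ANDROID_NDK/build/cmake/android.toolchain.cmake"
    if [ ! -f "$toolchain" ]; then
        echo "toolchain file not found: $toolchain" >&2
        return 1
    fi
    echo "using NDK at $ANDROID_NDK"
}
```

Running `check_ndk` before `../build_64.sh` reports a missing variable or a wrong path directly, rather than leaving it to surface as a CMake configuration error.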
4248
{{% notice Note %}}
43-
If your NDK toolchain isn't set up correctly, you might run into issues with the above script. Make a note of where the NDK was installed - this will be a directory named after the version you downloaded earlier. Try exporting the following environment variables before re-running above commands:
49+
If your NDK toolchain isn't set up correctly, you might run into issues with the above script. Make a note of where the NDK was installed - this will be a directory named after the version you downloaded earlier. Try exporting the following environment variables before re-running `build_64.sh`:
4450

4551
```bash
4652
export ANDROID_NDK_HOME=<path-to>/ndk/28.0.12916984
@@ -55,23 +61,23 @@ export ANDROID_NDK=$ANDROID_NDK_HOME
5561
Push the required files to your Android device, then enter a shell on the device using ADB:
5662

5763
```bash
58-
adb push build/bin/vision_llm build/lib/*.so /data/local/tmp
64+
adb push *so llm_demo tools/cv/*so /data/local/tmp/
5965
adb shell
6066
```
6167

6268
Run the following commands in the ADB shell. Navigate to the directory you pushed the files to, add executable permissions to the `llm_demo` file and export an environment variable for it to run properly. After this, use the example image you transferred earlier to create a file containing the text content for the prompt.
6369

6470
```bash
6571
cd /data/local/tmp/
66-
chmod +x vision_llm
72+
chmod +x llm_demo
6773
export LD_LIBRARY_PATH=$PWD
6874
echo "<img>./example.png</img>Describe the content of the image." > prompt
6975
```
7076

7177
Finally, run an inference on the model with the following command:
7278

7379
```bash
74-
./vision_llm models/Qwen-VL-2B-convert-4bit-per_channel/config.json prompt
80+
./llm_demo models/Qwen2.5-VL-3B-Instruct-MNN/config.json prompt
7581
```
7682

7783
If the launch is successful, you should see the following output, with the performance benchmark at the end:
@@ -96,28 +102,22 @@ prefill speed = 192.28 tok/s
96102

97103
## Enable KleidiAI and Re-run Inference
98104

99-
The next step is to re-generate the binaries with KleidiAI activated. This is done by updating the flag `-DMNN_KLEIDIAI` to `TRUE`.
105+
The next step is to re-generate the binaries with KleidiAI activated. This is done by updating the flag `-DMNN_KLEIDIAI` to `TRUE`.
100106

101-
From the `build` directory, run:
107+
From the `build_64` directory, run:
102108
```bash
103-
cmake ./vit/ -B build \
104-
-DCMAKE_TOOLCHAIN_FILE=$ANDROID_NDK/build/cmake/android.toolchain.cmake \
105-
-DCMAKE_BUILD_TYPE=Release \
106-
-DANDROID_ABI="arm64-v8a" \
107-
-DANDROID_STL=c++_static \
108-
-DANDROID_NATIVE_API_LEVEL=android-21 \
109-
-DMNN_BUILD_OPENCV=true \
110-
-DMNN_IMGCODECS=true \
111-
-DMNN_KLEIDIAI=false
112-
cmake --build ./build
109+
../build_64.sh "-DMNN_LOW_MEMORY=true -DLLM_SUPPORT_VISION=true -DMNN_KLEIDIAI=TRUE \
110+
-DMNN_CPU_WEIGHT_DEQUANT_GEMM=true -DMNN_BUILD_LLM=true \
111+
-DMNN_SUPPORT_TRANSFORMER_FUSE=true -DMNN_ARM82=true -DMNN_OPENCL=true \
112+
-DMNN_USE_LOGCAT=true -DMNN_IMGCODECS=true -DMNN_BUILD_OPENCV=true"
113113
```
114114
## Update Files on the Device
115115

116116
First, remove existing binaries from your Android device, then push the updated files:
117117

118118
```bash
119-
adb shell "cd /data/local/tmp; rm -rf *so vision_llm"
120-
adb push build/bin/vision_llm build/lib/*.so /data/local/tmp
119+
adb shell "cd /data/local/tmp; rm -rf *so llm_demo tools/cv/*so"
120+
adb push *so llm_demo tools/cv/*so /data/local/tmp/
121121
adb shell
122122
```
123123

@@ -127,7 +127,7 @@ With the new ADB shell, run the following commands:
127127
cd /data/local/tmp/
128128
chmod +x llm_demo
129129
export LD_LIBRARY_PATH=$PWD
130-
./llm_demo models/Qwen-VL-2B-convert-4bit-per_channel/config.json prompt
130+
./llm_demo models/Qwen2.5-VL-3B-Instruct-MNN/config.json prompt
131131
```
132132
## Benchmark Results
133133

@@ -154,7 +154,7 @@ This time, you should see an improvement in the benchmark. Below is an example t
154154
| Prefill Speed | 192.28 tok/s | 266.13 tok/s |
155155
| Decode Speed | 34.73 tok/s | 44.96 tok/s |
156156

157-
**Prefill speed** describes how fast the model processes the input prompt.
157+
**Prefill speed** describes how fast the model processes the input prompt.
158158

159159
**Decode Speed** indicates how quickly the model generates new tokens after the input is processed.
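To express the table as a relative speedup, divide the KleidiAI figure by the baseline figure. The sketch below does this with `awk`, using the example numbers from the table. The helper name `speedup` is made up for illustration, and results on other devices will differ.

```bash
# Print the ratio after/before, rounded to two decimal places.
speedup() {
    awk -v before="$1" -v after="$2" 'BEGIN { printf "%.2f\n", after / before }'
}

echo "prefill: $(speedup 192.28 266.13)x"
echo "decode:  $(speedup 34.73 44.96)x"
```

For the example figures, this gives roughly 1.38x for prefill and 1.29x for decode.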
160160

content/learning-paths/mobile-graphics-and-gaming/vision-llm-inference-on-android-with-kleidiai-and-mnn/2-generate-apk.md

Lines changed: 0 additions & 53 deletions
This file was deleted.
