@@ -104,24 +95,27 @@ Note: Exporting model flow can take 2.5 hours (114GB RAM for num_chunks=4) to co
Before continuing forward, make sure to modify the tokenizer, token embedding, and model paths in the examples/mediatek/executor_runner/run_llama3_sample.sh.
-
## Deploy Files on Device
+
## Populate Model Paths in Runner
-
### Prepare to Deploy
-
Prior to deploying the files on device, make sure to modify the tokenizer, token embedding, and model file names in examples/mediatek/executor_runner/run_llama3_sample.sh reflect what was generated during the Export Llama Model step.
+
### Populate Model Paths in Runner
+
The MediaTek runner (`examples/mediatek/executor_runner/mtk_llama_runner.cpp`) contains the logic for implementing the function calls that come from the Android app.
**Important**: Currently the model paths are set at the runner level. Modify the values in `examples/mediatek/executor_runner/llama_runner/llm_helper/include/llama_runner_values.h` to set the model paths, tokenizer path, embedding file path, and other metadata.
-
In addition, create a sample_prompt.txt file with a prompt. This will be deployed to the device in the next step.
You are a helpful AI assistant for travel tips and recommendations<|eot_id|><|start_header_id|>user<|end_header_id|>
+
## Build AAR Library
-
What can you help me with?<|eot_id|><|start_header_id|>assistant<|end_header_id|>
+
Next we need to build and compile the MediaTek backend and MediaTek Llama runner.
```
+
sh build/build_android_llm_demo.sh
+
```
+
+
**Output**: This generates a .aar file that is already imported into the expected directory for the Android app. It lives in `examples/demo-apps/android/Llamademo/app/libs`.
+
+
If you unzip the .aar file or open it in Android Studio, you can see that it contains the following libraries related to the MediaTek backend:
+
* libneuron_buffer_allocator.so
+
* libneuronusdk_adapter.mtk.so
+
* libneuron_backend.so (generated during build)
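
One way to check the contents without Android Studio is to list the archive from the command line, since an .aar is a zip archive. The following is a minimal sketch, assuming `unzip` and `awk` are installed. The .aar file name is not given above, so the script globs for any .aar in the directory, and the helper names `filter_so` and `list_native_libs` are illustrative only.

```shell
#!/bin/sh
# Directory taken from the Output note above; override with AAR_DIR if needed.
AAR_DIR="${AAR_DIR:-examples/demo-apps/android/Llamademo/app/libs}"

# Keep only archive entries whose name ends in .so.
# Reads an `unzip -l` listing on stdin; the entry name is the last column.
filter_so() {
  awk '{print $NF}' | grep '\.so$'
}

# $1: path to an .aar file (hypothetical helper, not part of the repo).
list_native_libs() {
  unzip -l "$1" | filter_so
}

found=0
for aar in "$AAR_DIR"/*.aar; do
  # Skip the unexpanded glob pattern when no .aar exists.
  [ -f "$aar" ] || continue
  found=1
  echo "== $aar"
  list_native_libs "$aar" || echo "(no .so entries found)"
done
[ "$found" -eq 1 ] || echo "no .aar found in $AAR_DIR; run the build step first"
```

If the build succeeded, the three libraries listed above should appear among the printed entries.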
### Deploy
First, make sure your Android phone’s chipset is compatible with this demo (MediaTek Dimensity 9300 (D9300) chip). Once the model, tokenizer, and runner have been generated, you can push them and the .so files to the device before running the runner via shell.
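
The push step can be sketched with `adb` as follows. This is an illustration under assumptions, not the tutorial's exact commands: the device directory `/data/local/tmp/llama` and the helper name `push_files` are hypothetical, and the files to push are passed as arguments because their names depend on what the export step generated.

```shell
#!/bin/sh
# Hypothetical target directory on the device; not specified in the text above.
DEVICE_DIR="${DEVICE_DIR:-/data/local/tmp/llama}"
# Override for a dry run, e.g. ADB="echo adb".
ADB="${ADB:-adb}"

# Create the target directory, then push each file given as an argument.
push_files() {
  $ADB shell mkdir -p "$DEVICE_DIR" || return 1
  for f in "$@"; do
    $ADB push "$f" "$DEVICE_DIR/" || return 1
  done
}

# Usage: sh push_to_device.sh <model files> <tokenizer> <runner> <.so files>
if [ "$#" -gt 0 ]; then
  push_files "$@"
fi
```

Running it with `ADB="echo adb"` prints the commands instead of executing them, which is a cheap way to confirm the file list before touching the device.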
3. Click the "Load Model" button. This will load the models from the Runner
+
## Reporting Issues
If you encounter any bugs or issues while following this tutorial, please file a bug/issue here on [GitHub](https://github.com/pytorch/executorch/issues/new).