
Commit 3f09fe8

Merge branch 'main' into dev1/winskuo/eurobert
2 parents 192d30b + be8ffd1

11 files changed (+282, −111 lines)

.github/workflows/android-perf.yml

Lines changed: 2 additions & 2 deletions
```diff
@@ -342,8 +342,8 @@ jobs:
   git clone https://github.com/huggingface/optimum-executorch
   pushd optimum-executorch
   # There is no release yet, for CI stability, always test from the same commit on main
-  git checkout 1c653dc49812fc431a22312c7295d97005d22e12
-  python install_dev.py
+  git checkout 4c3b18f6cca68c5ccff809131d570062723d7188
+  python install_dev.py --skip_override_torch
   pip list

   ARGS=(
```

.github/workflows/apple-perf.yml

Lines changed: 2 additions & 2 deletions
```diff
@@ -347,8 +347,8 @@ jobs:
   git clone https://github.com/huggingface/optimum-executorch
   pushd optimum-executorch
   # There is no release yet, for CI stability, always test from the same commit on main
-  git checkout 1c653dc49812fc431a22312c7295d97005d22e12
-  ${CONDA_RUN} python install_dev.py
+  git checkout 4c3b18f6cca68c5ccff809131d570062723d7188
+  ${CONDA_RUN} python install_dev.py --skip_override_torch
   pip list

   ARGS=(
```

.github/workflows/trunk.yml

Lines changed: 2 additions & 3 deletions
```diff
@@ -597,9 +597,8 @@ jobs:
   git clone https://github.com/huggingface/optimum-executorch
   pushd optimum-executorch
   # There is no release yet, for CI stability, always test from the same commit on main
-  git checkout 1c653dc49812fc431a22312c7295d97005d22e12
-  pip install .[tests]
-  pip install transformers==4.52.4
+  git checkout 4c3b18f6cca68c5ccff809131d570062723d7188
+  python install_dev.py --skip_override_torch
   popd
   pip list
   echo "::endgroup::"
```

backends/mediatek/README.md

Lines changed: 15 additions & 25 deletions
````diff
@@ -14,23 +14,11 @@ The examples provided in this repository are tested and supported on the followi
 
 Before you begin, ensure you have the following prerequisites installed and configured:
 
-#### 1. Buck2 Build Tool
-
-- **Download Buck2**: Obtain Buck2 from the official [releases page](https://github.com/facebook/buck2/releases/tag/2024-02-01).
-- **Add to PATH**: Extract the downloaded file and add the directory to your system's `$PATH` environment variable.
-   ```bash
-   export PATH=<path_to_buck>:$PATH
-   ```
-
-#### 2. Android NDK
+#### 1. Android NDK
 
 - **Download Android NDK**: Acquire the Android NDK version 26.3.11579264 from the [Android developer site](https://developer.android.com/ndk/downloads).
-- **Set NDK Path**: Ensure that the `$ANDROID_NDK` environment variable is set to the path where the NDK is located.
-   ```bash
-   export ANDROID_NDK=<path_to_android_ndk>
-   ```
 
-#### 3. MediaTek ExecuTorch Libraries
+#### 2. MediaTek ExecuTorch Libraries
 
 To get started with MediaTek's ExecuTorch libraries, download the [NeuroPilot Express SDK](https://neuropilot.mediatek.com/resources/public/npexpress/en/docs/npexpress) from MediaTek's NeuroPilot portal. The SDK includes the following components:
 
@@ -60,26 +48,28 @@ Follow the steps below to setup your build environment:
    pip3 install mtk_neuron-8.2.19-py3-none-linux_x86_64.whl
   pip3 install mtk_converter-8.13.0+public-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
   ```
-- Set evironment variables for building backend
-   ```bash
-   export NEURON_BUFFER_ALLOCATOR_LIB=<path_to_buffer_allocator>
-   ```
 
 ### Build
-1. Navigate to `scripts/` directory.
-
-2. **Build MediaTek Backend**: Once the prerequisites are in place, run the `mtk_build.sh` script to start the build process, MediaTek backend will be built under `cmake-android-out/backends/` as `libneuron_backend.so`
+1. Copy `NeuronAdapter.h` to `backends/mediatek/runtime/include/api/`
 
+2. Set NDK Path: Ensure that the `$ANDROID_NDK` environment variable is set to the path where the NDK is located.
    ```bash
-   ./mtk_build.sh
+   export ANDROID_NDK=<path_to_android_ndk>
   ```
 
-### Run
+3. Build the backend library `libneuron_backend.so`:
+   ```bash
+   cd backends/mediatek/scripts/
+   ./mtk_build.sh
+   ```
+   The output is `libneuron_backend.so` in `cmake-android-out/backends/mediatek/`.
 
-1. **Push MediaTek universal SDK and MediaTek backend to the device**: push `libneuronusdk_adapter.mtk.so` and `libneuron_backend.so` to the phone and export it to the `$LD_LIBRARY_PATH` environment variable before executing ExecuTorch with MediaTek backend.
+### Run
 
+1. Push `libneuron_backend.so`, `libneuronusdk_adapter.mtk.so` and `libneuron_buffer_allocator.so` to the device.
+2. Set the library path before running ExecuTorch:
    ```bash
-   export LD_LIBRARY_PATH=<path_to_usdk>:<path_to_neuron_backend>:$LD_LIBRARY_PATH
+   export LD_LIBRARY_PATH=<path_to_neuron_backend>:<path_to_usdk>:<path_to_buffer_allocator>:$LD_LIBRARY_PATH
   ```
 
 Please refer to `executorch/examples/mediatek/` for export and execution examples of various of models.
````
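To make the updated Run steps concrete, here is a minimal deployment sketch using `adb`. The on-device directory, the runner binary name (`mtk_executor_runner`), its `--model_path` flag, and the `.pte` file name are illustrative assumptions, not part of this commit; only the three library names come from the diff above.

```bash
# Illustrative only: push the backend libraries, a runner, and a model,
# then execute with the device-side library path set (names are placeholders).
adb push libneuron_backend.so libneuronusdk_adapter.mtk.so libneuron_buffer_allocator.so /data/local/tmp/
adb push mtk_executor_runner mobilenetv3.pte /data/local/tmp/
adb shell 'cd /data/local/tmp && \
  LD_LIBRARY_PATH=/data/local/tmp:$LD_LIBRARY_PATH \
  ./mtk_executor_runner --model_path mobilenetv3.pte'
```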

devtools/inspector/_inspector_utils.py

Lines changed: 31 additions & 0 deletions
```diff
@@ -690,3 +690,34 @@ def map_runtime_aot_intermediate_outputs(
         )
 
     return aot_runtime_mapping
+
+
+def convert_to_float_tensor(input_data: Any) -> torch.Tensor:
+    """
+    Convert input_data into a torch.Tensor on CPU with dtype torch.float64.
+    This function handles the following types of input:
+    - Scalar (int or float): Converts to a tensor with a single element.
+    - Tensor: Converts to a float64 tensor on CPU.
+    - List of Tensors: Stacks the tensors into a single float64 tensor on CPU.
+    The resulting tensor is detached, moved to CPU, and cast to torch.float64.
+    Parameters:
+        input_data (Any): The input data to be converted to a tensor. It can be a scalar,
+            a tensor, or a list of tensors.
+    Returns:
+        torch.Tensor: A tensor on CPU with dtype torch.float64.
+    Raises:
+        ValueError: If the input_data cannot be converted to a tensor.
+    """
+    try:
+        # Check if the input is a list of tensors
+        if isinstance(input_data, list):
+            input_tensor = torch.stack([convert_to_float_tensor(a) for a in input_data])
+        # Try to convert the input to a tensor
+        else:
+            input_tensor = torch.as_tensor(input_data, dtype=torch.float64)
+    except Exception as e:
+        raise ValueError(
+            f"Cannot convert value of type {type(input_data)} to a tensor: {e}"
+        )
+    input_tensor = input_tensor.detach().cpu().double()
+    return input_tensor
```
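A quick usage sketch of the new helper follows; the import path is inferred from the file location `devtools/inspector/_inspector_utils.py` and may differ in packaged builds.

```python
import torch

# Import path inferred from the file location; an assumption, not from the commit.
from executorch.devtools.inspector._inspector_utils import convert_to_float_tensor

# Scalar -> zero-dim float64 tensor on CPU
print(convert_to_float_tensor(5))  # tensor(5., dtype=torch.float64)

# Integer tensor -> detached float64 copy on CPU
print(convert_to_float_tensor(torch.tensor([4, 5, 6], dtype=torch.int32)))

# List of tensors -> stacked along a new leading dimension
batch = [torch.tensor([1, 2]), torch.tensor([2, 3])]
print(convert_to_float_tensor(batch).shape)  # torch.Size([2, 2])
```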

devtools/inspector/tests/inspector_utils_test.py

Lines changed: 47 additions & 0 deletions
```diff
@@ -29,6 +29,7 @@
     calculate_mse,
     calculate_snr,
     calculate_time_scale_factor,
+    convert_to_float_tensor,
     create_debug_handle_to_op_node_mapping,
     EDGE_DIALECT_GRAPH_KEY,
     find_populated_event,
@@ -317,6 +318,52 @@ def test_map_runtime_aot_intermediate_outputs_complex_chain(self):
         expected = {((1, 2, 3, 4, 5, 6), 300): ((2, 3, 4, 5, 6, 7), 350)}
         self.assertEqual(actual, expected)
 
+    def test_convert_input_to_tensor_convertible_inputs(self):
+        # Scalar -> tensor
+        actual_output1 = convert_to_float_tensor(5)
+        self.assertIsInstance(actual_output1, torch.Tensor)
+        self.assertEqual(actual_output1.dtype, torch.float64)
+        self.assertEqual(tuple(actual_output1.shape), ())
+        self.assertTrue(
+            torch.allclose(actual_output1, torch.tensor([5.0], dtype=torch.float64))
+        )
+        self.assertEqual(actual_output1.device.type, "cpu")
+
+        # Tensor of ints -> float64 CPU
+        t_int = torch.tensor([4, 5, 6], dtype=torch.int32)
+        actual_output2 = convert_to_float_tensor(t_int)
+        self.assertIsInstance(actual_output2, torch.Tensor)
+        self.assertEqual(actual_output2.dtype, torch.float64)
+        self.assertTrue(
+            torch.allclose(
+                actual_output2, torch.tensor([4.0, 5.0, 6.0], dtype=torch.float64)
+            )
+        )
+        self.assertEqual(actual_output2.device.type, "cpu")
+
+        # List of tensors -> stacked float64 CPU tensor
+        t_list = [torch.tensor([1, 2]), torch.tensor([2, 3]), torch.tensor([3, 4])]
+        actual_output3 = convert_to_float_tensor(t_list)
+        self.assertIsInstance(actual_output3, torch.Tensor)
+        self.assertEqual(actual_output3.dtype, torch.float64)
+        self.assertEqual(tuple(actual_output3.shape), (3, 2))
+        self.assertTrue(
+            torch.allclose(
+                actual_output3,
+                torch.tensor([[1.0, 2.0], [2.0, 3.0], [3.0, 4.0]], dtype=torch.float64),
+            )
+        )
+        self.assertEqual(actual_output3.device.type, "cpu")
+
+    def test_convert_input_to_tensor_non_convertible_raises(self):
+        class X:
+            pass
+
+        with self.assertRaises(ValueError) as cm:
+            convert_to_float_tensor(X())
+        msg = str(cm.exception)
+        self.assertIn("Cannot convert value of type", msg)
+
 
 def gen_mock_operator_graph_with_expected_map() -> (
     Tuple[OperatorGraph, Dict[int, OperatorNode]]
```

docs/source/backends-mediatek.md

Lines changed: 49 additions & 65 deletions
````diff
@@ -1,95 +1,79 @@
 # MediaTek Backend
 
-MediaTek backend empowers ExecuTorch to speed up PyTorch models on edge devices that equips with MediaTek Neuron Processing Unit (NPU). This document offers a step-by-step guide to set up the build environment for the MediaTek ExecuTorch libraries.
-
-::::{grid} 2
-:::{grid-item-card} What you will learn in this tutorial:
-:class-card: card-prerequisites
-* How to export and lower a PyTorch model ahead of time with ExecuTorch for MediaTek devices.
-* How to build MediaTek backend and examples.
-* How to deploy the exported models on device with ExecuTorch runtime.
-:::
-:::{grid-item-card} Tutorials we recommend you complete before this:
-:class-card: card-prerequisites
-* [Introduction to ExecuTorch](intro-how-it-works.md)
-* [Getting Started](getting-started.md)
-* [Building ExecuTorch with CMake](using-executorch-building-from-source.md)
-:::
-::::
-
-
-## Prerequisites (Hardware and Software)
-
-### Host OS
-- Linux operating system
-
-### Supported Chips:
-- MediaTek Dimensity 9300 (D9300)
-- MediaTek Dimensity 9400 (D9400)
+The MediaTek backend enables acceleration of PyTorch models on edge devices with MediaTek Neuron Processing Units (NPUs). This backend provides tools for exporting, building, and deploying models to leverage MediaTek hardware.
 
-### Software:
+## Features
 
-- [NeuroPilot Express SDK](https://neuropilot.mediatek.com/resources/public/npexpress/en/docs/npexpress) is a lightweight SDK for deploying AI applications on MediaTek SOC devices.
+- Acceleration of PyTorch models on MediaTek NPUs
+- Tools for model export and lowering
+- Example scripts for model deployment and execution
 
-## Setting up your developer environment
+## Target Requirements
 
-Follow the steps below to setup your build environment:
+- **Hardware:** MediaTek Dimensity 9300 (D9300), Dimensity 9400 (D9400)
+- **Host OS:** Linux
+- **SDK:** [NeuroPilot Express SDK](https://neuropilot.mediatek.com/resources/public/npexpress/en/docs/npexpress)
 
-1. **Setup ExecuTorch Environment**: Refer to the [Getting Started](getting-started.md) guide for detailed instructions on setting up the ExecuTorch environment.
+## Development Requirements
 
-2. **Setup MediaTek Backend Environment**
-```bash
-pip3 install -r requirements.txt
-```
-- Install the two .whl downloaded from NeuroPilot Portal
-```bash
-pip3 install mtk_neuron-8.2.19-py3-none-linux_x86_64.whl
-pip3 install mtk_converter-8.13.0+public-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
-```
-- Set evironment variables for building backend
-```bash
-export NEURON_BUFFER_ALLOCATOR_LIB=<path_to_buffer_allocator>
-```
-Additionally, make sure to copy `NeuronAdapter.h` to the following directory: `backends/mediatek/runtime/include/api/`.
+- Linux operating system
+- Python dependencies:
+  ```bash
+  pip3 install -r requirements.txt
+  ```
+- NeuroPilot SDK Python wheels (download from [NeuroPilot Express SDK](https://neuropilot.mediatek.com/resources/public/npexpress/en/docs/npexpress)):
+  ```bash
+  pip3 install mtk_neuron-8.2.19-py3-none-linux_x86_64.whl
+  pip3 install mtk_converter-8.13.0+public-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
+  ```
 
-## Build
+## Using the MediaTek Backend
 
-### Ahead of time:
+### Exporting and Lowering a Model
 
-**Exporting a PyTorch Model for MediaTek Backend**:
-1. Lower and export the `.pte` file for on-device execution. The export script samples are povided under `example/mediatek/`. For example, the following commnad exports the `.pte` using the scripts provided.
+To export and lower a model for the MediaTek backend, use the provided shell script:
 ```bash
 cd executorch
-
 ./examples/mediatek/shell_scripts/export_oss.sh mobilenetv3
 ```
+The exported `.pte` file is saved in a directory named after the model.
 
-2. Find the `.pte` files under the directory named as same as the model.
+### Partitioner API
 
-### Runtime:
+A list of CompileSpecs is supported by the MediaTek backend:
+- `platform-config`: Specifies the targeted MediaTek platform name to compile for.
 
-**Build MediaTek Backend for ExecuTorch Runtime**
-1. Navigate to `backends/mediatek/scripts/` directory.
+## Runtime Integration
 
-2. **Build MediaTek Backend**: Once the prerequisites are in place, run the `mtk_build.sh` script to start the build process:
-```bash
-./mtk_build.sh
-```
+This section presents an example of exporting and deploying a model. Please refer to `executorch/examples/mediatek/` for export and execution examples of various models.
 
-3. MediaTek backend will be built under `cmake-android-out/backends/` as `libneuron_backend.so`.
+### Building Example Runners
 
-**Build a runner to execute the model on the device**:
-1. Build the runners and the backend by exedcuting the script:
+Build example runners:
 ```bash
 ./mtk_build_examples.sh
 ```
+Runners are located in `cmake-android-out/examples/mediatek/`.
 
-2. The runners will be built under `cmake-android-out/examples/`
+### Deploying to Device
 
-## Deploying and running on a device
+1. Push `libneuron_backend.so`, `libneuronusdk_adapter.mtk.so` and `libneuron_buffer_allocator.so` to the device.
+2. Set the library path before running ExecuTorch:
+   ```bash
+   export LD_LIBRARY_PATH=<path_to_neuron_backend>:<path_to_usdk>:<path_to_buffer_allocator>:$LD_LIBRARY_PATH
+   ```
 
-1. **Push MediaTek universal SDK and MediaTek backend to the device**: push `libneuronusdk_adapter.mtk.so` and `libneuron_backend.so` to the phone and export it to the `$LD_LIBRARY_PATH` environment variable before executing ExecuTorch with MediaTek backend.
+### Building the Backend from Source
+1. Copy `NeuronAdapter.h` to `backends/mediatek/runtime/include/api/`
 
+2. Set NDK Path: Ensure that the `$ANDROID_NDK` environment variable is set to the path where the NDK is located.
 ```bash
-export LD_LIBRARY_PATH=<path_to_usdk>:<path_to_neuron_backend>:$LD_LIBRARY_PATH
+export ANDROID_NDK=<path_to_android_ndk>
 ```
+
+3. Build the backend library `libneuron_backend.so`:
+   ```bash
+   cd backends/mediatek/scripts/
+   ./mtk_build.sh
+   ```
+   The output is `libneuron_backend.so` in `cmake-android-out/backends/mediatek/`.
````
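The new Partitioner API section above only names the `platform-config` CompileSpec, so a sketch of how such a spec might flow into an export may help. `CompileSpec` and the `to_edge`/`to_backend`/`to_executorch` flow are standard ExecuTorch APIs; the `NeuropilotPartitioner` import, the `mt6989` platform string, and `MyModel` are assumptions drawn from the example scripts, not definitions made by this commit.

```python
# Hypothetical export flow; partitioner class name and platform string are assumptions.
import torch
from executorch.backends.mediatek import NeuropilotPartitioner  # assumed import
from executorch.exir import to_edge
from executorch.exir.backend.compile_spec_schema import CompileSpec

model = MyModel().eval()  # placeholder nn.Module
example_inputs = (torch.randn(1, 3, 224, 224),)

# `platform-config` selects the targeted MediaTek platform (value is illustrative).
specs = [CompileSpec("platform-config", b"mt6989")]

edge = to_edge(torch.export.export(model, example_inputs))
edge = edge.to_backend(NeuropilotPartitioner(specs))  # delegate supported subgraphs
with open("model.pte", "wb") as f:
    f.write(edge.to_executorch().buffer)
```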

extension/llm/runner/irunner.h

Lines changed: 17 additions & 0 deletions
```diff
@@ -121,6 +121,23 @@ class ET_EXPERIMENTAL IRunner {
       std::function<void(const std::string&)> token_callback,
       std::function<void(const Stats&)> stats_callback) = 0;
 
+  /**
+   * Generate text based on the provided prompt and generation config, from a
+   * given position in KV cache.
+   *
+   * @param prompt The input prompt to generate from
+   * @param start_pos The starting position in KV cache of the input
+   * @param config Generation configuration parameters
+   * @param token_callback Callback function called for each generated token
+   * @param stats_callback Callback function for generation statistics
+   * @return Error::Ok if successful, an error otherwise
+   */
+  virtual runtime::Error generate_from_pos(
+      const std::string& prompt,
+      int64_t start_pos,
+      const GenerationConfig& config,
+      std::function<void(const std::string&)> token_callback,
+      std::function<void(const Stats&)> stats_callback) = 0;
   /**
    * Stop the generation process.
    */
```
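A caller-side sketch of the new pure virtual method follows. Note that every concrete `IRunner` implementation must now override `generate_from_pos`. The `executorch::extension::llm` namespace and the default-constructed `GenerationConfig` are assumptions based on the surrounding header, not guarantees from this commit.

```cpp
// Illustrative caller, not part of the commit: resume generation from a
// position that is already populated in the KV cache.
#include <iostream>
#include <string>

using executorch::extension::llm::GenerationConfig;  // assumed namespace
using executorch::extension::llm::IRunner;
using executorch::extension::llm::Stats;
using executorch::runtime::Error;

Error continue_generation(IRunner& runner, int64_t cached_len) {
  GenerationConfig config;  // defaults; real callers would tune its fields
  return runner.generate_from_pos(
      "Tell me more.",
      /*start_pos=*/cached_len,  // first free slot in the KV cache
      config,
      [](const std::string& token) { std::cout << token << std::flush; },
      [](const Stats&) { /* latency/throughput stats arrive here */ });
}
```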
