Commit 0599351
wasi-nn: Add a new target for llama.cpp as a wasi-nn backend (#3709)
Minimum support:

- [x] Accept (WasmEdge) customized model parameters via metadata.
- [x] Target [wasmedge-ggml examples](https://github.com/second-state/WasmEdge-WASINN-examples/tree/master/wasmedge-ggml)
  - [x] basic
  - [x] chatml
  - [x] gemma
  - [x] llama
  - [x] qwen

---

In the future, to support if required:

- [ ] Target [wasmedge-ggml examples](https://github.com/second-state/WasmEdge-WASINN-examples/tree/master/wasmedge-ggml)
  - [ ] command-r (>70 GB memory requirement)
  - [ ] embedding (embedding mode)
  - [ ] grammar (use the grammar option to constrain the model to generate JSON output)
  - [ ] llama-stream (new APIs `compute_single`, `get_output_single`, `fini_single`)
  - [ ] llava (image representation)
  - [ ] llava-base64-stream (image representation)
  - [ ] multimodel (image representation)
- [ ] Target [llamaedge](https://github.com/LlamaEdge/LlamaEdge)
Parent commit: cb71ca5

11 files changed: +947 additions, −120 deletions

build-scripts/config_common.cmake

Lines changed: 7 additions & 1 deletion

@@ -442,7 +442,9 @@ if (WAMR_BUILD_WASI_NN EQUAL 1)
   message ("  WASI-NN enabled")
   add_definitions (-DWASM_ENABLE_WASI_NN=1)
   # Variant backends
-  if (NOT WAMR_BUILD_WASI_NN_TFLITE EQUAL 1 AND NOT WAMR_BUILD_WASI_NN_OPENVINO EQUAL 1)
+  if (NOT WAMR_BUILD_WASI_NN_TFLITE EQUAL 1 AND
+      NOT WAMR_BUILD_WASI_NN_OPENVINO EQUAL 1 AND
+      NOT WAMR_BUILD_WASI_NN_LLAMACPP EQUAL 1)
     message (FATAL_ERROR "  Need to select a backend for WASI-NN")
   endif ()
@@ -454,6 +456,10 @@ if (WAMR_BUILD_WASI_NN EQUAL 1)
     message ("  WASI-NN: backend openvino enabled")
     add_definitions (-DWASM_ENABLE_WASI_NN_OPENVINO)
   endif ()
+  if (WAMR_BUILD_WASI_NN_LLAMACPP EQUAL 1)
+    message ("  WASI-NN: backend llamacpp enabled")
+    add_definitions (-DWASM_ENABLE_WASI_NN_LLAMACPP)
+  endif ()
   # Variant devices
   if (WAMR_BUILD_WASI_NN_ENABLE_GPU EQUAL 1)
     message ("  WASI-NN: GPU enabled")

core/iwasm/libraries/wasi-nn/README.md

Lines changed: 22 additions & 18 deletions

@@ -4,7 +4,7 @@

 ### Host

-Enable WASI-NN in the WAMR by spefiying it in the cmake building configuration as follows,
+Enable WASI-NN in the WAMR by specifying it in the cmake building configuration as follows,

 ```cmake
 set (WAMR_BUILD_WASI_NN 1)
@@ -17,14 +17,15 @@ $ cmake -DWAMR_BUILD_WASI_NN=1 <other options> ...
 ```

 > [!Caution]
-> If enable `WAMR_BUID_WASI_NN`, iwasm will link a shared WAMR library instead of a static one. Wasi-nn backends will be loaded dynamically at runtime. Users shall specify the path of the backend library and register it to the iwasm runtime with `--native-lib=<path of backend library>`. All shared libraries should be placed in the `LD_LIBRARY_PATH`.
+> Enabling WAMR_BUILD_WASI_NN will cause iwasm to link to a shared WAMR library instead of a static one. The WASI-NN backends will then be loaded dynamically when the program is run. You must ensure that all shared libraries are included in the `LD_LIBRARY_PATH`.

 #### Compilation options

-- `WAMR_BUILD_WASI_NN`. enable wasi-nn support. can't work alone. need to identify a backend. Match legacy wasi-nn spec naming convention. use `wasi_nn` as import module names.
-- `WAMR_BUILD_WASI_EPHEMERAL_NN`. Match latest wasi-nn spec naming convention. use `wasi_ephemeral_nn` as import module names.
-- `WAMR_BUILD_WASI_NN_TFLITE`. identify the backend as TensorFlow Lite.
-- `WAMR_BUILD_WASI_NN_OPENVINO`. identify the backend as OpenVINO.
+- `WAMR_BUILD_WASI_NN`. This option enables support for WASI-NN. It cannot function independently and requires specifying a backend. It follows the original WASI-NN specification for naming conventions and uses `wasi_nn` for import module names.
+- `WAMR_BUILD_WASI_EPHEMERAL_NN`. This option adheres to the most recent WASI-NN specification for naming conventions and uses `wasi_ephemeral_nn` for import module names.
+- `WAMR_BUILD_WASI_NN_TFLITE`. This option designates TensorFlow Lite as the backend.
+- `WAMR_BUILD_WASI_NN_OPENVINO`. This option designates OpenVINO as the backend.
+- `WAMR_BUILD_WASI_NN_LLAMACPP`. This option designates Llama.cpp as the backend.

 ### Wasm

@@ -44,7 +45,7 @@ typedef enum { fp16 = 0, fp32, up8, ip32 } tensor_type;

 It is required to recompile the Wasm application if you want to switch between the two sets of functions.

-#### Openvino
+#### Openvino installation

 If you're planning to use OpenVINO backends, the first step is to install OpenVINO on your computer. To do this correctly, please follow the official installation guide which you can find at this link: https://docs.openvino.ai/2024/get-started/install-openvino/install-openvino-archive-linux.html.

@@ -162,17 +163,9 @@ Supported:

 ### Testing with WasmEdge-WASINN Examples

-To ensure everything is set up correctly, use the examples from [WasmEdge-WASINN-examples](https://github.com/second-state/WasmEdge-WASINN-examples/tree/master). These examples help verify that WASI-NN support in WAMR is functioning as expected.
+To make sure everything is configured properly, refer to the examples provided at [WasmEdge-WASINN-examples](https://github.com/second-state/WasmEdge-WASINN-examples/tree/master). These examples are useful for confirming that the WASI-NN support in WAMR is working correctly.

-> Note: The repository contains two types of examples. Some use the [standard wasi-nn](https://github.com/WebAssembly/wasi-nn), while others use [WasmEdge's version of wasi-nn](https://github.com/second-state/wasmedge-wasi-nn), which is enhanced to meet specific customer needs.
-
-The examples test the following machine learning backends:
-
-- OpenVINO
-- PyTorch
-- TensorFlow Lite
-
-Due to the different requirements of each backend, we'll use a Docker container for a hassle-free testing environment.
+Because each backend has its own set of requirements, we recommend using a Docker container to create a straightforward testing environment without complications.

 #### Prepare the execution environment

@@ -186,9 +179,20 @@ $ docker build -t wasi-nn-smoke:v1.0 -f ./core/iwasm/libraries/wasi-nn/test/Dock
 #### Execute

 ```bash
+$ pwd
+/workspaces/wasm-micro-runtime/
 $ docker run --rm wasi-nn-smoke:v1.0
 ```

-### Testing with bytecodealliance wasi-nn
+Note that the qwen example is selected as the default for the Llama.cpp backend because it uses a small model and is easy to run.
+
+```bash
+- openvino_mobile_image. PASS
+- openvino_mobile_raw. PASS
+- openvino_road_segmentation_adas. PASS
+- wasmedge_ggml_qwen. PASS
+```
+
+### Testing with bytecodealliance WASI-NN

 For another example, check out [classification-example](https://github.com/bytecodealliance/wasi-nn/tree/main/rust/examples/classification-example), which focuses on OpenVINO. You can run it using the same Docker container mentioned above.
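For orientation, the legacy wasi-nn call sequence these examples exercise looks roughly like the following guest-side C sketch. It is a minimal sketch only: it assumes `wasi_nn.h` from this library's `include/` directory declares `load_by_name`, `init_execution_context`, `set_input`, `compute`, and `get_output` with the shapes used below, and the model name, prompt, and buffer sizes are placeholders, not values from the commit.

```c
// Hypothetical guest-side sketch, built with a wasi-sdk toolchain.
// Assumes the legacy-spec declarations from wasi_nn.h; all literals
// (model file, prompt, buffer size) are placeholders.
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include "wasi_nn.h"

int
main(void)
{
    graph g;
    graph_execution_context ctx;
    const char *model = "model.gguf"; /* placeholder model file */
    char prompt[] = "Once upon a time";
    uint8_t output[4096] = { 0 };
    uint32_t output_size = sizeof(output);

    if (load_by_name(model, (uint32_t)strlen(model), &g) != success)
        return 1;
    if (init_execution_context(g, &ctx) != success)
        return 1;

    /* for a llama.cpp-style backend the input tensor carries raw prompt bytes */
    uint32_t dims[1] = { (uint32_t)strlen(prompt) };
    tensor_dimensions dimensions = { .buf = dims, .size = 1 };
    tensor input = { .dimensions = &dimensions, .type = up8,
                     .data = (uint8_t *)prompt };
    if (set_input(ctx, 0, &input) != success)
        return 1;

    if (compute(ctx) != success)
        return 1;
    if (get_output(ctx, 0, output, &output_size) != success)
        return 1;

    printf("%.*s\n", (int)output_size, output);
    return 0;
}
```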
core/iwasm/libraries/wasi-nn/cmake/Findcjson.cmake

Lines changed: 17 additions & 0 deletions

@@ -0,0 +1,17 @@
+# Copyright (C) 2019 Intel Corporation. All rights reserved.
+# SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+
+include(FetchContent)
+
+set(CJSON_SOURCE_DIR "${WAMR_ROOT_DIR}/core/deps/cjson")
+
+FetchContent_Declare(
+  cjson
+  GIT_REPOSITORY https://github.com/DaveGamble/cJSON.git
+  GIT_TAG v1.7.18
+  SOURCE_DIR ${CJSON_SOURCE_DIR}
+)
+
+set(ENABLE_CJSON_TEST OFF CACHE INTERNAL "Turn off tests")
+set(ENABLE_CJSON_UNINSTALL OFF CACHE INTERNAL "Turn off uninstall to avoid targets conflict")
+FetchContent_MakeAvailable(cjson)
core/iwasm/libraries/wasi-nn/cmake/Findllamacpp.cmake

Lines changed: 18 additions & 0 deletions

@@ -0,0 +1,18 @@
+# Copyright (C) 2019 Intel Corporation. All rights reserved.
+# SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+
+include(FetchContent)
+
+set(LLAMA_SOURCE_DIR "${WAMR_ROOT_DIR}/core/deps/llama.cpp")
+
+FetchContent_Declare(
+  llamacpp
+  GIT_REPOSITORY https://github.com/ggerganov/llama.cpp.git
+  GIT_TAG b3573
+  SOURCE_DIR ${LLAMA_SOURCE_DIR}
+)
+
+set(LLAMA_BUILD_TESTS OFF)
+set(LLAMA_BUILD_EXAMPLES OFF)
+set(LLAMA_BUILD_SERVER OFF)
+FetchContent_MakeAvailable(llamacpp)
core/iwasm/libraries/wasi-nn/cmake/Findtensorflow_lite.cmake

Lines changed: 18 additions & 40 deletions

@@ -1,47 +1,25 @@
 # Copyright (C) 2019 Intel Corporation. All rights reserved.
 # SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception

-find_library(TENSORFLOW_LITE
-  NAMES tensorflow-lite
-  HINTS ${CMAKE_CURRENT_BINARY_DIR}/tensorflow-lite
-  NO_DEFAULT_PATHS
+include(FetchContent)
+
+set(TFLITE_SOURCE_DIR "${WAMR_ROOT_DIR}/core/deps/tensorflow-src")
+
+FetchContent_Declare(
+  tensorflow_lite
+  GIT_REPOSITORY https://github.com/tensorflow/tensorflow.git
+  GIT_TAG v2.12.0
+  GIT_SHALLOW ON
+  GIT_PROGRESS ON
+  SOURCE_DIR ${TFLITE_SOURCE_DIR}
+  SOURCE_SUBDIR tensorflow/lite
 )

-if(NOT TENSORFLOW_LITE)
-  if(NOT EXISTS "${WAMR_ROOT_DIR}/core/deps/tensorflow-src")
-    execute_process(
-      COMMAND "${WAMR_ROOT_DIR}/core/deps/install_tensorflow.sh"
-      RESULT_VARIABLE TENSORFLOW_RESULT
-    )
-  else()
-    message("Tensorflow is already downloaded.")
-  endif()
-
-  set(TENSORFLOW_SOURCE_DIR "${WAMR_ROOT_DIR}/core/deps/tensorflow-src")
-
-  if(WAMR_BUILD_WASI_NN_ENABLE_GPU EQUAL 1)
-    # Tensorflow specific:
-    # * https://www.tensorflow.org/lite/guide/build_cmake#available_options_to_build_tensorflow_lite
-    set (TFLITE_ENABLE_GPU ON)
-  endif()
-
-  if (CMAKE_SIZEOF_VOID_P EQUAL 4)
-    set (TFLITE_ENABLE_XNNPACK OFF)
-  endif()
-
-  add_subdirectory(
-    "${TENSORFLOW_SOURCE_DIR}/tensorflow/lite"
-    "${CMAKE_CURRENT_BINARY_DIR}/tensorflow-lite"
-    EXCLUDE_FROM_ALL
-  )
-else ()
-  message(STATUS "TensorFlow Lite library found: ${TENSORFLOW_LITE}")
-  set(TENSORFLOW_SOURCE_DIR "${WAMR_ROOT_DIR}/core/deps/tensorflow-src")
+if(WAMR_BUILD_WASI_NN_ENABLE_GPU EQUAL 1)
+  set(TFLITE_ENABLE_GPU ON)
+endif()
+if (CMAKE_SIZEOF_VOID_P EQUAL 4)
+  set(TFLITE_ENABLE_XNNPACK OFF)
 endif()

-set(TENSORFLOW_LITE_INCLUDE_DIR "${TENSORFLOW_SOURCE_DIR}/tensorflow/lite")
-set(FLATBUFFER_INCLUDE_DIR "${CMAKE_CURRENT_BINARY_DIR}/flatbuffers/include")
-
-include_directories(${TENSORFLOW_SOURCE_DIR})
-include_directories(${FLATBUFFER_INCLUDE_DIR})
-link_directories(${CMAKE_CURRENT_BINARY_DIR}/tensorflow-lite)
+FetchContent_MakeAvailable(tensorflow_lite)

core/iwasm/libraries/wasi-nn/cmake/wasi_nn.cmake

Lines changed: 58 additions & 22 deletions

@@ -3,27 +3,6 @@

 list(APPEND CMAKE_MODULE_PATH ${CMAKE_CURRENT_LIST_DIR})

-if(WAMR_BUILD_WASI_NN_TFLITE EQUAL 1)
-  # Find tensorflow-lite
-  find_package(tensorflow_lite REQUIRED)
-endif()
-
-if(WAMR_BUILD_WASI_NN_OPENVINO EQUAL 1)
-  if(NOT DEFINED ENV{OpenVINO_DIR})
-    message(FATAL_ERROR
-      "OpenVINO_DIR is not defined. "
-      "Please follow https://docs.openvino.ai/2024/get-started/install-openvino.html,"
-      "install openvino, and set environment variable OpenVINO_DIR."
-      "Like OpenVINO_DIR=/usr/lib/openvino-2023.2/ cmake ..."
-      "Or OpenVINO_DIR=/opt/intel/openvino/ cmake ..."
-    )
-  endif()
-
-  list(APPEND CMAKE_MODULE_PATH $ENV{OpenVINO_DIR})
-  # Find OpenVINO
-  find_package(OpenVINO REQUIRED COMPONENTS Runtime)
-endif()
-
 #
 # wasi-nn general
 set(WASI_NN_ROOT ${CMAKE_CURRENT_LIST_DIR}/..)
@@ -42,22 +21,46 @@ add_compile_definitions(
 #
 # - tflite
 if(WAMR_BUILD_WASI_NN_TFLITE EQUAL 1)
+  find_package(tensorflow_lite REQUIRED)
+
   add_library(
     wasi_nn_tflite
     SHARED
     ${WASI_NN_ROOT}/src/wasi_nn_tensorflowlite.cpp
   )

+  target_include_directories(
+    wasi_nn_tflite
+    PUBLIC
+    ${tensorflow_lite_SOURCE_DIR}
+  )
+
   target_link_libraries(
     wasi_nn_tflite
     PUBLIC
     libiwasm
     tensorflow-lite
   )
+
+  install(TARGETS wasi_nn_tflite DESTINATION lib)
 endif()

 # - openvino
 if(WAMR_BUILD_WASI_NN_OPENVINO EQUAL 1)
+  if(NOT DEFINED ENV{OpenVINO_DIR})
+    message(FATAL_ERROR
+      "OpenVINO_DIR is not defined. "
+      "Please follow https://docs.openvino.ai/2024/get-started/install-openvino.html,"
+      "install openvino, and set environment variable OpenVINO_DIR."
+      "Like OpenVINO_DIR=/usr/lib/openvino-2023.2/ cmake ..."
+      "Or OpenVINO_DIR=/opt/intel/openvino/ cmake ..."
+    )
+  endif()
+
+  list(APPEND CMAKE_MODULE_PATH $ENV{OpenVINO_DIR})
+  # Find OpenVINO
+  find_package(OpenVINO REQUIRED COMPONENTS Runtime)
+
   add_library(
     wasi_nn_openvino
     SHARED
@@ -71,4 +74,37 @@ if(WAMR_BUILD_WASI_NN_OPENVINO EQUAL 1)
     openvino::runtime
     openvino::runtime::c
   )
-endif()
+
+  install(TARGETS wasi_nn_openvino DESTINATION lib)
+endif()
+
+# - llamacpp
+
+if(WAMR_BUILD_WASI_NN_LLAMACPP EQUAL 1)
+  find_package(cjson REQUIRED)
+  find_package(llamacpp REQUIRED)
+
+  add_library(
+    wasi_nn_llamacpp
+    SHARED
+    ${WASI_NN_ROOT}/src/wasi_nn_llamacpp.c
+  )
+
+  target_include_directories(
+    wasi_nn_llamacpp
+    PUBLIC
+    ${cjson_SOURCE_DIR}
+  )
+
+  target_link_libraries(
+    wasi_nn_llamacpp
+    PUBLIC
+    libiwasm
+    cjson
+    common
+    ggml
+    llama
+  )
+
+  install(TARGETS wasi_nn_llamacpp DESTINATION lib)
+endif()
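The new `wasi_nn_llamacpp` target compiles `src/wasi_nn_llamacpp.c` (not shown in this view) against the pinned llama.cpp (tag b3573) and cJSON. As rough orientation only, a load path built on the llama.cpp C API at that tag could look like the sketch below; the `LlamaContext` struct and the function name are invented for illustration and are not the commit's actual symbols.

```c
// Illustrative sketch only; the real implementation lives in
// core/iwasm/libraries/wasi-nn/src/wasi_nn_llamacpp.c (not shown here).
// Uses the llama.cpp C API as of the pinned tag b3573.
#include "llama.h"
#include "wasi_nn_types.h"

struct LlamaContext {            /* hypothetical per-backend state */
    struct llama_model *model;
    struct llama_context *ctx;
};

static wasi_nn_error
llama_backend_load_by_name(void *ctx, const char *filename,
                           uint32_t filename_len, graph *g)
{
    struct LlamaContext *backend_ctx = (struct LlamaContext *)ctx;

    /* real code must copy filename_len bytes and NUL-terminate;
       skipped here for brevity */
    struct llama_model_params model_params = llama_model_default_params();
    backend_ctx->model = llama_load_model_from_file(filename, model_params);
    if (!backend_ctx->model)
        return model_not_found; /* one of the new WasmEdge-compatible codes */

    struct llama_context_params ctx_params = llama_context_default_params();
    backend_ctx->ctx =
        llama_new_context_with_model(backend_ctx->model, ctx_params);
    if (!backend_ctx->ctx)
        return runtime_error;

    *g = 0; /* single-graph backend: hand back a fixed handle */
    return success;
}
```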

core/iwasm/libraries/wasi-nn/include/wasi_nn_types.h

Lines changed: 9 additions & 0 deletions

@@ -43,6 +43,11 @@ typedef enum {
     security,
     // The operation failed for an unspecified reason.
     unknown,
+    // for WasmEdge-wasi-nn
+    end_of_sequence = 100,  // End of Sequence Found.
+    context_full = 101,     // Context Full.
+    prompt_tool_long = 102, // Prompt Too Long.
+    model_not_found = 103,  // Model Not Found.
 } wasi_nn_error;

 /**
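The four new codes start at 100 to stay clear of the standard wasi-nn errors, and they mirror WasmEdge's wasi-nn extension so that guests written against WasmEdge can check the same values. A hedged sketch of how a llama.cpp-based compute loop might surface two of them; `is_eos_token` and the batch plumbing are placeholders, not code from the commit:

```c
// Hypothetical mapping inside a llama.cpp-based backend's compute loop.
// is_eos_token() is a placeholder (e.g. a comparison against the model's
// end-of-sequence token); error identifiers come from the enum above.
#include <stdbool.h>
#include "llama.h"
#include "wasi_nn_types.h"

static bool is_eos_token(const struct llama_model *model, llama_token tok);

static wasi_nn_error
decode_step(struct llama_context *lctx, const struct llama_model *model,
            struct llama_batch batch, llama_token last_tok)
{
    if (is_eos_token(model, last_tok))
        return end_of_sequence; /* 100: End of Sequence Found */

    int32_t rc = llama_decode(lctx, batch);
    if (rc == 1)
        return context_full;    /* 101: Context Full (no free KV-cache slot) */
    if (rc != 0)
        return runtime_error;   /* standard wasi-nn code for hard failures */

    return success;
}
```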
@@ -140,6 +145,9 @@ typedef uint32_t graph_execution_context;
 typedef wasi_nn_error (*LOAD)(void *, graph_builder_array *, graph_encoding,
                               execution_target, graph *);
 typedef wasi_nn_error (*LOAD_BY_NAME)(void *, const char *, uint32_t, graph *);
+typedef wasi_nn_error (*LOAD_BY_NAME_WITH_CONFIG)(void *, const char *,
+                                                  uint32_t, void *, uint32_t,
+                                                  graph *);
 typedef wasi_nn_error (*INIT_EXECUTION_CONTEXT)(void *, graph,
                                                 graph_execution_context *);
 typedef wasi_nn_error (*SET_INPUT)(void *, graph_execution_context, uint32_t,
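`LOAD_BY_NAME_WITH_CONFIG` is the hook behind the commit's "customized model parameters (metadata)" item: the guest passes a JSON blob alongside the model name. Below is a sketch of parsing that blob with the cJSON dependency introduced above; the keys `ctx-size` and `n-gpu-layers` are assumptions borrowed from the wasmedge-ggml examples, not necessarily the exact set this backend accepts.

```c
// Hedged sketch of a LOAD_BY_NAME_WITH_CONFIG implementation parsing the
// JSON config with cJSON (pinned at v1.7.18 by Findcjson.cmake above).
// The key names are assumptions, not confirmed by this diff.
#include "cJSON.h"
#include "wasi_nn_types.h"

static wasi_nn_error
llama_backend_load_by_name_with_config(void *ctx, const char *name,
                                       uint32_t name_len, void *config,
                                       uint32_t config_len, graph *g)
{
    uint32_t ctx_size = 512; /* defaults, overridden by metadata */
    int32_t n_gpu_layers = 0;

    cJSON *root = cJSON_ParseWithLength((const char *)config, config_len);
    if (root) {
        cJSON *item = cJSON_GetObjectItemCaseSensitive(root, "ctx-size");
        if (cJSON_IsNumber(item))
            ctx_size = (uint32_t)item->valueint;
        item = cJSON_GetObjectItemCaseSensitive(root, "n-gpu-layers");
        if (cJSON_IsNumber(item))
            n_gpu_layers = (int32_t)item->valueint;
        cJSON_Delete(root);
    }

    /* ... then load the model as in plain load_by_name, applying ctx_size
       and n_gpu_layers to the llama.cpp parameter structs ... */
    (void)name;
    (void)name_len;
    (void)ctx_size;
    (void)n_gpu_layers;
    *g = 0;
    return success;
}
```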
@@ -154,6 +162,7 @@ typedef wasi_nn_error (*BACKEND_DEINITIALIZE)(void *);
 typedef struct {
     LOAD load;
     LOAD_BY_NAME load_by_name;
+    LOAD_BY_NAME_WITH_CONFIG load_by_name_with_config;
     INIT_EXECUTION_CONTEXT init_execution_context;
     SET_INPUT set_input;
     COMPUTE compute;
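A backend shared library fills in one of these tables, which is how iwasm dispatches wasi-nn calls after loading the backend with `--native-lib`. Assuming the struct is typedef'd as `api_function` (the name is not visible in this diff), wiring up the hypothetical functions from the sketches above might look like this:

```c
// Hypothetical dispatch-table wiring; the referenced functions are the
// illustrative sketches above plus stubs, not the commit's actual symbols.
#include "wasi_nn_types.h"

static wasi_nn_error llama_backend_load_by_name(void *, const char *,
                                                uint32_t, graph *);
static wasi_nn_error llama_backend_load_by_name_with_config(void *,
                                                            const char *,
                                                            uint32_t, void *,
                                                            uint32_t, graph *);
static wasi_nn_error llama_backend_init_execution_context(
    void *, graph, graph_execution_context *);
static wasi_nn_error llama_backend_set_input(void *, graph_execution_context,
                                             uint32_t, tensor *);
static wasi_nn_error llama_backend_compute(void *, graph_execution_context);

static api_function llamacpp_apis = {
    .load = NULL, /* llama.cpp loads models from files, so only by-name paths */
    .load_by_name = llama_backend_load_by_name,
    .load_by_name_with_config = llama_backend_load_by_name_with_config,
    .init_execution_context = llama_backend_init_execution_context,
    .set_input = llama_backend_set_input,
    .compute = llama_backend_compute,
    /* get_output and the backend init/deinit hooks omitted for brevity */
};
```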
