# `wasi-nn`

A proposed [WebAssembly System Interface](https://github.com/WebAssembly/WASI) API for machine
learning (ML), also known as neural networks.

### Current Phase

`wasi-nn` is currently in [Phase 2].

[Phase 2]: https://github.com/WebAssembly/WASI/blob/42fe2a3ca159011b23099c3d10b5b1d9aff2140e/docs/Proposals.md#phase-2---proposed-spec-text-available-cg--wg

### Champions

- Andrew Brown
- Mingqiu Sun

### Phase 4 Advancement Criteria

`wasi-nn` must have at least two complete independent implementations.

## Table of Contents

- [Introduction](#introduction)
- [Goals](#goals)
- [Non-goals](#non-goals)
- [API walk-through](#api-walk-through)
- [Detailed design discussion](#detailed-design-discussion)
  - [Should `wasi-nn` support training models?](#should-wasi-nn-support-training-models)
  - [Should `wasi-nn` support inspecting models?](#should-wasi-nn-support-inspecting-models)
- [Considered alternatives](#considered-alternatives)
- [Stakeholder Interest & Feedback](#stakeholder-interest--feedback)
- [References & acknowledgements](#references--acknowledgements)

### Introduction

`wasi-nn` is a WASI API for performing ML inference. ML models are typically trained
using a large data set, resulting in one or more files that describe the model's weights. The model
is then used to compute an "inference," e.g., the probabilities of classifying an image as a set of
tags. This API is concerned initially with inference, not training.

Why expose ML inference as a WASI API? Though inference functionality can be compiled to
WebAssembly, there are two primary motivations for `wasi-nn`:

1. __ease of use__: an entire ecosystem already exists to train and use models (e.g., TensorFlow,
   ONNX, OpenVINO, etc.); `wasi-nn` is designed to make it easy to use existing model formats as-is.
2. __performance__: the nature of ML inference makes it amenable to hardware acceleration of various
   kinds; without this acceleration, inference can suffer slowdowns of several hundred times.
   Hardware acceleration for ML is very diverse — SIMD (e.g., AVX-512), GPUs, TPUs, FPGAs — and it
   is unlikely that all of these could be supported natively in WebAssembly.

WebAssembly programs that want to use a host's ML capabilities can access them through
`wasi-nn`'s core abstractions: _backends_, _graphs_, and _tensors_. A user selects a _backend_ for
inference and loads a model, instantiated as a _graph_, to use in that _backend_. Then, the user
passes _tensor_ inputs to the _graph_, computes the inference, and retrieves the _tensor_ outputs.

`wasi-nn` _backends_ correspond to existing ML frameworks, e.g., TensorFlow, ONNX, OpenVINO, etc.
`wasi-nn` places no requirements on hosts to support specific _backends_; the API is purposefully
designed to allow the largest number of ML frameworks to implement it. `wasi-nn` _graphs_ can be
passed as opaque byte sequences to support any number of model formats. This keeps the API
framework- and format-agnostic: we expect device vendors to provide the ML _backend_ and
support for their particular _graph_ format.

Users can find language bindings for `wasi-nn` at the [wasi-nn bindings] repository; request
additional language support there. More information about `wasi-nn` can be found at:

[wasi-nn bindings]: https://github.com/bytecodealliance/wasi-nn

 - Blog post: [Machine Learning in WebAssembly: Using wasi-nn in
   Wasmtime](https://bytecodealliance.org/articles/using-wasi-nn-in-wasmtime)
 - Blog post: [Implementing a WASI Proposal in Wasmtime:
   wasi-nn](https://bytecodealliance.org/articles/implementing-wasi-nn-in-wasmtime)
 - Blog post: [Neural network inferencing for PyTorch and TensorFlow with ONNX, WebAssembly System
   Interface, and wasi-nn](https://deislabs.io/posts/wasi-nn-onnx/)
 - Recorded talk: [Machine Learning with Wasm
   (wasi-nn)](https://www.youtube.com/watch?v=lz2I_4vvCuc)
 - Recorded talk: [Lightning Talk: High Performance Neural Network Inferencing Using
   wasi-nn](https://www.youtube.com/watch?v=jnM0tsRVM_8)

### Goals

The primary goal of `wasi-nn` is to allow users to perform ML inference from WebAssembly using
existing models (i.e., ease of use) and with maximum performance. Though the primary focus is
inference, we plan to leave open the possibility of performing ML training in the future (request
training support in an [issue](https://github.com/WebAssembly/wasi-nn/issues)!).

Another design goal is to make the API framework- and model-agnostic, which allows the API to be
implemented by multiple ML frameworks and model formats. The `load` function returns an error when
an unsupported model encoding scheme is passed in, similar to how a browser handles unsupported
image or video encodings.
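
As a rough illustration of what this looks like from the guest side, the sketch below probes
whether a host accepts a particular encoding by checking the result of `load`. It is written
against the low-level Rust [wasi-nn bindings]; the exact constant, type, and error names are
assumptions and may differ between binding versions, and the model path is a placeholder.

```rust
// Hedged sketch: probing whether the host supports a given model encoding.
// Constant names follow the early Rust bindings and may differ in current versions.
fn main() {
    let model_bytes = std::fs::read("model.onnx").expect("read model file");

    let result = unsafe {
        wasi_nn::load(
            &[&model_bytes],               // the graph, passed as opaque bytes
            wasi_nn::GRAPH_ENCODING_ONNX,  // the encoding we hope the host supports
            wasi_nn::EXECUTION_TARGET_CPU, // where inference should run
        )
    };

    match result {
        Ok(_graph) => {
            // The host backend accepts this encoding; proceed to create an
            // execution context and run inference.
        }
        Err(_err) => {
            // The host rejected the encoding; fall back to another model
            // format or report that no suitable backend is available.
        }
    }
}
```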

### Non-goals

`wasi-nn` is not designed to provide support for individual ML operations (a "model builder" API).
The ML field is still evolving rapidly, with new operations and network topologies emerging
continuously, and it would be a challenge for the API to define and keep pace with such an evolving
set of operations. Instead, our approach is to start with a "model loader" API, inspired by WebNN's
model loader proposal.

### API walk-through

The following example describes how a user would use `wasi-nn` to classify an image.

```
TODO
```
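
Until that example is filled in, the rough shape of an image-classification flow is sketched
below. It follows the API's `load` → `init_execution_context` → `set_input` → `compute` →
`get_output` sequence, written against the low-level Rust [wasi-nn bindings]; the file paths,
tensor dimensions, and exact binding signatures are illustrative assumptions rather than a
definitive implementation.

```rust
// Hedged sketch of an image-classification flow; binding names and signatures
// are approximate, and the model/image paths are placeholders.
use std::fs;

fn classify() -> Vec<f32> {
    // Read an OpenVINO model as opaque bytes: a graph description plus weights.
    let xml = fs::read("fixture/model.xml").expect("read model description");
    let weights = fs::read("fixture/model.bin").expect("read model weights");

    // Load the graph for a chosen encoding and execution target.
    let graph = unsafe {
        wasi_nn::load(
            &[&xml, &weights],
            wasi_nn::GRAPH_ENCODING_OPENVINO,
            wasi_nn::EXECUTION_TARGET_CPU,
        )
        .expect("load graph")
    };

    // Create an execution context and bind the input tensor: here, a single
    // 224x224 RGB image already decoded and normalized to f32 bytes.
    let context = unsafe { wasi_nn::init_execution_context(graph).expect("create context") };
    let input = fs::read("fixture/image.bgr").expect("read preprocessed image");
    let tensor = wasi_nn::Tensor {
        dimensions: &[1, 3, 224, 224],
        r#type: wasi_nn::TENSOR_TYPE_F32,
        data: &input,
    };
    unsafe { wasi_nn::set_input(context, 0, tensor).expect("set input tensor") };

    // Run inference and copy out the class probabilities.
    unsafe { wasi_nn::compute(context).expect("compute inference") };
    let mut probabilities = vec![0f32; 1001];
    unsafe {
        wasi_nn::get_output(
            context,
            0,
            probabilities.as_mut_ptr() as *mut u8,
            (probabilities.len() * std::mem::size_of::<f32>()) as u32,
        )
        .expect("get output tensor")
    };
    probabilities
}
```

From here, a guest would typically sort `probabilities` and map the highest-scoring indices back
to class labels.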

<!--
More use cases go here: provide example code snippets and diagrams explaining how the API would be
used to solve the given problem.
-->

### Detailed design discussion

For the details of the API, see [wasi-nn.wit.md].

[wasi-nn.wit.md]: wasi-nn.wit.md

<!--
This section should mostly refer to the .wit.md file that specifies the API. This section is for
any discussion of the choices made in the API which don't make sense to document in the spec file
itself.
-->

#### Should `wasi-nn` support training models?

Ideally, yes. In the near term, however, exposing (and implementing) the inference-focused API is
complex enough that a training-capable API is postponed until later. Also, models are typically
trained offline, prior to deployment, and it is unclear why training models using WASI would be an
advantage over training them natively. (Conversely, the inference API does make sense: performing ML
inference in a Wasm deployment is a known use case.) See the associated discussion
[here](https://github.com/WebAssembly/wasi-nn/issues/6) and feel free to open pull requests or
issues related to this that fit within the goals above.

#### Should `wasi-nn` support inspecting models?

Ideally, yes. The ability to inspect models would allow users to determine, at runtime, the tensor
shapes of the inputs and outputs of a model. As with ML training (above), this can be added in the
future.
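
Purely as an illustration of what such a capability might provide (this is a hypothetical
extension, not part of the current API), an inspection call could hand back a descriptor of each
input and output:

```rust
// Hypothetical sketch only -- `wasi-nn` does not define an inspection API today.
// It illustrates the kind of information such a call could return to the guest.
pub enum TensorType {
    F16,
    F32,
    U8,
    I32,
}

pub struct TensorDescriptor {
    pub dimensions: Vec<u32>, // e.g., [1, 3, 224, 224]
    pub tensor_type: TensorType,
}

// With something like `describe_input(graph, index) -> TensorDescriptor`, a
// guest could allocate correctly sized input/output buffers at runtime instead
// of hard-coding each model's shapes.
```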

<!--
More "tricky" design choices fit here.
-->

### Considered alternatives

There are other ways to perform ML inference from a WebAssembly program:

1. A user could specify a __custom host API__ for ML tasks; this is similar to the approach taken
   [here](TODO). The advantages and disadvantages are in line with other "spec vs. custom"
   trade-offs: the user can tailor the API precisely to their use case but will not be able to
   switch easily between implementations.
2. A user could __compile a framework and/or model to WebAssembly__; this is similar to the
   approaches taken [here](https://github.com/sonos/tract) and
   [here](https://blog.tensorflow.org/2020/03/introducing-webassembly-backend-for-tensorflow-js.html).
   The primary disadvantage of this approach is performance: WebAssembly, even with the recent
   addition of 128-bit SIMD, does not have optimized primitives for performing ML inference or for
   accessing ML-optimized hardware, so inference can be several orders of magnitude slower.

### Stakeholder Interest & Feedback

TODO before entering Phase 3.

<!--
This should include a list of implementers who have expressed interest in implementing the proposal
-->

### References & acknowledgements

Many thanks for valuable feedback and advice from:

- [Brian Jones](https://github.com/brianjjones)
- [Radu Matei](https://github.com/radu-matei)
- [Steve Schoettler](TODO)