- Expand ML accelerators -> NPU/TPU
- Update Frameworks, add ONNX Runtime Web
- Update WebGL/GPU considerations, note MLTensor
- Expand model architectures supported by the WebNN API
- Fix a few typos
Fix #840
explainer.md (3 additions & 3 deletions)
@@ -129,7 +129,7 @@ Depending on the underlying hardware capabilities, these platform APIs may make
A core abstraction behind popular neural networks is a computational graph, a directed graph with its nodes corresponding to operations (ops) and input variables. One node's output value is the input to another node. The WebNN API brings this abstraction to the web.
-
In the WebNN API, the [`MLOperand`](https://www.w3.org/TR/webnn/#api-mloperand) objects represent input, output, and constant multi-dimensional arrays known as [tensors](https://mathworld.wolfram.com/Tensor.html). The [`MLContext`](https://www.w3.org/TR/webnn/#api-mlcontext) defines a set of operations that facilitate the construction and execution of this computational graph. Such operations may be accelerated with dedicated hardware such as the GPUs, CPUs with extensions for deep learning, or dedicated ML accelerators. These operations defined by the WebNN API are required by [models](https://github.com/webmachinelearning/webnn/blob/master/op_compatibility/first_wave_models.md) that address key application use cases. Additionally, the WebNN API provides affordances to build a computational graph, compile the graph, execute the graph, and integrate the graph with other Web APIs that provide input data to the graph e.g. media APIs for image or video frames and sensor APIs for sensory data. Please see the [programming model overview](https://www.w3.org/TR/webnn/#programming-model-overview) for details.
+
In the WebNN API, [`MLOperand`](https://www.w3.org/TR/webnn/#api-mloperand) objects represent input, output, and constant multi-dimensional arrays known as [tensors](https://mathworld.wolfram.com/Tensor.html). The [`MLContext`](https://www.w3.org/TR/webnn/#api-mlcontext) defines a set of operations that facilitate the construction and execution of this computational graph. Such operations may be accelerated with dedicated hardware such as the GPUs, CPUs with extensions for deep learning, or dedicated ML accelerators (aka NPUs/TPUs). These operations defined by the WebNN API are required by well-known [CNN and RNN](https://github.com/webmachinelearning/webnn/blob/master/op_compatibility/first_wave_models.md), [transformer](https://github.com/webmachinelearning/webnn/issues/375) and generative models that address key application use cases. Additionally, the WebNN API provides affordances to build a computational graph, compile the graph, execute the graph, and integrate the graph with other Web APIs that provide input data to the graph e.g. media APIs for image or video frames and sensor APIs for sensory data. Please see the [programming model overview](https://www.w3.org/TR/webnn/#programming-model-overview) for details.
The specification includes an [example](https://www.w3.org/TR/webnn/#examples) that builds, compiles, and executes a graph comprised of three ops, takes four inputs and returns one output.
@@ -244,9 +244,9 @@ To balance the needs of providing for future extensibility while ensuring maximu
### Stay the course and build machine learning solutions on WebGL/WebGPU
-
WebGL and WebGPU are Web API abstraction to the underlying graphics API, which could be used to implement neural network operations that run on the GPU. Popular JavaScript machine learning frameworks such as TensorFlow.js already uses WebGL and are working on a WebGPU backend. An alternative to the WebNN proposal is to continue with this architecture and rely on JavaScript frameworks implemented with these graphics abstraction to address the current and future needs of ML scenarios on the web.
+
WebGL and WebGPU are Web API abstraction to the underlying graphics API, which could be used to implement neural network operations that run on the GPU. Popular JavaScript machine learning frameworks such as TensorFlow.js and ONNX Runtime Web already use WebGL and/or WebGPU backends. An alternative to the WebNN API is to continue with this architecture and rely on JavaScript frameworks implemented with these graphics abstraction to address the current and future needs of ML scenarios on the web.
-
We believe this alternative is insufficient for two reasons. First, although graphics abstraction layers provide the flexibility of general programmability of the GPU graphics pipelines, they are unable to tap into hardware-specific optimizations and special instructions that are available to the operating system internals. The hardware ecosystem has been investing significantly in innovating in the ML space, and much of that is about improving the performance of intensive compute workloads in machine learning scenarios. Some key technologies that are important to model performance may not be uniformly accessible to applications through generic graphics pipeline states.
+
We believe this alternative is insufficient for two reasons. First, although graphics abstraction layers provide the flexibility of general programmability of the GPU graphics pipelines, they are unable to tap into hardware-specific optimizations and special instructions that are available to the operating system internals. The hardware ecosystem has been investing significantly in innovating in the ML space, and much of that is about improving the performance of intensive compute workloads in machine learning scenarios. Some key technologies that are important to model performance may not be uniformly accessible to applications through generic graphics pipeline states. While WebGPU shaders can be optimized for a specific model, the WebNN API operator set is optimized for the latest hardware generation and well-known models. The [tensor interface](mltensor-explainer.md) enables best-effort buffer-sharing between the WebGPU and WebNN APIs to allow implementations optimize performance when the two APIs are used together.
Secondly, the hardware diversity with numerous driver generations make conformance testing of neural network operations at the framework level more challenging. Conformance testing, compatibility, and quality assurance of hardware results have been the traditional areas of strength of the operating systems, something that should be leveraged by frameworks and applications alike. Since neural network models could be used in mission-critical scenarios such as in healthcare or industry processes, the trustworthiness of the results produced by the frameworks are of utmost importance to the users.
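
For readers of this diff who want a concrete picture of the computational graph abstraction described in the first hunk (nodes that are operations or input values, with one node's output feeding another), here is a minimal TypeScript sketch. It is a toy and does not use the WebNN API: `GraphNode`, `input`, `constant`, `add`, `mul`, and `evaluate` are hypothetical names invented for illustration, tensors are simplified to flat number arrays, and the graph shape (two adds feeding one multiply, with two constants and two named inputs as the four leaf values) is an assumption chosen to mirror the "three ops, four inputs, one output" description.

```typescript
// Toy computational graph. NOT the WebNN API; every name here is hypothetical.
type Tensor = number[];

type GraphNode =
  | { kind: "input"; name: string }
  | { kind: "constant"; value: Tensor }
  | { kind: "op"; op: "add" | "mul"; a: GraphNode; b: GraphNode };

// Builder helpers: each call creates a node, nothing is computed yet.
const input = (name: string): GraphNode => ({ kind: "input", name });
const constant = (value: Tensor): GraphNode => ({ kind: "constant", value });
const add = (a: GraphNode, b: GraphNode): GraphNode => ({ kind: "op", op: "add", a, b });
const mul = (a: GraphNode, b: GraphNode): GraphNode => ({ kind: "op", op: "mul", a, b });

// Execute the graph by recursively evaluating each node's operands.
function evaluate(node: GraphNode, feeds: Record<string, Tensor>): Tensor {
  switch (node.kind) {
    case "input": {
      const t = feeds[node.name];
      if (t === undefined) throw new Error(`missing input: ${node.name}`);
      return t;
    }
    case "constant":
      return node.value;
    case "op": {
      const a = evaluate(node.a, feeds);
      const b = evaluate(node.b, feeds);
      if (a.length !== b.length) throw new Error("shape mismatch");
      // Element-wise add or multiply.
      return a.map((x, i) => (node.op === "add" ? x + b[i] : x * b[i]));
    }
  }
}

// Build: two adds feed one multiply (three ops, four leaf values, one output).
const output = mul(
  add(constant([0.5, 0.5]), input("input1")),
  add(constant([0.5, 0.5]), input("input2")),
);

// Execute: each element is (0.5 + 1) * (0.5 + 1) = 2.25.
const result = evaluate(output, { input1: [1, 1], input2: [1, 1] });
console.log(result);
```

The sketch separates building the graph from executing it, which is the same split the explainer text describes. A real implementation would add a compile step between the two and dispatch the ops to hardware rather than interpreting them in JavaScript.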