
Commit 1e1bc1c

Update WebNN MLGraph Cache Explainer
Address additional review feedback.
1 parent 2e15c90 commit 1e1bc1c

File tree

1 file changed: +7 −5 lines


cache-explainer.md

Lines changed: 7 additions & 5 deletions
@@ -79,28 +79,30 @@ A JS ML framework, such as ONNX Runtime Web, may need to know the input and outp

```webidl
partial interface MLGraph {
-  record<USVString, MLOperandDescriptor> inputDescriptors;
-  record<USVString, MLOperandDescriptor> outputDescriptors;
+  record<USVString, MLOperandDescriptor> inputs;
+  record<USVString, MLOperandDescriptor> outputs;
};
```

## Considered alternatives

### Combined build and save

-A separate `saveGraph` API might introduce overhead on some native ML frameworks, such as ONNX Runtime, because its implementation may need to hold the source model in the memory and recompile the source model when user code calls `saveGraph`.
+A separate `saveGraph()` API might introduce overhead on some native ML frameworks, such as ONNX Runtime, because its implementation may need to hold the source model in memory and recompile the source model when user code calls `saveGraph()`.

-An alternative consideration is to have a `buildAndSave` method. The implementation can just compile the graph once and drop the source model after the compilation.
+An alternative consideration is to have a `buildAndSave()` method. The implementation can just compile the graph once and drop the source model after compilation.

```webidl
partial interface MLGraphBuilder {
  Promise<MLGraph> buildAndSave(MLNamedOperands outputs, DOMString key);
};
```

+However, a compliant implementation of `build()` could save the compiled model into a temporary file which is deleted unless `saveGraph()` is called later, rendering an explicit `buildAndSave()` unnecessary.
+
### Explicit vs implicit API

->GPU shader caching is implicit, however the difference is that a shader program is a small input and so it's easy for the site to regenerate the shader so the browser can hash it to compare with the cache. ML models on the other hand are large because of the weights. Loading all the weights just to discover that a cached version of the model is available would be a waste of time and resources. (via [comment](https://github.com/webmachinelearning/webnn/issues/807#issuecomment-2608135598))
+GPU shader caching is implicit, however the difference is that a shader program is a small input and so it's easy for the site to regenerate the shader so the browser can hash it to compare with the cache. ML models on the other hand are large because of the weights. Loading all the weights just to discover that a cached version of the model is available would be a waste of time and resources. (via [comment](https://github.com/webmachinelearning/webnn/issues/807#issuecomment-2608135598))

Furthermore, an ML model can't be compiled without the weights because the implementation may perform device-specific constant folding and memory layout optimizations.
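
To see how the renamed `inputs`/`outputs` records above might be consumed, here is a minimal TypeScript sketch (not part of the commit). The local `MLGraphLike`/`MLOperandDescriptorLike` interfaces, the `dataType`/`shape` descriptor fields, and the allocation helper are assumptions made for illustration; the WebIDL in the diff is the authoritative shape.

```typescript
// Sketch only: a JS ML framework pre-allocating I/O buffers for a cached
// MLGraph from its exposed input/output descriptors, without reloading the
// original model. Type names and descriptor fields are assumed.

interface MLOperandDescriptorLike {
  dataType: string;   // e.g. "float32", "float16", "int8"
  shape?: number[];   // operand dimensions
}

interface MLGraphLike {
  inputs: Record<string, MLOperandDescriptorLike>;
  outputs: Record<string, MLOperandDescriptorLike>;
}

const BYTES_PER_ELEMENT: Record<string, number> = {
  float32: 4, float16: 2, int32: 4, uint32: 4, int8: 1, uint8: 1,
};

function byteLength(desc: MLOperandDescriptorLike): number {
  const elements = (desc.shape ?? []).reduce((n, d) => n * d, 1);
  return elements * (BYTES_PER_ELEMENT[desc.dataType] ?? 4);
}

// Allocate one ArrayBuffer per named input and output of the graph.
function allocateIoBuffers(graph: MLGraphLike): Map<string, ArrayBuffer> {
  const buffers = new Map<string, ArrayBuffer>();
  for (const [name, desc] of [
    ...Object.entries(graph.inputs),
    ...Object.entries(graph.outputs),
  ]) {
    buffers.set(name, new ArrayBuffer(byteLength(desc)));
  }
  return buffers;
}
```

This is the scenario the explainer motivates: a framework such as ONNX Runtime Web can bind tensors to a cached graph without fetching and parsing the source model again.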

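To make the "Combined build and save" trade-off concrete, here is a hedged TypeScript sketch of the two calling patterns discussed in the diff. Only `build()`, `saveGraph()`, and `buildAndSave(outputs, key)` come from the explainer; the objects' types, the placement of `saveGraph()` on the context, and its exact signature are assumptions for illustration only.

```typescript
// Sketch only: contrasting the two API shapes discussed in the explainer.
// The placement and signature of saveGraph() are assumed, not specified here.

declare const builder: any;      // stands in for an MLGraphBuilder
declare const context: any;      // stands in for an MLContext
declare const namedOutputs: any; // stands in for MLNamedOperands
const cacheKey = "my-model-v1";

// Alternative 1: separate build() and saveGraph(). The implementation may
// have to keep (or recompile from) the source model until saveGraph() runs,
// unless it stashes the compiled model in a temporary file as noted above.
async function buildThenSave() {
  const graph = await builder.build(namedOutputs);
  await context.saveGraph(cacheKey, graph); // assumed signature
  return graph;
}

// Alternative 2: combined buildAndSave(). The graph is compiled once and
// persisted immediately, so the source model can be dropped right away.
async function buildGraphAndSave() {
  return await builder.buildAndSave(namedOutputs, cacheKey);
}
```
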