Commit 16551d3

add: onnxruntime instructions
1 parent 9b5cb8a commit 16551d3

ext/ai/README.md

Lines changed: 56 additions & 3 deletions

## Model Execution Engine

<p align="center">
  <picture>
    <source media="(prefers-color-scheme: dark)" srcset="/assets/docs/ai/onnx-backend-dark.svg">
    <!-- … -->
  </picture>
</p>

`Supabase.ai` uses [onnxruntime](https://onnxruntime.ai/) as its internal model
execution engine, backed by the [ort pyke](https://ort.pyke.io/) Rust bindings.

Specific documentation for both "lands" follows:

<details>
<summary>Javascript/Frontend</summary>

The **onnxruntime** API is available from `globalThis` and follows a spec similar to [onnxruntime-common](https://github.com/microsoft/onnxruntime/tree/main/js/common); a rough sketch of the assumed surface is shown after the list below.

The available items are:

- `Tensor`: represents a basic tensor with specified dimensions and data type. -- "The AI input/output"
- `InferenceSession`: represents the inner model session. -- "The AI model itself"
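
As an orientation only, here is a minimal sketch of the assumed shape of those two classes, loosely following the `onnxruntime-common` spec rather than the runtime's exact bindings:

```typescript
// Minimal sketch of the assumed API shape (loosely following onnxruntime-common);
// the bindings actually exposed by the runtime may differ in detail.
type TensorDataType = string; // e.g. 'float32', 'int64'

declare class Tensor {
  constructor(type: TensorDataType, data: number[] | Float32Array | BigInt64Array, dims: number[]);
  readonly type: TensorDataType;
  readonly data: Float32Array | BigInt64Array | number[]; // flat values
  readonly dims: readonly number[];                        // e.g. [1, 384]
}

declare class InferenceSession {
  // accepts the model bytes (or, as in the example below, a URL encoded as bytes)
  static create(model: Uint8Array): Promise<InferenceSession>;
  // maps input names to tensors; resolves to the named output tensors
  run(feeds: Record<string, Tensor>): Promise<Record<string, Tensor>>;
}
```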
### Usage

It can be used via the exported `globalThis[Symbol.for("onnxruntime")]` --
but manipulating it directly is not trivial, so in the future you may prefer the [Inference API #501](https://github.com/supabase/edge-runtime/pull/501) for a more user-friendly API.

```typescript
const { InferenceSession, Tensor } = globalThis[Symbol.for("onnxruntime")];

// 'create()' accepts either the raw model binary or a URL string encoded as bytes
const modelUrlBuffer = new TextEncoder().encode("https://huggingface.co/Supabase/gte-small/resolve/main/onnx/model_quantized.onnx");
const session = await InferenceSession.create(modelUrlBuffer);

// Example only: in a real 'feature-extraction' flow these tensors must come from the tokenizer step.
const inputs = {
  input_ids: new Tensor('float32', [1, 2, 3...], [1, 384]),
  attention_mask: new Tensor('float32', [...], [1, 384]),
  token_type_ids: new Tensor('float32', [...], [1, 384])
};

const { last_hidden_state } = await session.run(inputs);
console.log(last_hidden_state);
```
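
If you drive the session manually like this, the raw `last_hidden_state` still needs to be pooled into a single embedding. Below is a minimal sketch, assuming the `data`/`dims` fields from the sketch above and a `[batch, sequence, hidden]` layout, and ignoring attention-mask weighting for brevity:

```typescript
// Sketch only: naive mean pooling of `last_hidden_state` into one embedding vector.
// Real code should exclude padding positions using the attention mask.
const [, seqLen, hiddenSize] = last_hidden_state.dims;
const values = last_hidden_state.data as Float32Array;

const embedding = new Float32Array(hiddenSize);
for (let t = 0; t < seqLen; t++) {
  for (let h = 0; h < hiddenSize; h++) {
    embedding[h] += values[t * hiddenSize + h] / seqLen;
  }
}

console.log(embedding.length); // hidden size, e.g. 384 for gte-small
```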

### Third-party libs

Originally this backend was created to integrate implicitly with [transformers.js](https://github.com/huggingface/transformers.js/). This way users can keep consuming a high-level lib while still benefiting from all of Supabase's Model Execution Engine features, such as model optimization and caching. For further information please check [PR #436](https://github.com/supabase/edge-runtime/pull/436).

> [!WARNING]
> At this moment users need to explicitly target `device: 'auto'` to enable platform compatibility.

```typescript
import { env, pipeline } from 'https://cdn.jsdelivr.net/npm/@huggingface/transformers';

// Browser cache is now supported for `onnx` models
env.useBrowserCache = true;
env.allowLocalModels = false;

const pipe = await pipeline('feature-extraction', 'supabase/gte-small', { device: 'auto' });

const output = await pipe("This embed will be generated from rust land", {
  pooling: 'mean',
  normalize: true
});
```
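
The pooled, normalized `output` is a transformers.js `Tensor`; assuming its usual `dims`/`data` fields, the resulting embedding can be inspected directly:

```typescript
// gte-small produces 384-dimensional embeddings, so `dims` should be [1, 384]
console.log(output.dims);
console.log(Array.from(output.data).slice(0, 4)); // first few embedding values
```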
</details>

<details>
