@@ -2,9 +2,58 @@ defmodule Ortex.Serving do
   @moduledoc """
   `Ortex.Serving` Documentation
 
-  This is a light wrapper for using `Nx.Serving` behaviour with `Ortex`. Using `jit` and
+  This is a lightweight wrapper for using the `Nx.Serving` behaviour with `Ortex`. Using `jit` and
   `defn` functions here is not supported; it is strictly for serving batches to
   an `Ortex.Model` for inference.
+
+  ## Examples
+
+  ### Inline/serverless workflow
+
+  To quickly create an `Ortex.Serving` and run it:
+
+  ```elixir
+  iex> model = Ortex.load("./models/resnet50.onnx")
+  iex> serving = Nx.Serving.new(Ortex.Serving, model)
+  iex> batch = Nx.Batch.stack([{Nx.broadcast(0.0, {3, 224, 224})}])
+  iex> {result} = Nx.Serving.run(serving, batch)
+  iex> result |> Nx.backend_transfer() |> Nx.argmax(axis: 1)
+  #Nx.Tensor<
+    s64[1]
+    [499]
+  >
+  ```
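+
+  The tuple inside the list passed to `Nx.Batch.stack/1` must match the
+  model's input arity. As an illustrative sketch (a hypothetical two-input
+  model, not the resnet50 model above), each batch entry would carry both
+  inputs in one tuple:
+
+  ```elixir
+  iex> batch = Nx.Batch.stack([{Nx.broadcast(0.0, {3, 224, 224}), Nx.broadcast(0.0, {10})}])
+  ```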
+
+  ### Stateful/process workflow
+
+  An `Ortex.Serving` can also be started in your Application's supervision tree:
+  ```elixir
+  model = Ortex.load("./models/resnet50.onnx")
+  children = [
+    {Nx.Serving,
+     serving: Nx.Serving.new(Ortex.Serving, model),
+     name: MyServing,
+     batch_size: 10,
+     batch_timeout: 100}
+  ]
+  opts = [strategy: :one_for_one, name: OrtexServing.Supervisor]
+  Supervisor.start_link(children, opts)
+  ```
+
+  With the application started, batches can now be sent to the `Ortex.Serving` process:
+
+  ```elixir
+  iex> Nx.Serving.batched_run(MyServing, Nx.Batch.stack([{Nx.broadcast(0.0, {3, 224, 224})}]))
+  {#Nx.Tensor<
+     f32[1][1000]
+     Ortex.Backend
+     [
+       [...]
+     ]
+   >}
+
+  ```
+
   """
 
   @behaviour Nx.Serving
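The stateful workflow added above pairs naturally with a small client wrapper. Below is a minimal sketch, assuming the `MyServing` process from the supervision-tree example is running and that the model takes a single `{3, 224, 224}` input and returns `{batch, 1000}` logits; the `MyApp.Classifier` module and `classify/1` function are illustrative names, not part of Ortex:

```elixir
defmodule MyApp.Classifier do
  # Hypothetical convenience wrapper around the supervised `Ortex.Serving`
  # process (`MyServing`) from the example above.

  def classify(%Nx.Tensor{} = image) do
    # Wrap the single input in a tuple to match the model's input arity.
    batch = Nx.Batch.stack([{image}])

    # The serving returns the model outputs in the same container shape,
    # so a single-output model comes back as a one-element tuple.
    {logits} = Nx.Serving.batched_run(MyServing, batch)

    # Move the result off the Ortex backend before post-processing,
    # then pick the highest-scoring class index as a plain integer.
    logits
    |> Nx.backend_transfer()
    |> Nx.argmax(axis: 1)
    |> Nx.to_flat_list()
    |> hd()
  end
end
```

Callers then receive a bare class index, e.g. `MyApp.Classifier.classify(Nx.broadcast(0.0, {3, 224, 224}))`, keeping the `Nx.Serving` plumbing in one place.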