docs/components/router/standalone-indexer.md (+40 -9: 40 additions & 9 deletions)
@@ -7,7 +7,7 @@ subtitle: Run the KV cache indexer as an independent HTTP service for querying b
## Overview
-The standalone KV indexer (`dynamo-kv-indexer`) is a lightweight binary that maintains a radix tree of cached blocks and exposes HTTP endpoints for querying and managing workers. It supports two operational modes:
+The standalone KV indexer (`python -m dynamo.indexer`) is a lightweight service that maintains a radix tree of cached blocks and exposes HTTP endpoints for querying and managing workers. It supports two operational modes:
- **Standalone mode** (default): subscribes to ZMQ KV event streams directly from workers. No Dynamo runtime dependencies required.
- **Dynamo runtime mode** (`--dynamo-runtime`): integrates with the Dynamo runtime for automatic worker discovery via MDC, KV event ingestion via the event plane (NATS or ZMQ), and overlap queries over the request plane for remote frontends.
@@ -56,11 +56,11 @@ If no peers are reachable, the indexer starts with an empty state.
In runtime mode, workers are discovered automatically via MDC. The `--workers` flag can still be used to register additional static workers alongside discovered ones.
|`dynamo_kvindexer_models`| Gauge | — | Number of active model+tenant indexers |
|`dynamo_kvindexer_workers`| Gauge | — | Number of registered worker instances |
+|`dynamo_kvindexer_listeners`| Gauge |`status`| Number of ZMQ listeners by status (`pending`, `active`, `paused`, `failed`) |
### `POST /register` — Register an endpoint
Register a ZMQ endpoint for an instance. Each call creates or reuses the indexer for the given `(model_name, tenant_id)` pair.
+Registration is non-blocking: if the worker is not up yet, the listener is accepted in `pending` state and transitions to `active` once the initial ZMQ connection succeeds.
For ZMQ-managed workers, `status` is aggregated across listeners with priority `failed > pending > active > paused`. Each listener entry may also expose a `last_error` field when the most recent startup or recv-loop attempt failed.
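The priority rule above can be illustrated with a minimal Python sketch. The helper name `aggregate_status` and the list-of-strings input are assumptions made for illustration; they are not taken from the indexer's source, and the behavior for a worker with no listeners is a guess.

```python
# Hypothetical helper illustrating the documented priority
# failed > pending > active > paused. Not the indexer's actual code.
PRIORITY = ("failed", "pending", "active", "paused")  # highest priority first


def aggregate_status(listener_statuses):
    """Return the highest-priority status among a worker's listeners."""
    present = set(listener_statuses)
    unknown = present - set(PRIORITY)
    if unknown:
        raise ValueError(f"unknown listener status: {sorted(unknown)}")
    for status in PRIORITY:
        if status in present:
            return status
    return None  # assumption: a worker with no listeners has no status


print(aggregate_status(["active", "paused"]))             # active
print(aggregate_status(["active", "pending", "failed"]))  # failed
```

Under this rule a single failed listener surfaces in the worker's status even when its other listeners are healthy.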
### `POST /query` — Query overlap for token IDs
Given raw token IDs, compute block hashes and return per-instance overlap scores (in matched tokens):
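As a rough sketch of the idea, the following Python computes chained block hashes from token IDs and scores each instance by its matched prefix, in tokens. Everything concrete here is an assumption for illustration: the block size of 4, the SHA-256 chaining scheme, the flat set of hashes standing in for the radix tree, and the choice to ignore a trailing partial block. The real indexer's hashing and data structures may differ.

```python
import hashlib

BLOCK_SIZE = 4  # assumed for illustration; the real block size is configurable elsewhere


def block_hashes(token_ids, block_size=BLOCK_SIZE):
    """Hash each full block, chaining in the previous block's hash."""
    hashes = []
    parent = b""
    full = len(token_ids) - len(token_ids) % block_size  # drop the partial tail
    for start in range(0, full, block_size):
        block = token_ids[start:start + block_size]
        payload = parent + b"|" + ",".join(map(str, block)).encode()
        parent = hashlib.sha256(payload).digest()
        hashes.append(parent.hex())
    return hashes


def overlap_scores(token_ids, cached_by_instance, block_size=BLOCK_SIZE):
    """Return matched-token counts per instance for the longest cached prefix."""
    query = block_hashes(token_ids, block_size)
    scores = {}
    for instance, cached in cached_by_instance.items():
        matched = 0
        for h in query:
            if h not in cached:
                break  # prefix match ends at the first missing block
            matched += 1
        scores[instance] = matched * block_size
    return scores


prompt = list(range(10))  # two full blocks plus two leftover tokens
cache = {
    "worker-0": set(block_hashes(prompt)),      # holds both blocks
    "worker-1": set(block_hashes(prompt[:4])),  # holds only the first block
    "worker-2": set(),                          # holds nothing
}
print(overlap_scores(prompt, cache))
# {'worker-0': 8, 'worker-1': 4, 'worker-2': 0}
```

Chaining each hash to its parent means a block only matches when the entire prefix before it matches, which is the property a radix tree lookup relies on.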
@@ -379,7 +410,7 @@ The indexer registers a query endpoint on the Dynamo request plane, allowing fro