Merged
2 changes: 1 addition & 1 deletion CLAUDE.md
@@ -122,7 +122,7 @@ For HTTP transport, the wire protocol maps to separate endpoints: `POST /vgi/{me

**Error propagation**: Server exceptions become zero-row batches with error metadata; clients receive `RpcError` with `error_type`, `error_message`, and `remote_traceback`. The transport stays clean for subsequent requests.

**Sticky sessions (HTTP, opt-in)**: With `make_wsgi_app(enable_sticky=True)`, a method body may call `ctx.open_session(state)` to register a Python object (DB cursor, model handle, file handle) in a per-worker registry; subsequent calls from the same client (inside `conn.with_session_token():`) carry a `VGI-Session` header that resolves back to the object as `ctx.session`. Eviction is TTL-driven (default 300s, override per-call via `ttl=`) or explicit (`ctx.close_session()`); `state.close()` is invoked on eviction if defined. The framework serializes concurrent calls on the same session via a per-session `RLock`; different sessions run in parallel. Misroute / expiry / cross-principal token surfaces as `SessionLostError` (typed, `error_kind="session_lost"`); drain-time opens surface as `ServerDrainingError`. Client uses `conn.with_session_token() as sess:` (auto-sends `VGI-Session-Accept: true` for the server's leak-prevention guard) and `sess.detach()` to stash a token for later resumption without firing the exit-time best-effort `DELETE /vgi/__session__`. Sticky machinery is not installed on pipe / subprocess / unix transports; `ctx.open_session` raises `RuntimeError` there. Full spec: `docs/sticky-sessions-spec.md`. Cross-language conformance group: `TestSticky` in `vgi_rpc/conformance/_pytest_suite.py`, capability-gated on `VGI-Sticky-Enabled`.
**Sticky sessions (HTTP, opt-in)**: With `make_wsgi_app(enable_sticky=True)`, a method body may call `ctx.open_session(state)` to register a Python object (DB cursor, model handle, file handle) in a per-worker registry; subsequent calls from the same client (inside `conn.with_session_token():`) carry a `VGI-Session` header that resolves back to the object as `ctx.session`. Eviction is TTL-driven (default 300s, override per-call via `ttl=`) or explicit (`ctx.close_session()`); `state.close()` is invoked on eviction if defined. The framework serializes concurrent calls on the same session via a per-session `RLock`; different sessions run in parallel. Misroute / expiry / cross-principal token surfaces as `SessionLostError` (typed, `error_kind="session_lost"`); drain-time opens surface as `ServerDrainingError`. Client uses `conn.with_session_token() as sess:` (auto-sends `VGI-Session-Accept: true` for the server's leak-prevention guard) and `sess.detach()` to stash a token for later resumption without firing the exit-time best-effort `DELETE /vgi/__session__`. Sticky machinery is not installed on pipe / subprocess / unix transports; `ctx.open_session` raises `RuntimeError` there. **Echo headers** (`sticky_echo_headers=` kwarg on `make_wsgi_app`): tell the client to replay arbitrary headers on every subsequent request in the session — emitted as `VGI-Echo-<name>: <value>` on the session-opening response, captured + replayed by the client view, exposed via `sess.current_echo_headers()`. Used for client-driven routing on platforms like Fly.io; `vgi_rpc.http.fly.fly_sticky_echo_headers()` + `vgi_rpc.http.fly.auto_server_id()` are the ~25-line Fly quickstart helpers (return `None` off Fly so the same code works everywhere). Full spec: `docs/sticky-sessions-spec.md`. 
Cross-language conformance group: `TestSticky` in `vgi_rpc/conformance/_pytest_suite.py`, capability-gated on `VGI-Sticky-Enabled`; `TestSticky::test_echo_header_round_trip` further capability-gated on `VGI-Sticky-Echo-Headers`.

## Code Style

26 changes: 26 additions & 0 deletions docs/api/http.md
@@ -123,6 +123,32 @@ The block's exit fires a best-effort `DELETE /vgi/__session__` so handle-bearing

**HTTP-only.** Sticky machinery is not installed on pipe/subprocess/unix transports — those run as single processes where "sticky" is meaningless. `ctx.open_session` raises `RuntimeError("sticky sessions not available on this transport")` if called over a non-HTTP transport, so apps can detect-and-fall-back.

#### Client-driven routing via echo headers

Sticky LBs are not the only way to get a session-token-carrying request back to the worker that owns the session. With **echo headers**, the server tells the client (at session-open time) to attach an arbitrary set of headers on every subsequent request in the session, and the platform's edge proxy routes on those headers. Two helpers ship for [Fly.io](https://fly.io), where `fly-force-instance-id` is the proactive routing header `fly-proxy` honours:

```python
from vgi_rpc import RpcServer
from vgi_rpc.http import make_wsgi_app
from vgi_rpc.http.fly import auto_server_id, fly_sticky_echo_headers

server = RpcServer(
    MyService, MyServiceImpl(),
    server_id=auto_server_id(),  # ⇒ FLY_MACHINE_ID on Fly, random elsewhere
)
app = make_wsgi_app(
    server,
    enable_sticky=True,
    sticky_echo_headers=fly_sticky_echo_headers(),  # ⇒ {"fly-force-instance-id": <id>} on Fly, None elsewhere
)
```

On Fly the server emits `VGI-Echo-fly-force-instance-id: <machine-id>` on session-opening responses; the client captures it and replays `fly-force-instance-id: <machine-id>` on every subsequent request in the session; fly-proxy routes directly to the owning Machine. No LB configuration required.

Off Fly the helpers return `None` so the same code is a no-op — operators don't need conditional branches.

Generic API (for non-Fly platforms): pass any `dict[str, str]` as `sticky_echo_headers` and the server will emit them as `VGI-Echo-<name>` on the session-opening response. The client's `with_session_token()` view captures + replays automatically; `sess.current_echo_headers()` exposes the captured map for inspection or stashing.
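
For platforms other than Fly, a small helper in the same spirit could build the `sticky_echo_headers` mapping from the environment. The sketch below is illustrative only: `echo_headers_from_env` and its parameters are hypothetical names, not part of `vgi_rpc`, and the environment variable and routing header for any given platform are assumptions the operator has to supply.

```python
import os
from typing import Dict, Mapping, Optional


def echo_headers_from_env(
    env_var: str,
    header_name: str,
    environ: Optional[Mapping[str, str]] = None,
) -> Optional[Dict[str, str]]:
    """Hypothetical helper: map an instance-id env var to a routing header.

    Returns None when the variable is unset or empty, so the result can be
    passed straight through as `sticky_echo_headers` and stays a no-op off
    the platform.
    """
    env = os.environ if environ is None else environ
    value = env.get(env_var, "").strip()
    if not value:
        return None
    return {header_name: value}
```

Called as `echo_headers_from_env("FLY_MACHINE_ID", "fly-force-instance-id")`, this would yield the mapping described in the comments of the example above on Fly and `None` elsewhere.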

## API Reference

### Server
6 changes: 4 additions & 2 deletions docs/porting-guide.md
@@ -195,9 +195,11 @@ ignore the budget.
HTTP sticky sessions are an **opt-in additive feature** layered on top of the stateless HTTP transport. The full spec lives at [`docs/sticky-sessions-spec.md`](sticky-sessions-spec.md). A port may choose to:

- **Skip sticky entirely.** The canonical `TestSticky` conformance group is capability-gated on the `VGI-Sticky-Enabled` header; ports that don't advertise it skip every test in the group cleanly. The non-sticky wire path is byte-identical for both implementations, so the rest of the conformance suite still passes.
- **Implement the client side only.** A client running against a Python sticky server needs to (1) recognize `error_kind="session_lost"` / `error_kind="server_draining"` on EXCEPTION-level batches and surface them as typed exceptions, (2) optionally implement a `with_session_token()`-equivalent that sends `VGI-Session-Accept: true` + `VGI-Session: <token>` on every request inside a scope and captures `VGI-Session` / `VGI-Session-Close: true` from responses. The cookie-jar avoidance is intentional — header-only multiplexes concurrent sessions cleanly.
- **Implement the full server side.** Port `_StickyMiddleware` (per-worker registry + reaper thread + token sealing), the `DELETE /vgi/__session__` resource (idempotent, principal-bound), and the `ctx.open_session` / `ctx.close_session` runtime API. The session token format from the spec is language-neutral: `created_at:u64 | server_id_len:u8 | server_id | session_id:bytes(12) | expires_at:u64`, AEAD-sealed with the same AAD shape used by stream tokens.
- **Implement the client side only.** A client running against a Python sticky server needs to (1) recognize `error_kind="session_lost"` / `error_kind="server_draining"` on EXCEPTION-level batches and surface them as typed exceptions, (2) optionally implement a `with_session_token()`-equivalent that sends `VGI-Session-Accept: true` + `VGI-Session: <token>` on every request inside a scope, captures `VGI-Session` / `VGI-Session-Close: true` from responses, and captures + replays any `VGI-Echo-<name>` response headers (case-insensitive, prefix-stripped) on subsequent requests in the same session. The cookie-jar avoidance is intentional — header-only multiplexes concurrent sessions cleanly.
- **Implement the full server side.** Port `_StickyMiddleware` (per-worker registry + reaper thread + token sealing + optional echo-header emission), the `DELETE /vgi/__session__` resource (idempotent, principal-bound), and the `ctx.open_session` / `ctx.close_session` runtime API. The session token format from the spec is language-neutral: `created_at:u64 | server_id_len:u8 | server_id | session_id:bytes(12) | expires_at:u64`, AEAD-sealed with the same AAD shape used by stream tokens.
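
The plaintext token layout quoted in the last bullet can be sketched as below. This is an illustration under stated assumptions, not the library's implementation: the byte order (big-endian here), the UTF-8 encoding of `server_id`, and the names `SessionTokenPayload`, `pack_payload`, and `unpack_payload` are not given in this excerpt, and the AEAD sealing step is omitted entirely. The spec remains the normative source for the encoding.

```python
import struct
from typing import NamedTuple


class SessionTokenPayload(NamedTuple):
    created_at: int    # u64
    server_id: str     # length-prefixed with a u8
    session_id: bytes  # exactly 12 bytes
    expires_at: int    # u64


def pack_payload(p: SessionTokenPayload) -> bytes:
    """Serialize the fields in the documented order (big-endian is ASSUMED)."""
    sid = p.server_id.encode("utf-8")  # UTF-8 is an assumption
    if len(sid) > 255:
        raise ValueError("server_id does not fit a u8 length prefix")
    if len(p.session_id) != 12:
        raise ValueError("session_id must be exactly 12 bytes")
    head = struct.pack(">QB", p.created_at, len(sid))
    tail = struct.pack(">Q", p.expires_at)
    return head + sid + p.session_id + tail


def unpack_payload(data: bytes) -> SessionTokenPayload:
    """Inverse of pack_payload; rejects truncated or oversized input."""
    if len(data) < 9:
        raise ValueError("payload truncated")
    created_at, sid_len = struct.unpack_from(">QB", data, 0)
    offset = 9
    if len(data) != offset + sid_len + 12 + 8:
        raise ValueError("payload length does not match server_id_len")
    server_id = data[offset:offset + sid_len].decode("utf-8")
    offset += sid_len
    session_id = data[offset:offset + 12]
    offset += 12
    (expires_at,) = struct.unpack_from(">Q", data, offset)
    return SessionTokenPayload(created_at, server_id, session_id, expires_at)
```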

If a port claims sticky support it MUST also implement the three sticky conformance methods (`open_counter`, `increment_counter`, `close_counter`) on its `ConformanceService` implementation, so the canonical `TestSticky` group has something to exercise. Servers that advertise `VGI-Sticky-Enabled: true` but fail `TestSticky` are non-conformant.

**Echo headers** (`VGI-Echo-<name>` response headers / `VGI-Sticky-Echo-Headers` capability advert) are a sub-feature; ports that don't implement them stay conformant on `TestSticky` core but skip `TestSticky::test_echo_header_round_trip` cleanly (the test is capability-gated on `VGI-Sticky-Echo-Headers`). Implementing them unlocks zero-LB-config deployments on Fly.io (`fly-force-instance-id`) and any other platform with header-based proactive routing. See [`vgi_rpc/http/fly.py`](https://github.com/Query-farm/vgi-rpc-python/blob/main/vgi_rpc/http/fly.py) for the Python Fly quickstart helpers — a ~25-line module that other ports can mirror directly.

Recognising the two new error kinds is the **minimum** any port should do: even ports that have no sticky implementation may end up talking to a Python sticky server in the wild, and a typed exception is much friendlier than a flat `RpcError` whose meaning the caller has to grep out of the message text.
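
A port's error-decoding path might dispatch on `error_kind` roughly as follows. This is a minimal sketch with local stand-in classes: the exception hierarchy, the `error_from_metadata` helper, and the metadata key names (`error_kind`, `error_message`) as a flat string map are assumptions for illustration, not the Python library's actual types.

```python
from typing import Dict, Mapping, Type


class RpcError(Exception):
    """Stand-in for a port's generic RPC error type."""


class SessionLostError(RpcError):
    """Raised for error_kind == "session_lost" (misroute, expiry, wrong principal)."""


class ServerDrainingError(RpcError):
    """Raised for error_kind == "server_draining"."""


# Unknown or missing kinds fall back to the generic RpcError.
_KIND_TO_EXC: Dict[str, Type[RpcError]] = {
    "session_lost": SessionLostError,
    "server_draining": ServerDrainingError,
}


def error_from_metadata(metadata: Mapping[str, str]) -> RpcError:
    """Build a typed exception from EXCEPTION-level batch metadata (keys assumed)."""
    exc_type = _KIND_TO_EXC.get(metadata.get("error_kind", ""), RpcError)
    return exc_type(metadata.get("error_message", ""))
```

Callers can then write `except SessionLostError:` to reopen a session instead of inspecting message text.
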
22 changes: 18 additions & 4 deletions docs/sticky-sessions-spec.md
@@ -36,8 +36,23 @@ When `enable_sticky=True`, the server MUST advertise these on every response (ch
|---|---|---|
| `VGI-Sticky-Enabled` | `"true"` | Discovery flag; absent or `"false"` on non-sticky servers. |
| `VGI-Sticky-Default-TTL` | integer seconds | The TTL applied by `ctx.open_session` when its `ttl` argument is `None`. Operator-tunable via `sticky_default_ttl`. |
| `VGI-Sticky-Echo-Headers` | comma-separated header names | Headers the client must replay on every subsequent request in the session (see §2.4). Absent when `sticky_echo_headers` is unset. |
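
A client or conformance harness could read the three capability headers in the table like this. The sketch assumes response headers are available as a plain string mapping; `StickyCapabilities` and `parse_sticky_capabilities` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Mapping, Optional, Tuple


@dataclass(frozen=True)
class StickyCapabilities:
    enabled: bool
    default_ttl: Optional[int]
    echo_header_names: Tuple[str, ...]


def parse_sticky_capabilities(headers: Mapping[str, str]) -> StickyCapabilities:
    """Extract the sticky capability adverts from one response's headers."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    enabled = lowered.get("vgi-sticky-enabled", "").strip().lower() == "true"
    raw_ttl = lowered.get("vgi-sticky-default-ttl")
    default_ttl = (
        int(raw_ttl) if raw_ttl is not None and raw_ttl.strip().isdigit() else None
    )
    raw_names = lowered.get("vgi-sticky-echo-headers", "")
    names = tuple(part.strip() for part in raw_names.split(",") if part.strip())
    return StickyCapabilities(enabled, default_ttl, names)
```

An absent `VGI-Sticky-Echo-Headers` header yields an empty tuple, which a test runner can treat as "skip the echo-header test".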

### 2.4 Framework-managed endpoints
### 2.4 Echo headers (`VGI-Echo-*`)

When the server is configured with `sticky_echo_headers={name: value, ...}`, every session-opening response (the response carrying the `VGI-Session` token) also carries `VGI-Echo-<name>: <value>` for each configured pair. The client MUST:

1. Capture each `VGI-Echo-<name>` header on the response (case-insensitive lookup).
2. Strip the `VGI-Echo-` prefix.
3. Send the inner header (`<name>: <value>`) on every subsequent request inside the same session view, until the server emits `VGI-Session-Close: true` (which clears the captured echo headers alongside the token).

Echo headers are emitted **once-only**, on the session-opening response. Subsequent responses MUST NOT re-emit them. Clients hold the captured map for the lifetime of the session view.
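
The client-side steps above can be sketched as a small per-session holder. This is a minimal illustration, not the Python client's actual code: `EchoHeaderView` and its method names are hypothetical, and headers are assumed to arrive as a plain string mapping.

```python
from typing import Dict, Mapping

ECHO_PREFIX = "vgi-echo-"


class EchoHeaderView:
    """Hypothetical per-session holder for captured echo headers."""

    def __init__(self) -> None:
        self._captured: Dict[str, str] = {}

    def observe_response(self, headers: Mapping[str, str]) -> None:
        """Capture VGI-Echo-* headers, or clear them on VGI-Session-Close."""
        # Lower-case names so the prefix match is case-insensitive.
        lowered = {name.lower(): value for name, value in headers.items()}
        if lowered.get("vgi-session-close", "").strip().lower() == "true":
            self._captured.clear()
            return
        for name, value in lowered.items():
            if name.startswith(ECHO_PREFIX) and len(name) > len(ECHO_PREFIX):
                # Strip the prefix; the remainder is the header to replay.
                self._captured[name[len(ECHO_PREFIX):]] = value

    def request_headers(self) -> Dict[str, str]:
        """Headers to attach to every subsequent request in this session."""
        return dict(self._captured)
```

Because later responses carry no `VGI-Echo-*` headers, `observe_response` leaves the captured map untouched until the close signal arrives.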

The primary use case is **client-driven routing**: on Fly.io the server emits `VGI-Echo-fly-force-instance-id: <machine-id>`, the client sends `fly-force-instance-id: <machine-id>` on every subsequent request, and fly-proxy routes directly to the owning Machine without any LB configuration. Other platforms with similar header-based routing (Railway, custom Envoy filters) work identically — only the header name and value change.

Echo headers carry no security guarantees beyond what the underlying transport provides; in particular they are NOT bound to the session token via AAD. A misbehaving client could echo a different header value than the server told it to. The contract assumes cooperative clients — the feature exists to make sticky routing *work*, not to enforce it.

### 2.5 Framework-managed endpoints

`DELETE {prefix}/__session__` — idempotent best-effort session teardown.

@@ -115,8 +130,7 @@ vgi-rpc-test --url http://<server> --filter "Sticky::*"

The group is capability-gated: servers without `VGI-Sticky-Enabled: true` skip every test in the group cleanly. The Python implementation passes all tests; cross-language ports that wire up sticky support must pass them too. See [`docs/porting-guide.md`](porting-guide.md) for the full porting checklist.

## 10. Out of scope for v1
## 10. Out of scope

- **Cookie emission.** AWS ALB application-based stickiness and CloudFront sticky sessions both require a cookie set by the application. Operators on those platforms can front with Envoy / NGINX (header-hash policies on `VGI-Session`) or switch to NLB (flow-hash). Cookie emission can be added as an additive operator flag in a follow-up without changing the v1 wire surface.
- **Client-driven routing (echo headers).** A second PR will add server-side `sticky_echo_headers` config that tells the client to echo arbitrary routing headers on subsequent requests in the same session — enabling Fly.io's `fly-force-instance-id` and similar mechanisms. The current PR is the foundation; the echo-header layer composes on top without changes to existing wire contracts.
- **Cookie emission.** AWS ALB application-based stickiness and CloudFront sticky sessions both require a cookie set by the application. Operators on those platforms can front with Envoy / NGINX (header-hash policies on `VGI-Session`) or switch to NLB (flow-hash). Cookie emission can be added as an additive operator flag in a follow-up without changing the wire surface.
- **Pluggable session store.** Sessions hold live Python objects in-process. Redis-style external stores are explicitly excluded — they don't work for the cursor/handle pattern the feature is designed for, and the additional persistence story would compete with the well-defined "TTL eviction + crash = state lost" contract.
25 changes: 24 additions & 1 deletion tests/serve_conformance_http.py
@@ -65,9 +65,27 @@ def main() -> None:
        default=False,
        help="Disable sticky sessions. Default: enabled, so TestSticky conformance group runs.",
    )
    parser.add_argument(
        "--no-sticky-echo",
        action="store_true",
        default=False,
        help=(
            "Disable the canonical sticky-echo-header advertisement. "
            "Default: enabled with a fixed marker header so TestSticky::"
            "test_echo_header_round_trip exercises the contract."
        ),
    )
    args = parser.parse_args()

    enable_sticky = not args.no_sticky
    # Fixed marker the canonical TestSticky::test_echo_header_round_trip
    # captures + replays. Operators wiring up real deployments use
    # vgi_rpc.http.fly.fly_sticky_echo_headers() or their own mapping;
    # this constant exists only to give the conformance group a stable
    # contract to exercise.
    sticky_echo_headers: dict[str, str] | None = (
        None if args.no_sticky_echo or not enable_sticky else {"x-vgi-conformance-echo": "conformance-fixed-marker"}
    )

    if not args.fake_storage:
        # Plain HTTP server, no external storage.
@@ -90,7 +108,11 @@ def main() -> None:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            s.bind((args.host, 0))
            port = int(s.getsockname()[1])
        app = make_wsgi_app(server, enable_sticky=True)
        app = make_wsgi_app(
            server,
            enable_sticky=True,
            sticky_echo_headers=sticky_echo_headers,
        )
        try:
            import waitress
        except ImportError:
@@ -133,6 +155,7 @@ def main() -> None:
max_request_bytes=max_request_bytes,
max_upload_bytes=64 * 1024 * 1024,
enable_sticky=enable_sticky,
sticky_echo_headers=sticky_echo_headers,
)

try: