OpenAI API-compatible gateway for the ubq.fi ecosystem (Deno Deploy).
- Clients authenticate to `ai.ubq.fi` with a UBQ gateway token: `Authorization: Bearer <token>`.
  - Accepted tokens come from `UOS_AI_TOKEN` and/or API keys stored in Deno KV (created via `/admin/api-keys`).
  - Admin tokens (including Deno Deploy tokens) also grant access to client routes (`/v1/*`).
- The gateway does not use or forward your client token upstream.
  - For upstream requests, it uses Codex CLI ChatGPT auth from `CODEX_AUTH_JSON_B64` (base64 of `~/.codex/auth.json`).
  - Upstream usage/limits are tied to that OpenAI account + plan; client-provided OpenAI API keys are ignored.
  - The OAuth `client_id` used for refresh-token rotation is public (not a secret); the secrets are the tokens in `CODEX_AUTH_JSON_B64` and your client/admin tokens.

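The access rules above can be sketched as a small routing check. This is an illustration only, not the gateway's actual code: `extractBearer`, `mayAccess`, and `TokenKind` are hypothetical names, and treating every path outside `/admin/*` and `/v1/*` as public is an assumption based on the health examples, which send no `Authorization` header.

```typescript
// Hypothetical names throughout; not taken from the gateway source.
type TokenKind = "client" | "admin";

// Pull the token out of an `Authorization: Bearer <token>` header value.
function extractBearer(header: string | null): string | null {
  if (!header) return null;
  const match = /^Bearer\s+(.+)$/i.exec(header.trim());
  const token = match?.[1];
  return token ? token.trim() : null;
}

// Decide whether a token of the given kind may call a path.
// Admin tokens work on /admin/* and /v1/*; client tokens only on /v1/*.
function mayAccess(kind: TokenKind, path: string): boolean {
  if (path.startsWith("/admin/")) return kind === "admin";
  if (path.startsWith("/v1/")) return true;
  return true; // assumption: everything else (e.g. /health) is public
}
```

Note that the client token is only ever used for this check; per the bullets above, upstream calls carry the Codex auth instead.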
Set a gateway token:
export UOS_AI_TOKEN="..."

Create one (admin):
curl -sS https://ai.ubq.fi/admin/api-keys \
-H "Authorization: Bearer $DENO_DEPLOY_TOKEN" \
-H "Content-Type: application/json" \
--data '{"name":"example key","expires_at_ms":-1}' \
| jq -r .token

Health:
curl -sS https://ai.ubq.fi/health
curl -sS https://ai.ubq.fi/health/upstream
curl -sS https://ai.ubq.fi/health/auth

List models:
curl -sS https://ai.ubq.fi/v1/models \
-H "Authorization: Bearer $UOS_AI_TOKEN"

Whoami (debug which auth method was used; never returns raw secrets):
curl -sS https://ai.ubq.fi/v1/auth \
-H "Authorization: Bearer $UOS_AI_TOKEN" \
| jq

Chat completion (OpenAI-compatible):
curl -sS https://ai.ubq.fi/v1/chat/completions \
-H "Authorization: Bearer $UOS_AI_TOKEN" \
-H "Content-Type: application/json" \
--data '{
"model": "gpt-5.1-codex-mini",
"reasoning_effort": "high",
"messages": [{"role":"system","content":"You are a helpful assistant."},{"role":"user","content":"Tell me a short joke."}],
"stream": false
}'

Just the assistant message text:
curl -sS https://ai.ubq.fi/v1/chat/completions \
-H "Authorization: Bearer $UOS_AI_TOKEN" \
-H "Content-Type: application/json" \
--data '{"model":"gpt-5.2-chat-latest","messages":[{"role":"system","content":"You are a helpful assistant."},{"role":"user","content":"Tell me a short joke."}],"stream":false}' \
| jq -r '.choices[0].message.content'

Notes:
- System/developer messages are optional. When present, the gateway combines them into upstream instructions.
- The Codex upstream requires `stream: true`; when you set `"stream": false`, the gateway buffers the upstream stream and returns a normal JSON response.
- Chat completions require `model` (per the OpenAI API). Responses allow omitting `model`; the gateway falls back to its configured default.
- Use `reasoning_effort` for chat completions or `reasoning` for responses to control reasoning level.

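To make the buffering note concrete, here is a minimal sketch of collapsing a server-sent-event stream into one non-streaming reply. It assumes the stream uses the OpenAI chat-completions chunk shape (`data: {...}` lines ending with `data: [DONE]`); the Codex upstream has its own event format, and `bufferChatStream` and `toChatCompletion` are hypothetical names.

```typescript
// Assumed chunk shape: OpenAI `chat.completion.chunk` events.
interface ChatChunk {
  choices?: { delta?: { content?: string } }[];
}

// Concatenate every delta.content fragment from an SSE body into one string.
function bufferChatStream(sseBody: string): string {
  let text = "";
  for (const rawLine of sseBody.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (!line.startsWith("data:")) continue; // skip blanks, comments, other fields
    const payload = line.slice("data:".length).trim();
    if (payload === "[DONE]") break; // end-of-stream sentinel
    const chunk = JSON.parse(payload) as ChatChunk;
    text += chunk.choices?.[0]?.delta?.content ?? "";
  }
  return text;
}

// Wrap the buffered text in a minimal non-streaming response body.
function toChatCompletion(model: string, content: string) {
  return {
    object: "chat.completion",
    model,
    choices: [
      { index: 0, message: { role: "assistant", content }, finish_reason: "stop" },
    ],
  };
}
```

The client-side `jq -r '.choices[0].message.content'` filter shown earlier reads exactly the field this wrapper fills in.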
Streaming:
curl -N https://ai.ubq.fi/v1/chat/completions \
-H "Authorization: Bearer $UOS_AI_TOKEN" \
-H "Content-Type: application/json" \
--data '{
"model": "gpt-5.1-codex-mini",
"stream": true,
"messages": [{"role":"system","content":"You are a helpful assistant."},{"role":"user","content":"Say hello in 5 different ways."}]
}'

Responses (OpenAI-compatible):
curl -sS https://ai.ubq.fi/v1/responses \
-H "Authorization: Bearer $UOS_AI_TOKEN" \
-H "Content-Type: application/json" \
--data '{
"model": "gpt-5.2-chat-latest",
"reasoning": { "effort": "high" },
"instructions": "You are a helpful assistant.",
"input": "Summarize this in 1 sentence: ..."
}'

Embeddings (OpenAI-compatible):
curl -sS https://ai.ubq.fi/v1/embeddings \
-H "Authorization: Bearer $UOS_AI_TOKEN" \
-H "Content-Type: application/json" \
--data '{"model":"text-embedding-3-small","input":"hello"}'

Notes:
- `input` can be a string or an array of strings (batching is strongly recommended).
- Backed by Voyage (`voyage-4-large`) and cached in Deno KV. The cache is quota-driven: it keeps writing until KV is full, then evicts the oldest entries (FIFO) and retries.
- Embedding dimensionality may differ from OpenAI because the vectors come from Voyage.
- When rate limited (by Voyage or the gateway's own KV throttling), the gateway returns `429` with `Retry-After`; clients should retry (or use the async jobs API below).

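The quota-driven FIFO behaviour described above can be illustrated with an in-memory stand-in. This is a sketch under stated assumptions, not the gateway's KV code: `QuotaFifoCache` is a hypothetical class, the byte quota replaces the real KV storage limit, and 8 bytes per number is only a rough size estimate.

```typescript
// Hypothetical in-memory stand-in for a quota-limited store such as Deno KV.
class QuotaFifoCache {
  private entries = new Map<string, number[]>(); // Map keeps insertion order
  private usedBytes = 0;
  private quotaBytes: number;

  constructor(quotaBytes: number) {
    this.quotaBytes = quotaBytes;
  }

  private static sizeOf(vector: number[]): number {
    return vector.length * 8; // rough cost: 8 bytes per float64
  }

  // Write the entry; while it does not fit, evict the oldest entry and retry.
  set(key: string, vector: number[]): boolean {
    const size = QuotaFifoCache.sizeOf(vector);
    if (size > this.quotaBytes) return false; // can never fit
    this.remove(key); // overwriting re-inserts at the newest position
    while (this.usedBytes + size > this.quotaBytes) {
      const oldest = this.entries.keys().next().value;
      if (oldest === undefined) break;
      this.remove(oldest);
    }
    this.entries.set(key, vector);
    this.usedBytes += size;
    return true;
  }

  get(key: string): number[] | undefined {
    return this.entries.get(key);
  }

  private remove(key: string): void {
    const existing = this.entries.get(key);
    if (existing === undefined) return;
    this.usedBytes -= QuotaFifoCache.sizeOf(existing);
    this.entries.delete(key);
  }
}
```

Because eviction follows insertion order rather than access order, a frequently read embedding can still be evicted; that is the trade-off of FIFO over LRU.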
Embeddings jobs (async, gateway-specific):
Use this when you might exceed Voyage's free-tier rate limits. The gateway will either return the embeddings immediately or queue the request and let you poll for completion.
Create:
curl -sS https://ai.ubq.fi/v1/embeddings/jobs \
-H "Authorization: Bearer $UOS_AI_TOKEN" \
-H "Content-Type: application/json" \
--data '{"model":"text-embedding-3-small","input":["hello","world"]}' \
| jq

Poll:
job_id="$(curl -sS https://ai.ubq.fi/v1/embeddings/jobs \
-H "Authorization: Bearer $UOS_AI_TOKEN" \
-H "Content-Type: application/json" \
--data '{"model":"text-embedding-3-small","input":"hello"}' \
| jq -r .id)"
curl -sS "https://ai.ubq.fi/v1/embeddings/jobs/${job_id}" \
-H "Authorization: Bearer $UOS_AI_TOKEN" \
| jq

Notes:
- Jobs currently support `encoding_format="float"` only.
- When queued, the gateway responds `202` with `Retry-After` and `retry_after_seconds`; poll until `status="succeeded"`.
- Jobs are scoped to the authenticated client identity; poll using credentials that resolve to the same identity scope used to create the job (same API key, or for GitHub/kernel auth the same `{owner, repo}` attestation context).
- Inputs are stored encrypted in Deno KV for up to 24h to allow deferred processing, and deleted once the job completes.

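A client-side polling loop for the jobs API might look like the sketch below. Only `status="succeeded"`, `Retry-After`, and `retry_after_seconds` come from the notes above; the helper names, the `"queued"` status value used in the usage example, the one-second fallback, and the attempt cap are assumptions. The HTTP call and the sleep are injected so the loop stays independent of any particular fetch API.

```typescript
interface JobView {
  status?: string;
  retry_after_seconds?: number;
}

interface PollStep {
  done: boolean;
  waitMs: number;
}

// Decide what to do after one poll response.
// Prefers the body's retry_after_seconds, then a numeric Retry-After header,
// then a fallback. A date-valued Retry-After parses to NaN and uses the fallback.
function nextPollStep(job: JobView, retryAfterHeader: string | null, fallbackMs = 1000): PollStep {
  if (job.status === "succeeded") return { done: true, waitMs: 0 };
  const fromBody = job.retry_after_seconds;
  const fromHeader = retryAfterHeader === null ? NaN : Number(retryAfterHeader);
  const seconds =
    typeof fromBody === "number" && fromBody >= 0
      ? fromBody
      : Number.isFinite(fromHeader) && fromHeader >= 0
        ? fromHeader
        : fallbackMs / 1000;
  return { done: false, waitMs: Math.ceil(seconds * 1000) };
}

// Poll until the job succeeds or the attempt cap is reached.
async function pollJob(
  getJob: () => Promise<{ job: JobView; retryAfter: string | null }>,
  sleep: (ms: number) => Promise<void>,
  maxAttempts = 30,
): Promise<JobView> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const { job, retryAfter } = await getJob();
    const step = nextPollStep(job, retryAfter);
    if (step.done) return job;
    await sleep(step.waitMs);
  }
  throw new Error("job did not succeed within maxAttempts polls");
}
```

For example, `nextPollStep({ status: "queued", retry_after_seconds: 2 }, "5")` waits 2000 ms because the body value takes precedence over the header.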
Run from this repo:
cd lib/ai.ubq.fi
export UOS_AI_TOKEN="..."
deno task ubq-ai chat --system "You are a helpful assistant." "Tell me a short joke."

Client commands also accept an admin token (`DENO_DEPLOY_TOKEN`) when `UOS_AI_TOKEN` is unset.

Install on your machine:
cd lib/ai.ubq.fi
deno install -g --allow-env --allow-net --allow-read -n ubq-ai scripts/ubq-ai.ts

Examples:
export UOS_AI_TOKEN="..."
ubq-ai whoami | jq
ubq-ai models | jq
ubq-ai chat --system "You are a helpful assistant." "Tell me a short joke."
ubq-ai chat --system "You are a helpful assistant." --reasoning-effort high "Solve: 24*7."
ubq-ai chat --system "You are a helpful assistant." --stream "Say hello in 5 different ways."
ubq-ai responses --instructions "You are a helpful assistant." "Summarize this in 1 sentence: ..."
ubq-ai responses --instructions "You are a helpful assistant." --reasoning-effort high "Write a short proof sketch for the pigeonhole principle."

Debug (prints useful env/token fingerprints to stderr, never raw secrets):
ubq-ai -v models

Health probe (cron-friendly):
deno task health:check --url https://ai.ubq.fi
# auth-only (does not consume chat tokens):
deno task health:check --url https://ai.ubq.fi --auth
# or: deno run --allow-net scripts/health-check.ts --url https://ai.ubq.fi --json --auth

Admin examples (uses `DENO_DEPLOY_TOKEN`):
export DENO_DEPLOY_TOKEN="..."
ubq-ai admin upload-auth --auth-json ~/.codex/auth.json | jq
ubq-ai admin keys create "example key"
ubq-ai admin keys create "tmp key" --expires week
ubq-ai admin keys list | jq

Environment variables:
- `CODEX_AUTH_JSON_B64` (required): base64 of `~/.codex/auth.json` from a machine that ran `codex login`.
- `UOS_AI_TOKEN` (optional): Comma- or newline-separated client tokens accepted via `Authorization: Bearer ...`. The gateway can also accept API keys stored in Deno KV (created via `/admin/api-keys`).
- `DENO_DEPLOY_TOKEN` (optional, recommended): Tokens accepted for admin endpoints.
- `CODEX_BASE_URL` (optional): Defaults to `https://chatgpt.com/backend-api/codex`.
- `VOYAGEAI_API_KEY` (optional): Voyage API key used for embeddings. If unset, the gateway will look for a key stored in Deno KV at `["uos_ai","voyage_api_key"]`.
- `CORS_ALLOW_ORIGIN` (optional): Defaults to `*`.
- `UOS_API_KEY_DEFAULT_USAGE_LIMIT` (optional): Default usage limit for new API keys in requests/week. Defaults to `50`.
- `UOS_API_KEY_DEFAULT_EXPIRY_DAYS` (optional): Default expiration for new API keys in days. Defaults to `90`.

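As a rough illustration of how these variables could be read, the sketch below parses the comma- or newline-separated token list and applies the documented defaults. `parseTokenList`, `parsePositiveInt`, and `loadConfig` are hypothetical helpers, and ignoring non-positive or non-integer limits in favour of the default is an assumption, not confirmed gateway behaviour.

```typescript
// Split a comma- or newline-separated token list, dropping blanks and duplicates.
function parseTokenList(raw: string | undefined): string[] {
  if (!raw) return [];
  const tokens = raw
    .split(/[,\r\n]/)
    .map((t) => t.trim())
    .filter((t) => t.length > 0);
  return Array.from(new Set(tokens));
}

// Read a positive integer, falling back when the value is missing or invalid.
function parsePositiveInt(raw: string | undefined, fallback: number): number {
  if (raw === undefined) return fallback;
  const n = Number(raw);
  return Number.isInteger(n) && n > 0 ? n : fallback;
}

// Build a config object from an env-like record (e.g. Deno.env.toObject()).
function loadConfig(env: Record<string, string | undefined>) {
  return {
    clientTokens: parseTokenList(env["UOS_AI_TOKEN"]),
    codexBaseUrl: env["CODEX_BASE_URL"] ?? "https://chatgpt.com/backend-api/codex",
    corsAllowOrigin: env["CORS_ALLOW_ORIGIN"] ?? "*",
    defaultUsageLimit: parsePositiveInt(env["UOS_API_KEY_DEFAULT_USAGE_LIMIT"], 50),
    defaultExpiryDays: parsePositiveInt(env["UOS_API_KEY_DEFAULT_EXPIRY_DAYS"], 90),
  };
}
```

Passing the environment in as a plain record keeps the parsing logic easy to exercise without touching real process state.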
`POST /admin/codex/auth` validates your posted `auth.json` against the upstream Codex endpoint and, if valid, stores the tokens in Deno KV
(becoming the active upstream auth for subsequent requests). The gateway then fetches the upstream Codex model catalog
and stores a snapshot in KV for fallback use if upstream is temporarily unavailable.
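The snapshot fallback can be pictured as follows. This is a simplified, synchronous sketch with hypothetical names (`ModelCatalog`, `resolveCatalog`); the real gateway reads from Deno KV and calls the upstream asynchronously.

```typescript
interface ModelCatalog {
  ids: string[];
  fetchedAtMs: number;
}

// Prefer the live catalog; fall back to the stored snapshot when the fetch throws.
function resolveCatalog(
  fetchLive: () => ModelCatalog,
  snapshot: ModelCatalog | null,
): ModelCatalog {
  try {
    return fetchLive();
  } catch (err) {
    if (snapshot) return snapshot;
    throw err; // nothing to fall back to
  }
}
```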
Treat auth.json as a secret (it contains refresh tokens). Use the repo helper CLI:
cd lib/ai.ubq.fi
export DENO_DEPLOY_TOKEN="..."
deno task upload:auth --url https://ai.ubq.fi

The helper CLI uses `DENO_DEPLOY_TOKEN` and extracts the Codex CLI model list from your local `codex` binary (so
`/v1/models` reflects Codex-native IDs). Use `--codex-bin` to point at a specific binary if it is not on your `PATH`.

API keys are stored in Deno KV (hashed) and are only returned once on creation. Keys are prefixed with u_ for easy
identification.
Default Limits:
- Expiration: 90 days (can be overridden with `--expires` or `--expires-at-ms`)
- Usage Limit: 50 requests/week (can be overridden with `--usage-limit`)
- Reset Period: Weekly (7 days, automatic)

Expiration:
- `expires_at_ms` is a Unix epoch millisecond timestamp; `-1` means "does not expire".
- Expired keys are rejected like revoked keys.

Usage Limits:
- `usage_limit_requests` sets maximum requests per week; `-1` means unlimited.
- `usage_requests` tracks current usage; resets automatically every 7 days.
- `usage_reset_at_ms` is the next reset timestamp.
- Rate limit errors (429) include reset time in the message.

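The expiry and weekly-limit rules above could be enforced roughly as in this sketch. The field names come from the list above; `authorizeAndCount`, the `Decision` type, the `401` status for expired keys, and the exact error wording are assumptions rather than the gateway's confirmed behaviour.

```typescript
const WEEK_MS = 7 * 24 * 60 * 60 * 1000;

interface ApiKeyRecord {
  expires_at_ms: number; // -1 = does not expire
  usage_limit_requests: number; // -1 = unlimited
  usage_requests: number; // requests counted in the current window
  usage_reset_at_ms: number; // when the current window ends
}

type Decision =
  | { ok: true }
  | { ok: false; status: 401 | 429; message: string };

// Check expiry, roll the weekly window if due, then count the request.
// Mutates `key` in place; a real store would persist the updated record.
function authorizeAndCount(key: ApiKeyRecord, nowMs: number): Decision {
  if (key.expires_at_ms !== -1 && nowMs >= key.expires_at_ms) {
    return { ok: false, status: 401, message: "API key expired" }; // status is an assumption
  }
  if (nowMs >= key.usage_reset_at_ms) {
    key.usage_requests = 0;
    key.usage_reset_at_ms = nowMs + WEEK_MS;
  }
  if (key.usage_limit_requests !== -1 && key.usage_requests >= key.usage_limit_requests) {
    const resetAt = new Date(key.usage_reset_at_ms).toISOString();
    return { ok: false, status: 429, message: `Usage limit reached; resets at ${resetAt}` };
  }
  key.usage_requests += 1;
  return { ok: true };
}
```

Resetting lazily on the next request avoids any background job: the counter only needs to be correct at the moment a key is used.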
Create (token only):
cd lib/ai.ubq.fi
export DENO_DEPLOY_TOKEN="..."
deno task ubq-ai admin keys create "example key"

Create (expires in a week):
deno task ubq-ai admin keys create "tmp key" --expires week

Create (with custom usage limit):
deno task ubq-ai admin keys create "high-volume key" --usage-limit 1000

Create (unlimited usage):
deno task ubq-ai admin keys create "unlimited key" --usage-limit unlimited

Kernel-attested GitHub token auth is tracked per `owner/repo` and can also be limited per org (`owner`). Usage is
unlimited by default until an admin sets a per-repo or per-org limit. Limits reset weekly unless `window_ms`
is provided.

Get repo usage/limit (admin):
export DENO_DEPLOY_TOKEN="..."
curl -sS "https://ai.ubq.fi/admin/kernel-usage?owner=acme&repo=demo" \
-H "Authorization: Bearer $DENO_DEPLOY_TOKEN" \
| jq

Get org usage/limit (admin):
export DENO_DEPLOY_TOKEN="..."
curl -sS "https://ai.ubq.fi/admin/kernel-usage?owner=acme&scope=org" \
-H "Authorization: Bearer $DENO_DEPLOY_TOKEN" \
| jq

Set repo limit (admin):
export DENO_DEPLOY_TOKEN="..."
curl -sS https://ai.ubq.fi/admin/kernel-usage \
-H "Authorization: Bearer $DENO_DEPLOY_TOKEN" \
-H "Content-Type: application/json" \
--data-binary '{"owner":"acme","repo":"demo","usage_limit_requests":500,"reset_usage":true}' \
| jq

Set org limit (admin) with 1 request per minute:
export DENO_DEPLOY_TOKEN="..."
curl -sS https://ai.ubq.fi/admin/kernel-usage \
-H "Authorization: Bearer $DENO_DEPLOY_TOKEN" \
-H "Content-Type: application/json" \
--data-binary '{"owner":"acme","scope":"org","usage_limit_requests":1,"window_ms":60000,"reset_usage":true}' \
| jq

CLI:
deno task ubq-ai admin kernel-usage get --owner acme --repo demo
deno task ubq-ai admin kernel-usage get --owner acme --scope org
deno task ubq-ai admin kernel-usage set --owner acme --repo demo --usage-limit 500 --reset-usage
deno task ubq-ai admin kernel-usage set --owner acme --scope org --usage-limit 1 --window-ms 60000 --reset-usage

List (admin):
deno task ubq-ai admin keys list

Revoke (admin):
deno task ubq-ai admin keys revoke --id "<id>"

Endpoints:
- `GET /` and `GET /health`
- `GET /health/auth` (Codex auth refresh check; no chat tokens used)
- `GET /health/upstream` (Codex chat probe)
- `POST /admin/codex/auth` (admin only)
- `POST /admin/api-keys` (admin only)
- `GET /admin/api-keys` (admin only)
- `POST /admin/api-keys/revoke` (admin only)
- `GET /admin/kernel-usage` (admin only)
- `POST /admin/kernel-usage` (admin only)
- `GET /v1/auth`
- `GET /v1/models`
- `POST /v1/chat/completions` (streaming and non-streaming)
- `POST /v1/responses` (streaming and non-streaming; non-streaming buffers upstream SSE)

Local development:
export UOS_AI_TOKEN="dev-token"
export CODEX_AUTH_JSON_B64="$(base64 < ~/.codex/auth.json | tr -d '\n')"
deno task dev