34 changes: 34 additions & 0 deletions .dockerignore
@@ -0,0 +1,34 @@
# Secrets and local environment
.env
.env.*
!.env.example

# VCS and local agent/editor state
.git/
.gitignore
.claude/
.codex
.codex/
.vscode/
.idea/

# Python/test/build artifacts
__pycache__/
*.py[cod]
*$py.class
*.egg-info/
.pytest_cache/
.mypy_cache/
.coverage
dist/
build/

# Runtime/user data
data/sources/*
!data/sources/.gitkeep
!data/sources/README.md

# OS noise
.DS_Store
Thumbs.db
*.stackdump
8 changes: 5 additions & 3 deletions .env.example
@@ -17,8 +17,10 @@ HF_TOKEN=
# (currently: nim, deepinfra, vllm_workstation, codex, openai_compatible).
LLM_PROFILE=deepinfra

# Provider API keys — fill in whichever profile you're using.
# Get a NIM key at https://build.nvidia.com/, a DeepInfra key at https://deepinfra.com/
# Provider API keys — fill in whichever hosted profile you're using.
# Get an OpenAI key for Codex, a NIM key at https://build.nvidia.com/,
# or a DeepInfra key at https://deepinfra.com/
OPENAI_API_KEY=
NVIDIA_NIM_API_KEY=
DEEPINFRA_API_KEY=

@@ -28,7 +30,7 @@ DEEPINFRA_API_KEY=
VLLM_API_KEY=

# Optional: override the profile's default model string (e.g. to try a smaller variant).
# Leave blank to use the profile's built-in default.
# Leave blank to use the profile's built-in default. Required for openai_compatible/local_ollama.
AGENT_MODEL=

# Agent verbosity — when true, every tool call is logged to stdout
2 changes: 2 additions & 0 deletions .gitignore
@@ -5,6 +5,8 @@

# Claude Code — settings.json, scheduled_tasks.lock, and any future state
.claude/
.codex
.codex/

# Python
__pycache__/
3 changes: 2 additions & 1 deletion BRAINDB_GUIDE.md
@@ -306,7 +306,7 @@ curl -X POST http://localhost:8000/api/v1/entities/datasources/ingest \

### BrainDB Agent — natural language queries

`POST /api/v1/agent/query` — instead of orchestrating individual API calls, send a plain English request and let BrainDB's internal agent handle it. The agent uses the OpenAI Agents SDK with LiteLLM (provider pluggable via `LLM_PROFILE` — default `deepinfra`, `nim` also supported) and has access to all 21 BrainDB operations as function tools.
`POST /api/v1/agent/query` — instead of orchestrating individual API calls, send a plain English request and let BrainDB's internal agent handle it. The agent uses the OpenAI Agents SDK with LiteLLM (provider pluggable via `LLM_PROFILE` — default `deepinfra`, with `nim`, `codex`, and generic OpenAI-compatible local endpoints also supported) and has access to all 21 BrainDB operations as function tools.

```bash
curl -X POST http://localhost:8000/api/v1/agent/query \
```
@@ -340,6 +340,7 @@ The agent has these tools internally: `recall_memory`, `quick_search`, `save_fac
- **Self-hosted vLLM**: set `LLM_PROFILE=vllm_workstation` for a vLLM server bound to the Docker host's loopback at `:8002`. No API key needed if the server runs without auth. See [CONTRIBUTING.md](CONTRIBUTING.md) for how to add your own self-hosted profile.
- Profiles live in `braindb/config.py::_LLM_PROFILES`. Add new providers there (e.g. `together`, `openai`) by adding a dict entry — no code change required.
- Optional override: set `AGENT_MODEL=` in `.env` to use a non-default model for the active profile.
- Optional auth override: set `AGENT_API_KEY=` only if your OpenAI-compatible endpoint requires auth; copilot-api and Ollama can run without it when local auth is disabled (see the example below).
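
For example, a minimal `.env` for a local Ollama endpoint (the host placeholder and model mirror the README's suggestion):

```
LLM_PROFILE=openai_compatible
AGENT_BASE_URL=http://<ollama-host>:11434/v1
AGENT_MODEL=openai/llama3.2:3b
AGENT_API_KEY=          # leave blank when the endpoint runs without auth
```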

**Verbose logging**: set `AGENT_VERBOSE=true` in `.env` to log every tool call to stdout (visible via `docker logs braindb_api -f`). The HTTP response stays clean — only `answer` and `max_turns`.

4 changes: 2 additions & 2 deletions CLAUDE.md
@@ -127,7 +127,7 @@ curl -s -X DELETE http://localhost:8000/api/v1/entities/<UUID>

**Direct API** (what's shown above) — call individual endpoints yourself. Full control, more verbose context. Good when you want to be precise about what's saved or recalled.

**Agent endpoint** — `POST /api/v1/agent/query` — send a natural language request and let BrainDB's internal agent handle it. The agent (LiteLLM with pluggable provider via `LLM_PROFILE` — default `deepinfra/google/gemma-4-31B-it`, NIM also supported) has all 21 BrainDB operations as tools. Cleaner conversation context, but slower (5-30 seconds for a query).
**Agent endpoint** — `POST /api/v1/agent/query` — send a natural language request and let BrainDB's internal agent handle it. The agent (LiteLLM with a pluggable provider via `LLM_PROFILE` — default `deepinfra/google/gemma-4-31B-it`; NIM, Codex, and generic OpenAI-compatible endpoints such as copilot-api or Ollama are also supported) has all 21 BrainDB operations as tools. Cleaner conversation context, but slower (5-30 seconds for a query).

```bash
# Recall via the agent
```
@@ -156,7 +156,7 @@ When debugging the agent: set `AGENT_VERBOSE=true` in `.env` and watch `docker l

## Important Notes

- `.env` contains real DB credentials and provider API keys (`DEEPINFRA_API_KEY`, `NVIDIA_NIM_API_KEY`, etc.) — **never commit it**, it is in `.gitignore`. Active provider is picked by `LLM_PROFILE` (see `braindb/config.py::_LLM_PROFILES`).
- `.env` contains real DB credentials and provider API keys (`DEEPINFRA_API_KEY`, `NVIDIA_NIM_API_KEY`, `OPENAI_API_KEY`, `AGENT_API_KEY`, etc.) — **never commit it**, it is in `.gitignore`. Active provider is picked by `LLM_PROFILE` (see `braindb/config.py::_LLM_PROFILES`).
- Always-on rules (priority 100, `always_on: true`) are returned on every `/memory/context` call
- `notes` field on any entity or relation is for running commentary — append observations over time
- Keywords are stored as both a `TEXT[]` column on the entity AND as separate keyword entities linked via `tagged_with` relations (the keyword entities carry the embeddings for semantic search)
6 changes: 3 additions & 3 deletions CONTRIBUTING.md
@@ -10,7 +10,7 @@ Prerequisites: Docker Desktop (or any Docker Engine), Python 3.12, a Postgres 16
git clone <repo-url> braindb
cd braindb
cp .env.example .env
# edit .env — set DATABASE_URL, pick an LLM_PROFILE, fill in the matching API key
# edit .env — set DATABASE_URL, pick an LLM_PROFILE, then fill in the matching API key or point AGENT_BASE_URL at an OpenAI-compatible endpoint

docker network create local-network # one-time; docker-compose expects this
docker compose up -d --build
@@ -45,8 +45,8 @@ LiteLLM does the heavy lifting — providers are selected by a prefix in the mod
```
    "api_key_env": "MY_PROVIDER_API_KEY",
},
```
2. Add `MY_PROVIDER_API_KEY=` to [`.env.example`](.env.example).
3. Add the env passthrough to [`docker-compose.yml`](docker-compose.yml) under the `api` service.
2. Add `MY_PROVIDER_API_KEY=` to [`.env.example`](.env.example) if the provider needs auth.
3. Add the env passthrough to [`docker-compose.yml`](docker-compose.yml) under the `api` service. OpenAI-compatible endpoints can use `LLM_PROFILE=openai_compatible` plus `AGENT_BASE_URL` / `AGENT_API_KEY` variables.
4. (Optional) Document the provider in the README and BRAINDB_GUIDE.

No other code changes required — the agent resolves model and key through `settings.resolved_agent_model` and `settings.resolved_api_key`, which read the active profile.
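
For orientation, a minimal sketch of how that resolution can work. Only `resolved_agent_model` / `resolved_api_key` and the profile-dict shape come from this repo's docs; everything else here is an assumption, not the actual `braindb/config.py`:

```python
import os

# Illustrative profile table: the shape mirrors the snippet above; the
# deepinfra model string is the documented default.
_LLM_PROFILES = {
    "deepinfra": {
        "model": "deepinfra/google/gemma-4-31B-it",
        "api_key_env": "DEEPINFRA_API_KEY",
    },
    "my_provider": {
        "model": "my_provider/some-model",      # LiteLLM routes by this prefix
        "api_key_env": "MY_PROVIDER_API_KEY",
    },
}


class Settings:
    """Hypothetical stand-in for the real settings object."""

    llm_profile: str = os.getenv("LLM_PROFILE", "deepinfra")

    @property
    def resolved_agent_model(self) -> str:
        # AGENT_MODEL, when set, overrides the active profile's default model.
        return os.getenv("AGENT_MODEL") or _LLM_PROFILES[self.llm_profile]["model"]

    @property
    def resolved_api_key(self) -> str | None:
        # Profiles for unauthenticated endpoints can omit api_key_env.
        env_name = _LLM_PROFILES[self.llm_profile].get("api_key_env")
        return os.getenv(env_name) if env_name else None
```
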
42 changes: 34 additions & 8 deletions README.md
@@ -72,18 +72,28 @@ Any reachable hostname/IP works — the connecting user just needs network acces

### 4. Pick an LLM provider (for the internal agent)

The agent talks to any LiteLLM-supported backend. BrainDB ships with two profiles pre-configured: **DeepInfra** (default, fast, paid) and **NVIDIA NIM** (free tier, can be flaky).
The agent talks to any LiteLLM-supported backend. BrainDB ships with four profiles pre-configured: **DeepInfra** (default, fast, paid), **NVIDIA NIM** (free tier, can be flaky), **Codex** (`gpt-5.3-codex-spark` via OpenAI routing), and **openai_compatible** for local OpenAI-compatible APIs such as copilot-api or Ollama (`local_ollama` remains as a legacy alias).

In `.env`:
```
LLM_PROFILE=deepinfra # or 'nim' — default is 'deepinfra'
LLM_PROFILE=deepinfra # or 'codex'/'nim'/'openai_compatible' — default is 'deepinfra'
DEEPINFRA_API_KEY=... # if profile=deepinfra — get from https://deepinfra.com/
NVIDIA_NIM_API_KEY=... # if profile=nim — get from https://build.nvidia.com/
OPENAI_API_KEY=... # if profile=codex — OpenAI API key for Codex
```

Only the key matching your chosen profile needs to be filled. Leave the other blank or absent.
For a local OpenAI-compatible server such as `copilot-api`:

Adding a third provider (Together, OpenAI, local vLLM, whatever) is a two-line entry in [`braindb/config.py::_LLM_PROFILES`](braindb/config.py) + an env var — no other code changes. See [`CONTRIBUTING.md`](CONTRIBUTING.md) for the recipe.
```
LLM_PROFILE=openai_compatible
AGENT_BASE_URL=http://<host-ip>:4141/v1 # copilot-api default port
AGENT_MODEL=openai/gpt-5-mini
AGENT_API_KEY= # optional; only set if your endpoint requires auth
```

Only the key matching your chosen hosted profile needs to be filled. Leave the others blank or absent. For OpenAI-compatible local endpoints with auth disabled, leave `AGENT_API_KEY` blank.

Adding another hosted provider (Together, OpenAI, local vLLM, whatever) is usually a small entry in [`braindb/config.py::_LLM_PROFILES`](braindb/config.py) + env passthrough — see [`CONTRIBUTING.md`](CONTRIBUTING.md) for the recipe.

### 5. Create the Docker network, then bring the stack up

@@ -110,6 +120,19 @@ API at `http://localhost:8000`. Swagger UI at `http://localhost:8000/docs`. Data

Drop a markdown file into `data/sources/` and the watcher sidecar picks it up within ~7 seconds — see [File Ingestion](#file-ingestion) below.
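
For instance (filename illustrative):

```bash
cp meeting-notes.md data/sources/   # the watcher ingests it within ~7 seconds
```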

### Operational helper

For a safer one-command workflow, use `scripts/braindb-manage.sh`:

```bash
./scripts/braindb-manage.sh start
./scripts/braindb-manage.sh update
./scripts/braindb-manage.sh status
./scripts/braindb-manage.sh logs api
```

It creates `.env` from `.env.example` if needed, ensures the `local-network` Docker network exists, starts/recreates the Compose services, and checks `http://localhost:8000/health`.

---

## Key Endpoints
@@ -162,7 +185,7 @@ Single `query` (string) still works for backward compatibility.
Instead of orchestrating individual API calls, you can talk to BrainDB in plain English via `POST /api/v1/agent/query`. The agent (built on the OpenAI Agents SDK + LiteLLM) decides which tools to call and returns a summary.

```bash
curl -X POST http://localhost:8000/api/v1/agent/query \
-H "Content-Type: application/json" \
-d '{"query":"What do you know about the user role and recent projects?"}'

@@ -173,15 +196,18 @@ The agent has 21 tools — every single BrainDB endpoint plus `delegate_to_subag

**LLM provider — pluggable via `.env`**:

`LLM_PROFILE` selects the backend. Profiles are defined in [braindb/config.py](braindb/config.py) (`_LLM_PROFILES`) — currently `deepinfra` (default, model `google/gemma-4-31B-it`) and `nim` (NVIDIA NIM, model `google/gemma-4-31b-it`). Each profile is a model-prefix + env-var pair; adding a new one is a dict entry.
`LLM_PROFILE` selects the backend. Profiles are defined in [braindb/config.py](braindb/config.py) (`_LLM_PROFILES`) — currently `deepinfra` (default, model `google/gemma-4-31B-it`), `nim` (NVIDIA NIM, model `google/gemma-4-31b-it`), `codex` (OpenAI Codex, model `gpt-5.3-codex-spark`), and `openai_compatible` (generic OpenAI-compatible `/v1` endpoints; `local_ollama` is a legacy alias).

```
LLM_PROFILE=deepinfra # or nim — default is deepinfra
LLM_PROFILE=deepinfra # or codex/nim/openai_compatible — default is deepinfra
DEEPINFRA_API_KEY=... # required if profile=deepinfra (https://deepinfra.com/)
NVIDIA_NIM_API_KEY=... # required if profile=nim (https://build.nvidia.com/)
OPENAI_API_KEY=... # required if profile=codex
AGENT_MODEL= # optional: override the profile's default model
```

For copilot-api, set `AGENT_BASE_URL=http://<host-ip>:4141/v1` and `AGENT_MODEL=openai/gpt-5-mini`. For Ollama, use `AGENT_BASE_URL=http://<ollama-host>:11434/v1` and an Ollama model such as `AGENT_MODEL=openai/llama3.2:3b`. `AGENT_API_KEY` is optional and only needed if your OpenAI-compatible endpoint enforces auth.

**Verbose logging**: set `AGENT_VERBOSE=true` in `.env` to log every tool call (entry args + exit elapsed/result) to stdout, visible via `docker logs braindb_api -f`.

---
Expand Down Expand Up @@ -276,5 +302,5 @@ It's idempotent by content hash — re-calling with the same bytes returns 200 (
- PostgreSQL 16 with `pg_trgm` and `pgvector`
- Alembic migrations
- `sentence-transformers` + `Qwen/Qwen3-Embedding-0.6B` for keyword embeddings
- `openai-agents[litellm]` + LiteLLM for the internal agent (DeepInfra / NIM / others pluggable via `LLM_PROFILE`)
- `openai-agents[litellm]` + LiteLLM for the internal agent (DeepInfra / NIM / Codex / others pluggable via `LLM_PROFILE`)
- Docker Compose — `api` + `watcher` services, external PostgreSQL
47 changes: 47 additions & 0 deletions braindb/agent/fast_path.py
@@ -0,0 +1,47 @@
"""Deterministic fast paths for simple BrainDB agent requests."""
import re
from typing import Any

from braindb.agent.tools import _save_fact_impl, _save_rule_impl

_SAVE_RE = re.compile(r"^\s*Save:\s+(?P<content>.+?)\s*$", re.IGNORECASE | re.DOTALL)
_SAVE_RULE_RE = re.compile(r"^\s*Save as rule:\s+(?P<content>.+?)\s*$", re.IGNORECASE | re.DOTALL)
_MAX_FAST_PATH_CHARS = 2000


def _content_is_safe_for_fast_path(content: str) -> bool:
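    # Guard: content containing "?" is likely a question for the full agent
    # loop, and overly long content is routed there as well.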
return bool(content) and "?" not in content and len(content) <= _MAX_FAST_PATH_CHARS


def try_fast_path(query: str) -> dict[str, Any] | None:
"""Handle simple save requests without invoking the LLM agent loop."""
rule_match = _SAVE_RULE_RE.match(query)
if rule_match:
content = rule_match.group("content").strip()
if not _content_is_safe_for_fast_path(content):
return None
answer = _save_rule_impl(
content=content,
keywords=[],
importance=0.8,
)
status = "fast_path_error" if answer.startswith("ERROR:") else "fast_path"
return {"answer": answer, "max_turns": 0, "status": status}

save_match = _SAVE_RE.match(query)
if save_match:
content = save_match.group("content").strip()
if not _content_is_safe_for_fast_path(content):
return None
answer = _save_fact_impl(
content=content,
keywords=[],
source="user-stated",
certainty=0.9,
importance=0.7,
notes="Saved via agent fast path.",
)
status = "fast_path_error" if answer.startswith("ERROR:") else "fast_path"
return {"answer": answer, "max_turns": 0, "status": status}

return None
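
To illustrate the contract, here is a hypothetical caller; `run_llm_agent_loop` and the endpoint wiring are assumptions, not part of this diff:

```python
from braindb.agent.fast_path import try_fast_path


def run_llm_agent_loop(query: str) -> dict:
    """Placeholder for the full LLM agent loop (assumed, not shown here)."""
    raise NotImplementedError


def handle_agent_query(query: str) -> dict:
    # Deterministic shortcut first; fall back to the LLM loop on None.
    result = try_fast_path(query)
    if result is not None:
        # e.g. {"answer": "...", "max_turns": 0, "status": "fast_path"}
        return result
    return run_llm_agent_loop(query)
```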