34 changes: 34 additions & 0 deletions .dockerignore
@@ -0,0 +1,34 @@
# Secrets and local environment
.env
.env.*
!.env.example

# VCS and local agent/editor state
.git/
.gitignore
.claude/
.codex
.codex/
.vscode/
.idea/

# Python/test/build artifacts
__pycache__/
*.py[cod]
*$py.class
*.egg-info/
.pytest_cache/
.mypy_cache/
.coverage
dist/
build/

# Runtime/user data
data/sources/*
!data/sources/.gitkeep
!data/sources/README.md

# OS noise
.DS_Store
Thumbs.db
*.stackdump
8 changes: 5 additions & 3 deletions .env.example
@@ -17,8 +17,10 @@ HF_TOKEN=
# (currently: nim, deepinfra, vllm_workstation, codex, openai_compatible).
LLM_PROFILE=deepinfra

# Provider API keys — fill in whichever profile you're using.
# Get a NIM key at https://build.nvidia.com/, a DeepInfra key at https://deepinfra.com/
# Provider API keys — fill in whichever hosted profile you're using.
# Get an OpenAI key for Codex, a NIM key at https://build.nvidia.com/,
# or a DeepInfra key at https://deepinfra.com/
OPENAI_API_KEY=
NVIDIA_NIM_API_KEY=
DEEPINFRA_API_KEY=

@@ -28,7 +30,7 @@ DEEPINFRA_API_KEY=
VLLM_API_KEY=

# Optional: override the profile's default model string (e.g. to try a smaller variant).
# Leave blank to use the profile's built-in default.
# Leave blank to use the profile's built-in default. Required for openai_compatible/local_ollama.
AGENT_MODEL=

# Agent verbosity — when true, every tool call is logged to stdout
2 changes: 2 additions & 0 deletions .gitignore
@@ -5,6 +5,8 @@

# Claude Code — settings.json, scheduled_tasks.lock, and any future state
.claude/
.codex
.codex/

# Python
__pycache__/
3 changes: 2 additions & 1 deletion BRAINDB_GUIDE.md
@@ -306,7 +306,7 @@ curl -X POST http://localhost:8000/api/v1/entities/datasources/ingest \

### BrainDB Agent — natural language queries

`POST /api/v1/agent/query` — instead of orchestrating individual API calls, send a plain English request and let BrainDB's internal agent handle it. The agent uses the OpenAI Agents SDK with LiteLLM (provider pluggable via `LLM_PROFILE` — default `deepinfra`, `nim` also supported) and has access to all 21 BrainDB operations as function tools.
`POST /api/v1/agent/query` — instead of orchestrating individual API calls, send a plain English request and let BrainDB's internal agent handle it. The agent uses the OpenAI Agents SDK with LiteLLM (provider pluggable via `LLM_PROFILE` — default `deepinfra`, with `nim`, `codex`, and generic OpenAI-compatible local endpoints also supported) and has access to all 21 BrainDB operations as function tools.

```bash
curl -X POST http://localhost:8000/api/v1/agent/query \
```
@@ -340,6 +340,7 @@ The agent has these tools internally: `recall_memory`, `quick_search`, `save_fac
- **Self-hosted vLLM**: set `LLM_PROFILE=vllm_workstation` for a vLLM server bound to the Docker host's loopback at `:8002`. No API key needed if the server runs without auth. See [CONTRIBUTING.md](CONTRIBUTING.md) for how to add your own self-hosted profile.
- Profiles live in `braindb/config.py::_LLM_PROFILES`. Add new providers there (e.g. `together`, `openai`) by adding a dict entry — no code change required.
- Optional override: set `AGENT_MODEL=` in `.env` to use a non-default model for the active profile.
- Optional auth override: set `AGENT_API_KEY=` only if your OpenAI-compatible endpoint requires auth; copilot-api and Ollama can run without it when local auth is disabled (see the example below).
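
For example, a minimal `.env` for a local Ollama endpoint (the host placeholder and model mirror the README's suggestion):

```
LLM_PROFILE=openai_compatible
AGENT_BASE_URL=http://<ollama-host>:11434/v1
AGENT_MODEL=openai/llama3.2:3b
AGENT_API_KEY=          # leave blank when the endpoint runs without auth
```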

**Verbose logging**: set `AGENT_VERBOSE=true` in `.env` to log every tool call to stdout (visible via `docker logs braindb_api -f`). The HTTP response stays clean — only `answer` and `max_turns`.

4 changes: 2 additions & 2 deletions CLAUDE.md
@@ -127,7 +127,7 @@ curl -s -X DELETE http://localhost:8000/api/v1/entities/<UUID>

**Direct API** (what's shown above) — call individual endpoints yourself. Full control, more verbose context. Good when you want to be precise about what's saved or recalled.

**Agent endpoint** — `POST /api/v1/agent/query` — send a natural language request and let BrainDB's internal agent handle it. The agent (LiteLLM with pluggable provider via `LLM_PROFILE` — default `deepinfra/google/gemma-4-31B-it`, NIM also supported) has all 21 BrainDB operations as tools. Cleaner conversation context, but slower (5-30 seconds for a query).
**Agent endpoint** — `POST /api/v1/agent/query` — send a natural language request and let BrainDB's internal agent handle it. The agent (LiteLLM with a pluggable provider via `LLM_PROFILE` — default `deepinfra/google/gemma-4-31B-it`; NIM, Codex, and generic OpenAI-compatible endpoints such as copilot-api or Ollama are also supported) has all 21 BrainDB operations as tools. Cleaner conversation context, but slower (5-30 seconds for a query).

```bash
# Recall via the agent
```
@@ -156,7 +156,7 @@ When debugging the agent: set `AGENT_VERBOSE=true` in `.env` and watch `docker l

## Important Notes

- `.env` contains real DB credentials and provider API keys (`DEEPINFRA_API_KEY`, `NVIDIA_NIM_API_KEY`, etc.) — **never commit it**, it is in `.gitignore`. Active provider is picked by `LLM_PROFILE` (see `braindb/config.py::_LLM_PROFILES`).
- `.env` contains real DB credentials and provider API keys (`DEEPINFRA_API_KEY`, `NVIDIA_NIM_API_KEY`, `OPENAI_API_KEY`, `AGENT_API_KEY`, etc.) — **never commit it**, it is in `.gitignore`. Active provider is picked by `LLM_PROFILE` (see `braindb/config.py::_LLM_PROFILES`).
- Always-on rules (priority 100, `always_on: true`) are returned on every `/memory/context` call
- `notes` field on any entity or relation is for running commentary — append observations over time
- Keywords are stored as both a `TEXT[]` column on the entity AND as separate keyword entities linked via `tagged_with` relations (the keyword entities carry the embeddings for semantic search)
6 changes: 3 additions & 3 deletions CONTRIBUTING.md
@@ -10,7 +10,7 @@ Prerequisites: Docker Desktop (or any Docker Engine), Python 3.12, a Postgres 16
git clone <repo-url> braindb
cd braindb
cp .env.example .env
# edit .env — set DATABASE_URL, pick an LLM_PROFILE, fill in the matching API key
# edit .env — set DATABASE_URL, pick an LLM_PROFILE, then fill in the matching API key or point AGENT_BASE_URL at an OpenAI-compatible endpoint

docker network create local-network # one-time; docker-compose expects this
docker compose up -d --build
@@ -45,8 +45,8 @@ LiteLLM does the heavy lifting — providers are selected by a prefix in the mod
```
    "api_key_env": "MY_PROVIDER_API_KEY",
},
```
2. Add `MY_PROVIDER_API_KEY=` to [`.env.example`](.env.example).
3. Add the env passthrough to [`docker-compose.yml`](docker-compose.yml) under the `api` service.
2. Add `MY_PROVIDER_API_KEY=` to [`.env.example`](.env.example) if the provider needs auth.
3. Add the env passthrough to [`docker-compose.yml`](docker-compose.yml) under the `api` service. OpenAI-compatible endpoints can use `LLM_PROFILE=openai_compatible` plus `AGENT_BASE_URL` / `AGENT_API_KEY` variables.
4. (Optional) Document the provider in the README and BRAINDB_GUIDE.

No other code changes required — the agent resolves model and key through `settings.resolved_agent_model` and `settings.resolved_api_key`, which read the active profile.
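
For orientation, a minimal sketch of how that resolution can work. Only `resolved_agent_model` / `resolved_api_key` and the profile-dict shape come from this repo's docs; everything else here is an assumption, not the actual `braindb/config.py`:

```python
import os

# Illustrative profile table: the shape mirrors the snippet above; the
# deepinfra model string is the documented default.
_LLM_PROFILES = {
    "deepinfra": {
        "model": "deepinfra/google/gemma-4-31B-it",
        "api_key_env": "DEEPINFRA_API_KEY",
    },
    "my_provider": {
        "model": "my_provider/some-model",      # LiteLLM routes by this prefix
        "api_key_env": "MY_PROVIDER_API_KEY",
    },
}


class Settings:
    """Hypothetical stand-in for the real settings object."""

    llm_profile: str = os.getenv("LLM_PROFILE", "deepinfra")

    @property
    def resolved_agent_model(self) -> str:
        # AGENT_MODEL, when set, overrides the active profile's default model.
        return os.getenv("AGENT_MODEL") or _LLM_PROFILES[self.llm_profile]["model"]

    @property
    def resolved_api_key(self) -> str | None:
        # Profiles for unauthenticated endpoints can omit api_key_env.
        env_name = _LLM_PROFILES[self.llm_profile].get("api_key_env")
        return os.getenv(env_name) if env_name else None
```
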
42 changes: 34 additions & 8 deletions README.md
@@ -72,18 +72,28 @@ Any reachable hostname/IP works — the connecting user just needs network acces

### 4. Pick an LLM provider (for the internal agent)

The agent talks to any LiteLLM-supported backend. BrainDB ships with two profiles pre-configured: **DeepInfra** (default, fast, paid) and **NVIDIA NIM** (free tier, can be flaky).
The agent talks to any LiteLLM-supported backend. BrainDB ships with four profiles pre-configured: **DeepInfra** (default, fast, paid), **NVIDIA NIM** (free tier, can be flaky), **Codex** (`gpt-5.3-codex-spark` via OpenAI routing), and **openai_compatible** for local OpenAI-compatible APIs such as copilot-api or Ollama (`local_ollama` remains as a legacy alias).

In `.env`:
```
LLM_PROFILE=deepinfra # or 'nim' — default is 'deepinfra'
LLM_PROFILE=deepinfra # or 'codex'/'nim'/'openai_compatible' — default is 'deepinfra'
DEEPINFRA_API_KEY=... # if profile=deepinfra — get from https://deepinfra.com/
NVIDIA_NIM_API_KEY=... # if profile=nim — get from https://build.nvidia.com/
OPENAI_API_KEY=... # if profile=codex — OpenAI API key for Codex
```

Only the key matching your chosen profile needs to be filled. Leave the other blank or absent.
For a local OpenAI-compatible server such as `copilot-api`:

Adding a third provider (Together, OpenAI, local vLLM, whatever) is a two-line entry in [`braindb/config.py::_LLM_PROFILES`](braindb/config.py) + an env var — no other code changes. See [`CONTRIBUTING.md`](CONTRIBUTING.md) for the recipe.
```
LLM_PROFILE=openai_compatible
AGENT_BASE_URL=http://<host-ip>:4141/v1 # copilot-api default port
AGENT_MODEL=openai/gpt-5-mini
AGENT_API_KEY= # optional; only set if your endpoint requires auth
```

Only the key matching your chosen hosted profile needs to be filled. Leave the others blank or absent. For OpenAI-compatible local endpoints with auth disabled, leave `AGENT_API_KEY` blank.

Adding another hosted provider (Together, OpenAI, local vLLM, whatever) is usually a small entry in [`braindb/config.py::_LLM_PROFILES`](braindb/config.py) + env passthrough — see [`CONTRIBUTING.md`](CONTRIBUTING.md) for the recipe.

### 5. Create the Docker network, then bring the stack up

@@ -110,6 +120,19 @@ API at `http://localhost:8000`. Swagger UI at `http://localhost:8000/docs`. Data

Drop a markdown file into `data/sources/` and the watcher sidecar picks it up within ~7 seconds — see [File Ingestion](#file-ingestion) below.
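
For instance (filename illustrative):

```bash
cp meeting-notes.md data/sources/   # the watcher ingests it within ~7 seconds
```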

### Operational helper

For a safer one-command workflow, use `scripts/braindb-manage.sh`:

```bash
./scripts/braindb-manage.sh start
./scripts/braindb-manage.sh update
./scripts/braindb-manage.sh status
./scripts/braindb-manage.sh logs api
```

It creates `.env` from `.env.example` if needed, ensures the `local-network` Docker network exists, starts/recreates the Compose services, and checks `http://localhost:8000/health`.

---

## Key Endpoints
@@ -162,7 +185,7 @@ Single `query` (string) still works for backward compatibility.
Instead of orchestrating individual API calls, you can talk to BrainDB in plain English via `POST /api/v1/agent/query`. The agent (built on the OpenAI Agents SDK + LiteLLM) decides which tools to call and returns a summary.

```bash
curl -X POST http://localhost:8000/api/v1/agent/query \
-H "Content-Type: application/json" \
-d '{"query":"What do you know about the user role and recent projects?"}'

@@ -173,15 +196,18 @@ The agent has 21 tools — every single BrainDB endpoint plus `delegate_to_subag

**LLM provider — pluggable via `.env`**:

`LLM_PROFILE` selects the backend. Profiles are defined in [braindb/config.py](braindb/config.py) (`_LLM_PROFILES`) — currently `deepinfra` (default, model `google/gemma-4-31B-it`) and `nim` (NVIDIA NIM, model `google/gemma-4-31b-it`). Each profile is a model-prefix + env-var pair; adding a new one is a dict entry.
`LLM_PROFILE` selects the backend. Profiles are defined in [braindb/config.py](braindb/config.py) (`_LLM_PROFILES`) — currently `deepinfra` (default, model `google/gemma-4-31B-it`), `nim` (NVIDIA NIM, model `google/gemma-4-31b-it`), `codex` (OpenAI Codex, model `gpt-5.3-codex-spark`), and `openai_compatible` (generic OpenAI-compatible `/v1` endpoints; `local_ollama` is a legacy alias).

```
LLM_PROFILE=deepinfra # or nim — default is deepinfra
LLM_PROFILE=deepinfra # or codex/nim/openai_compatible — default is deepinfra
DEEPINFRA_API_KEY=... # required if profile=deepinfra (https://deepinfra.com/)
NVIDIA_NIM_API_KEY=... # required if profile=nim (https://build.nvidia.com/)
OPENAI_API_KEY=... # required if profile=codex
AGENT_MODEL= # optional: override the profile's default model
```

For copilot-api, set `AGENT_BASE_URL=http://<host-ip>:4141/v1` and `AGENT_MODEL=openai/gpt-5-mini`. For Ollama, use `AGENT_BASE_URL=http://<ollama-host>:11434/v1` and an Ollama model such as `AGENT_MODEL=openai/llama3.2:3b`. `AGENT_API_KEY` is optional and only needed if your OpenAI-compatible endpoint enforces auth.

**Verbose logging**: set `AGENT_VERBOSE=true` in `.env` to log every tool call (entry args + exit elapsed/result) to stdout, visible via `docker logs braindb_api -f`.

---
Expand Down Expand Up @@ -276,5 +302,5 @@ It's idempotent by content hash — re-calling with the same bytes returns 200 (
- PostgreSQL 16 with `pg_trgm` and `pgvector`
- Alembic migrations
- `sentence-transformers` + `Qwen/Qwen3-Embedding-0.6B` for keyword embeddings
- `openai-agents[litellm]` + LiteLLM for the internal agent (DeepInfra / NIM / others pluggable via `LLM_PROFILE`)
- `openai-agents[litellm]` + LiteLLM for the internal agent (DeepInfra / NIM / Codex / others pluggable via `LLM_PROFILE`)
- Docker Compose — `api` + `watcher` services, external PostgreSQL
47 changes: 47 additions & 0 deletions braindb/agent/fast_path.py
@@ -0,0 +1,47 @@
"""Deterministic fast paths for simple BrainDB agent requests."""
import re
from typing import Any

from braindb.agent.tools import _save_fact_impl, _save_rule_impl

_SAVE_RE = re.compile(r"^\s*Save:\s+(?P<content>.+?)\s*$", re.IGNORECASE | re.DOTALL)
_SAVE_RULE_RE = re.compile(r"^\s*Save as rule:\s+(?P<content>.+?)\s*$", re.IGNORECASE | re.DOTALL)
_MAX_FAST_PATH_CHARS = 2000


def _content_is_safe_for_fast_path(content: str) -> bool:
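    # Guard: content containing "?" is likely a question for the full agent
    # loop, and overly long content is routed there as well.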
return bool(content) and "?" not in content and len(content) <= _MAX_FAST_PATH_CHARS


def try_fast_path(query: str) -> dict[str, Any] | None:
"""Handle simple save requests without invoking the LLM agent loop."""
rule_match = _SAVE_RULE_RE.match(query)
if rule_match:
content = rule_match.group("content").strip()
if not _content_is_safe_for_fast_path(content):
return None
answer = _save_rule_impl(
content=content,
keywords=[],
importance=0.8,
)
status = "fast_path_error" if answer.startswith("ERROR:") else "fast_path"
return {"answer": answer, "max_turns": 0, "status": status}

save_match = _SAVE_RE.match(query)
if save_match:
content = save_match.group("content").strip()
if not _content_is_safe_for_fast_path(content):
return None
answer = _save_fact_impl(
content=content,
keywords=[],
source="user-stated",
certainty=0.9,
importance=0.7,
notes="Saved via agent fast path.",
)
status = "fast_path_error" if answer.startswith("ERROR:") else "fast_path"
return {"answer": answer, "max_turns": 0, "status": status}

return None
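
To illustrate the contract, here is a hypothetical caller; `run_llm_agent_loop` and the endpoint wiring are assumptions, not part of this diff:

```python
from braindb.agent.fast_path import try_fast_path


def run_llm_agent_loop(query: str) -> dict:
    """Placeholder for the full LLM agent loop (assumed, not shown here)."""
    raise NotImplementedError


def handle_agent_query(query: str) -> dict:
    # Deterministic shortcut first; fall back to the LLM loop on None.
    result = try_fast_path(query)
    if result is not None:
        # e.g. {"answer": "...", "max_turns": 0, "status": "fast_path"}
        return result
    return run_llm_agent_loop(query)
```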