docs: update documentation, changelog, and readme for M24
Update feature-flags, configuration, channels, architecture, and
security docs to reflect ProviderKind enum, minimal default features,
Telegram auth guard, config validation, and path sanitization.
Add doc tests step to CI workflow.
Update CHANGELOG.md with Unreleased section for M24 changes.
Update README.md with new feature flags and architecture notes.
- Lightweight AI agent that routes tasks across **Ollama, Claude, OpenAI, and HuggingFace** models — with semantic skill matching, vector memory, MCP tooling, and agent-to-agent communication. Ships as a single binary for Linux, macOS, and Windows.
+ Lightweight AI agent that routes tasks across **Ollama, Claude, OpenAI, HuggingFace, and OpenAI-compatible endpoints** (Together AI, Groq, etc.) — with semantic skill matching, vector memory, MCP tooling, and agent-to-agent communication. Ships as a single binary for Linux, macOS, and Windows.
@@ -19,7 +19,7 @@ Lightweight AI agent that routes tasks across **Ollama, Claude, OpenAI, and Hugg
**Intelligent context management.** Two-tier context pruning: Tier 1 selectively removes old tool outputs (clearing bodies from memory after persisting to SQLite) before falling back to Tier 2 LLM-based compaction, reducing unnecessary LLM calls. A token-based protection zone preserves recent context from pruning. Parallel context preparation via `try_join!` and optimized byte-length token estimation. Cross-session memory transfers knowledge between conversations with relevance filtering. Proportional budget allocation (8% summaries, 8% semantic recall, 4% cross-session, 30% code context, 50% recent history) keeps conversations efficient. Tool outputs are truncated at 30K chars with optional LLM-based summarization for large outputs. Doom-loop detection breaks runaway tool cycles after 3 identical consecutive outputs, with configurable iteration limits (default 10). ZEPH.md project config discovery walks up the directory tree and injects project-specific context when available. Config hot-reload applies runtime-safe fields (timeouts, security, memory limits) on file change without restart.
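The proportional budget split above (8% / 8% / 4% / 30% / 50%) can be sketched as simple integer arithmetic over a total token budget. This is an illustrative sketch only; the struct and function names are hypothetical, not Zeph's actual API.

```rust
// Illustrative sketch of the proportional context-budget allocation
// described above. Names are hypothetical, not Zeph's real types.
struct ContextBudget {
    summaries: usize,       // 8% of the total budget
    semantic_recall: usize, // 8%
    cross_session: usize,   // 4%
    code_context: usize,    // 30%
    recent_history: usize,  // 50%
}

fn allocate(total_tokens: usize) -> ContextBudget {
    ContextBudget {
        summaries: total_tokens * 8 / 100,
        semantic_recall: total_tokens * 8 / 100,
        cross_session: total_tokens * 4 / 100,
        code_context: total_tokens * 30 / 100,
        recent_history: total_tokens * 50 / 100,
    }
}

fn main() {
    let b = allocate(10_000);
    // Half of a 10K-token budget goes to recent history.
    assert_eq!(b.recent_history, 5_000);
    assert_eq!(b.code_context, 3_000);
}
```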
- **Run anywhere.** Local models via Ollama or Candle (GGUF with Metal/CUDA), cloud APIs (Claude, OpenAI, GPT-compatible endpoints like Together AI and Groq), or all of them at once through the multi-model orchestrator with automatic fallback chains.
+ **Run anywhere.** Local models via Ollama or Candle (GGUF with Metal/CUDA), cloud APIs (Claude, OpenAI), OpenAI-compatible endpoints (Together AI, Groq, Fireworks) via `CompatibleProvider`, or all of them at once through the multi-model orchestrator with automatic fallback chains and `RouterProvider` for prompt-based model selection.
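An automatic fallback chain of the kind described above can be sketched as trying providers in order and returning the first success. The provider names and function signatures here are illustrative assumptions, not Zeph's orchestrator API.

```rust
// Sketch of a provider fallback chain: try each provider in order,
// return the first successful completion. Illustrative only.
fn complete_with_fallback(
    providers: &[(&str, fn(&str) -> Result<String, String>)],
    prompt: &str,
) -> Result<String, String> {
    let mut last_err = String::from("no providers configured");
    for (name, provider) in providers {
        match provider(prompt) {
            Ok(reply) => return Ok(reply),
            // Remember the failure and fall through to the next provider.
            Err(e) => last_err = format!("{name}: {e}"),
        }
    }
    Err(last_err)
}

// Stand-ins for a provider that is down and one that responds.
fn down(_: &str) -> Result<String, String> {
    Err("connection refused".into())
}
fn up(prompt: &str) -> Result<String, String> {
    Ok(format!("echo: {prompt}"))
}

fn main() {
    let chain: &[(&str, fn(&str) -> Result<String, String>)] =
        &[("ollama", down), ("claude", up)];
    assert_eq!(complete_with_fallback(chain, "hi").unwrap(), "echo: hi");
}
```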
**Production-ready security.** Shell sandboxing with path restrictions and relative path traversal detection, pattern-based permission policy per tool, destructive command confirmation, file operation sandbox with path traversal protection, tool output overflow-to-file (with LLM-accessible paths), secret redaction (AWS, OpenAI, Anthropic, Google, GitLab), audit logging, SSRF protection (including MCP client), rate limiter with TTL-based eviction, and Trivy-scanned container images with 0 HIGH/CRITICAL CVEs.
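One of the checks named above, relative path traversal detection, can be sketched with the standard library: reject any path containing a `..` component. This is a minimal illustration, not Zeph's sandbox implementation.

```rust
use std::path::{Component, Path};

// Sketch of relative path traversal detection: a path is rejected
// if any of its components is `..`. Illustrative only.
fn has_traversal(path: &str) -> bool {
    Path::new(path)
        .components()
        .any(|c| matches!(c, Component::ParentDir))
}

fn main() {
    assert!(has_traversal("logs/../../etc/passwd"));
    assert!(!has_traversal("logs/output.txt"));
}
```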
|**Native Tool Use**| Structured tool calling via Claude tool_use and OpenAI function calling APIs; automatic fallback to text extraction for local models |[Tools](https://bug-ops.github.io/zeph/guide/tools.html)|
- |**Hybrid Inference**| Ollama, Claude, OpenAI, Candle (GGUF) — local, cloud, or both |[OpenAI](https://bug-ops.github.io/zeph/guide/openai.html) · [Candle](https://bug-ops.github.io/zeph/guide/candle.html)|
+ |**Hybrid Inference**| Ollama, Claude, OpenAI, Candle (GGUF), Compatible (any OpenAI-compatible API) — local, cloud, or both |[OpenAI](https://bug-ops.github.io/zeph/guide/openai.html) · [Candle](https://bug-ops.github.io/zeph/guide/candle.html)|
@@ -137,7 +141,7 @@ zeph (binary) — bootstrap, AnyChannel dispatch, vault resolution (anyhow for t
└── zeph-tui — ratatui TUI dashboard with live agent metrics (optional)
```
- **Error handling:** Typed errors throughout all library crates -- `AgentError` (7 variants), `ChannelError` (4 variants), `LlmError`, `MemoryError`, `SkillError`. `anyhow` is used only in `main.rs` for top-level orchestration. Shared Qdrant operations consolidated via `QdrantOps` helper. `AnyProvider` dispatch deduplicated via `delegate_provider!` macro.
+ **Error handling:** Typed errors throughout all library crates -- `AgentError` (7 variants), `ChannelError` (4 variants), `LlmError`, `MemoryError`, `SkillError`. `anyhow` is used only in `main.rs` for top-level orchestration. Shared Qdrant operations consolidated via `QdrantOps` helper. `AnyProvider` dispatch deduplicated via `delegate_provider!` macro. `AnyChannel` enum dispatch lives in `zeph-channels` for reuse across binaries.
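Enum dispatch deduplicated by a macro, in the spirit of `delegate_provider!`, can be sketched as follows. The types, variants, and method are illustrative assumptions; only the general `macro_rules!` pattern is shown.

```rust
// Minimal sketch of macro-deduplicated enum dispatch. The match arms for
// every variant are generated once in the macro instead of being repeated
// in every delegated method. Names are hypothetical, not Zeph's real types.
struct Ollama;
struct Claude;

impl Ollama {
    fn name(&self) -> &'static str { "ollama" }
}
impl Claude {
    fn name(&self) -> &'static str { "claude" }
}

enum AnyProvider {
    Ollama(Ollama),
    Claude(Claude),
}

// One match over all variants, reused for each delegated method.
macro_rules! delegate {
    ($self:expr, $method:ident) => {
        match $self {
            AnyProvider::Ollama(p) => p.$method(),
            AnyProvider::Claude(p) => p.$method(),
        }
    };
}

impl AnyProvider {
    fn name(&self) -> &'static str {
        delegate!(self, name)
    }
}

fn main() {
    assert_eq!(AnyProvider::Claude(Claude).name(), "claude");
}
```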
**Agent decomposition:** The agent module in `zeph-core` is split into 7 submodules (`mod.rs`, `context.rs`, `streaming.rs`, `persistence.rs`, `learning.rs`, `mcp.rs`, `index.rs`) with 5 inner field-grouping structs (`MemoryState`, `SkillState`, `ContextState`, `McpState`, `IndexState`).
@@ -152,29 +156,32 @@ Deep dive: [Architecture overview](https://bug-ops.github.io/zeph/architecture/o
| Feature | Default | Description |
|---------|---------|-------------|
- |`a2a`| On | A2A protocol client and server |
- |`openai`| On | OpenAI-compatible provider |
- |`mcp`| On | MCP client for external tool servers |
- |`candle`| On | Local HuggingFace inference (GGUF) |
- |`orchestrator`| On | Multi-model routing with fallback |
- |`qdrant`| On | Qdrant vector search for skills and MCP tools (opt-out) |
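With minimal default features, a downstream project would opt in to flags from the table above explicitly. The manifest below is purely illustrative — the crate name, version, and availability of these features as Cargo features are assumptions, not a published manifest.

```toml
# Hypothetical Cargo.toml fragment: disable default features and
# enable only what is needed. Crate name and version are illustrative.
[dependencies]
zeph-core = { version = "0.1", default-features = false, features = ["mcp", "openai"] }
```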
Zeph watches the config file for changes and applies runtime-safe fields without restart. The file watcher uses 500ms debounce to avoid redundant reloads.
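The 500ms debounce described above can be sketched as skipping any event that arrives within the debounce window of the previous one. This is a minimal standard-library sketch, not Zeph's watcher implementation.

```rust
use std::time::{Duration, Instant};

// Sketch of a debounce: an event triggers a reload only if at least
// `window` has elapsed since the previous event. Illustrative only.
struct Debouncer {
    window: Duration,
    last: Option<Instant>,
}

impl Debouncer {
    fn new(window: Duration) -> Self {
        Self { window, last: None }
    }

    /// Returns true if this event should trigger a reload.
    fn should_fire(&mut self, now: Instant) -> bool {
        let fire = match self.last {
            Some(prev) => now.duration_since(prev) >= self.window,
            None => true, // first event always fires
        };
        self.last = Some(now);
        fire
    }
}

fn main() {
    let mut d = Debouncer::new(Duration::from_millis(500));
    let t0 = Instant::now();
    assert!(d.should_fire(t0)); // first event: reload
    assert!(!d.should_fire(t0 + Duration::from_millis(100))); // within window: skip
    assert!(d.should_fire(t0 + Duration::from_millis(700))); // window elapsed: reload
}
```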
@@ -45,7 +57,7 @@ name = "Zeph"
max_tool_iterations = 10  # Max tool loop iterations per response (default: 10)
docs/src/guide/channels.md (1 addition, 1 deletion)
@@ -64,7 +64,7 @@ Restrict bot access to specific Telegram usernames:
allowed_users = ["alice", "bob"]
```
- When `allowed_users` is empty, the bot accepts messages from all users. Messages from unauthorized users are silently rejected with a warning log.
+ The `allowed_users` list **must not be empty**. The Telegram channel refuses to start without at least one allowed username, to prevent accidentally exposing the bot to all users. Messages from unauthorized users are silently rejected with a warning log.
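The guard behaviour described above amounts to two checks: refuse to start on an empty allow-list, and reject messages from usernames not on it. The function names below are illustrative, not the Telegram channel's actual API.

```rust
// Sketch of the Telegram auth guard described above. Illustrative only.

// Startup validation: an empty allow-list is a configuration error.
fn validate_allowed_users(allowed: &[&str]) -> Result<(), String> {
    if allowed.is_empty() {
        Err("telegram.allowed_users must not be empty".to_string())
    } else {
        Ok(())
    }
}

// Per-message check: only listed usernames are authorized.
fn is_authorized(allowed: &[&str], username: &str) -> bool {
    allowed.contains(&username)
}

fn main() {
    assert!(validate_allowed_users(&[]).is_err()); // refuses to start
    let allowed = ["alice", "bob"];
    assert!(validate_allowed_users(&allowed).is_ok());
    assert!(is_authorized(&allowed, "alice"));
    assert!(!is_authorized(&allowed, "mallory")); // silently rejected
}
```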
- - Full errors logged to system, sanitized messages shown to users
+ - Full errors logged to system; user-facing messages pass through `sanitize_paths()`, which replaces absolute filesystem paths (`/home/`, `/Users/`, `/root/`, `/tmp/`, `/var/`) with `[PATH]` to prevent information disclosure
- Audit trail for all tool executions (when enabled)
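The sanitization step in the list above can be sketched as replacing any whitespace-delimited token that starts with one of the listed prefixes. This is an illustrative sketch in the spirit of `sanitize_paths()`, not its actual implementation.

```rust
// Sketch of path sanitization: tokens that look like absolute paths under
// the listed prefixes are replaced with a `[PATH]` placeholder.
// Prefix list and logic are illustrative only.
fn sanitize_paths(msg: &str) -> String {
    const PREFIXES: [&str; 5] = ["/home/", "/Users/", "/root/", "/tmp/", "/var/"];
    msg.split_whitespace()
        .map(|word| {
            if PREFIXES.iter().any(|p| word.starts_with(p)) {
                "[PATH]"
            } else {
                word
            }
        })
        .collect::<Vec<_>>()
        .join(" ")
}

fn main() {
    let out = sanitize_paths("failed to read /home/alice/.ssh/id_rsa at startup");
    assert_eq!(out, "failed to read [PATH] at startup");
}
```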