Skip to content
This repository was archived by the owner on Mar 16, 2026. It is now read-only.

Commit 4c039cd

Browse files
chore(docs): add version 1.10.0
1 parent 6ca734f commit 4c039cd

File tree

17 files changed

+1724
-0
lines changed

17 files changed

+1724
-0
lines changed
Lines changed: 111 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,111 @@
1+
# Architecture
2+
3+
## System Architecture Diagram
4+
5+
```
6+
Planner
7+
8+
Scheduler
9+
10+
Executor
11+
12+
Agents
13+
14+
Tools
15+
16+
Memory
17+
18+
Knowledge Graph
19+
```
20+
21+
## Component Overview
22+
23+
### Planner
24+
25+
- **Role:** Converts one root task into multiple subtasks with sequential (or DAG) dependencies.
26+
- **Strategy-based planning (v1):** A **strategy selector** (keyword heuristics, optional embedding/LLM) chooses a strategy (research, code analysis, data science, document pipeline, experiment). Each strategy returns a fixed DAG of tasks (e.g. research: corpus_builder → topic_extraction → citation_graph → literature_review). If a strategy is selected and returns tasks, the planner uses that DAG; otherwise it falls back to an LLM call to break the task into steps.
27+
- **Input:** A single `Task` (e.g. the user’s high-level goal).
28+
- **Output:** A list of `Task` objects, each with an ID, description, and dependency list.
29+
- **Events:** `planner_started`, `task_created` (per subtask), `planner_finished`.
30+
- **Optional:** Adaptive planning can extend the DAG at runtime (e.g. `expand_tasks` after a task completes).
31+
32+
### Scheduler
33+
34+
- **Role:** Maintains the task DAG and determines which tasks are runnable.
35+
- **Data structure:** Directed acyclic graph (e.g. NetworkX); nodes are task IDs, edges are dependencies.
36+
- **API:** `add_tasks`, `get_ready_tasks`, `mark_completed`, `is_finished`, `get_results`, `get_completed_tasks`.
37+
- **Invariant:** Only tasks whose dependencies are all completed are returned as “ready.”
38+
39+
### Executor
40+
41+
- **Role:** Drives execution until the scheduler reports all tasks completed.
42+
- **Behavior:** In a loop, gets ready tasks, runs each via an **Agent** (with a concurrency limit, e.g. semaphore), then marks tasks completed. Optionally calls the planner to add new tasks (adaptive).
43+
- **(v1.7) Critic:** After a task completes, if critic is enabled and the task role is in `critic_roles`, a **CriticAgent** scores the result; if it requests a retry, the task is re-queued once with a retry prompt.
44+
- **(v1.7) Prefetcher:** When speculative execution and prefetch are enabled, the executor triggers background prefetch (memory + tool selection) for speculative tasks; the agent may receive a pre-warmed `prefetch_result`.
45+
- **Events:** `executor_started`, then per-task agent/task events, then `executor_finished`; (v1.7) `TASK_CRITIQUED`, `PREFETCH_HIT`, `PREFETCH_MISS`.
46+
47+
### Agents
48+
49+
- **Role:** Stateless workers that execute a single task by calling an LLM (via the provider router).
50+
- **Input:** A `Task` (description, optional dependencies context). (v1.7) Optional **message bus** (per-run pub/sub) and **prefetch_result** (pre-warmed memory + tools).
51+
- **Output:** Text result stored on the task; optionally tools are invoked in a loop until the agent returns a final answer. (v1.7) Agents can start a response with `BROADCAST: <finding>` to publish to the message bus; the finding is stripped before storing the result.
52+
- **Events:** `agent_started`, `task_started`, `task_completed`, `agent_finished`, and `tool_called` when tools are used; (v1.7) `AGENT_BROADCAST` when an agent broadcasts a finding.
53+
54+
### Tools
55+
56+
- **Role:** Named, schema-driven functions agents can call (e.g. read_file, codebase_indexer, store_memory).
57+
- **Registry:** Tools register by name; the agent receives a list of tools and calls them via a simple protocol (e.g. `TOOL: name`, `INPUT: {...}`).
58+
- **Smart tool selection (v1):** When config has `[tools] top_k > 0`, the **tool selector** embeds the task description and each tool’s name+description, computes similarity, and passes only the top-k tools to the agent. Optional `[tools] enabled` restricts to categories (e.g. research, coding, documents).
59+
- **Plugins (v1):** External packages can register tools via the `hivemind.plugins` entry point; the loader runs after built-in categories so plugin tools appear in the same registry.
60+
- **Runner:** Validates arguments against each tool’s `input_schema`, runs the tool, and returns a string result (or error message).
61+
62+
### Memory
63+
64+
- **Role:** Persistent store and retrieval of context (episodic, semantic, research, artifact).
65+
- **Store:** SQLite-backed; list, store, retrieve, delete.
66+
- **Router:** Injects relevant memories into the agent prompt (e.g. top-k by embedding similarity).
67+
- **Index:** Optional embeddings for semantic search over stored memories.
68+
69+
### Knowledge Graph
70+
71+
- **Role:** Builds and queries a graph over memory (documents, concepts, datasets, methods; edges like mentions, cites, related_to).
72+
- **Query interface (v1):** `hivemind/knowledge/query.py` provides entity search (match query text to node labels) and relationship traversal (1–2 hops). CLI: `hivemind query "diffusion models"`.
73+
- **Integration:** Can be populated from tool outputs or document pipelines and used to enrich context for later tasks.
74+
75+
### Configuration (v1)
76+
77+
- **Module:** `hivemind/config/``config_loader.py`, `schema.py`, `defaults.py`, `resolver.py` with Pydantic models.
78+
- **Locations:** `./hivemind.toml`, `./workflow.hivemind.toml`, `~/.config/hivemind/config.toml`, legacy `.hivemind/config.toml`.
79+
- **Priority:** env > project config > user config > defaults. Exposed via `get_config()`; `Swarm(config="hivemind.toml")` loads from file.
80+
81+
### Map-reduce runtime (v1)
82+
83+
- **Module:** `hivemind/swarm/map_reduce.py``swarm.map_reduce(dataset, map_fn, reduce_fn)` partitions the dataset, runs `map_fn` on each item in parallel (worker pool), then runs `reduce_fn` on the collected results. Uses the same asyncio/semaphore pattern as the executor.
84+
85+
---
86+
87+
## Event System
88+
89+
All components emit **events** to a shared **EventLog**:
90+
91+
- **Event types:** e.g. `swarm_started`, `swarm_finished`, `planner_started`, `planner_finished`, `task_created`, `task_started`, `task_completed`, `task_failed`, `agent_started`, `agent_finished`, `executor_started`, `executor_finished`, `tool_called`.
92+
- **Format:** Each event has a timestamp, type, and payload (e.g. `task_id`, `description`).
93+
- **Persistence:** Events are appended to a JSONL file (e.g. `.hivemind/events/events_<timestamp>.jsonl`).
94+
95+
The event log is used for **replay** and **telemetry** without changing runtime logic.
96+
97+
---
98+
99+
## Telemetry
100+
101+
- **Source:** Event log from a run (or a given log file).
102+
- **Metrics:** Tasks completed/failed, average task duration, average agent latency, max concurrency, task success rate.
103+
- **Use:** Post-run analysis and monitoring; can be emitted or logged when a swarm run finishes.
104+
105+
---
106+
107+
## Replay System
108+
109+
- **Input:** Path to an event log (JSONL).
110+
- **Behavior:** Load events, sort by timestamp, and produce a step-by-step transcript (e.g. `[planner_started]`, `[task_created] task_id`, `[agent_started] task_id`, …).
111+
- **Use:** Debugging and understanding execution order without re-running the swarm.

0 commit comments

Comments
 (0)