
Commit 2f32a6c

docs: simplify README
1 parent 5aaabdd commit 2f32a6c


4 files changed: +27 additions, −117 deletions


README.md

Lines changed: 15 additions & 117 deletions
@@ -5,7 +5,7 @@ Building blocks for **local** agents in C++.
 > [!NOTE]
 > This library is designed for running small language models locally using [llama.cpp](https://github.com/ggml-org/llama.cpp). If you want to call external LLM APIs, this is not the right fit.
 
-## Examples
+# Examples
 
 - **[Context Engineering](./examples/context-engineering/README.md)** - Use callbacks to manipulate the context between iterations of the agent loop.
 
@@ -26,7 +26,7 @@ wget https://huggingface.co/ibm-granite/granite-4.0-micro-GGUF/resolve/main/gran
 > [!IMPORTANT]
 > If you use a different model, you will probably have to adjust the values in `ModelConfig`.
 
-## Building Blocks
+# Building Blocks
 
 We define an `agent` with the following building blocks:
 
@@ -36,86 +36,16 @@ We define an `agent` with the following building blocks:
 - [Model](./#model)
 - [Tools](./#tools)
 
-Minimal example:
-
-```cpp
-#include "agent.h"
-#include "callbacks.h"
-#include "model.h"
-#include "tool.h"
-#include <iostream>
-
-class CalculatorTool : public agent_cpp::Tool {
-public:
-  common_chat_tool get_definition() const override {
-    agent_cpp::json schema = {
-      {"type", "object"},
-      {"properties", {
-        {"a", {{"type", "number"}, {"description", "First operand"}}},
-        {"b", {{"type", "number"}, {"description", "Second operand"}}}
-      }},
-      {"required", {"a", "b"}}
-    };
-    return {"multiply", "Multiply two numbers", schema.dump()};
-  }
-
-  std::string get_name() const override { return "multiply"; }
-
-  std::string execute(const agent_cpp::json& args) override {
-    double a = args.at("a").get<double>();
-    double b = args.at("b").get<double>();
-    return std::to_string(a * b);
-  }
-};
-
-class LoggingCallback : public agent_cpp::Callback {
-public:
-  void before_tool_execution(std::string& tool_name, std::string& args) override {
-    std::cerr << "[TOOL] Calling " << tool_name << " with " << args << "\n";
-  }
-};
-
-int main() {
-  auto model = agent_cpp::Model::create("model.gguf");
-
-  std::vector<std::unique_ptr<agent_cpp::Tool>> tools;
-  tools.push_back(std::make_unique<CalculatorTool>());
-
-  std::vector<std::unique_ptr<agent_cpp::Callback>> callbacks;
-  callbacks.push_back(std::make_unique<LoggingCallback>());
-
-  agent_cpp::Agent agent(
-    std::move(model),
-    std::move(tools),
-    std::move(callbacks),
-    "You are a helpful assistant."
-  );
-
-  std::vector<common_chat_msg> messages = {{"user", "What is 42 * 17?"}};
-  std::string response = agent.run_loop(messages);
-  std::cout << response << std::endl;
-}
-```
-
 ## Agent Loop
 
-In the current LLM (Large Language Models) world, and `agent` is usually a simple loop that intersperses `Model Calls` and `Tool Executions`, until a stop condition is met:
-
-```mermaid
-graph TD
-    User([User Input]) --> Model[Model Call]
-    Model --> Decision{Stop Condition Met?}
-    Decision -->|Yes| End([End])
-    Decision -->|No| Tool[Tool Execution]
-    Tool --> Model
-```
+In the current LLM (Large Language Model) world, an `agent` is usually a simple loop that intersperses `Model Calls` and `Tool Executions` until a stop condition is met.
 
 > [!IMPORTANT]
-> There are different ways to implement the stop condition.
-> By default we let the agent decide by generating an output *without* tool executions.
+> There are different ways to implement stop conditions.
+> By default, we let the agent decide when to end the loop, by generating an output *without* tool executions.
 > You can implement additional stop conditions via callbacks.
 
-### Callbacks
+## Callbacks
 
 Callbacks allow you to hook into the agent lifecycle at specific points:
 
@@ -125,24 +55,22 @@ Callbacks allow you to hook into the agent lifecycle at specific points:
 
 Use callbacks for logging, context manipulation, human-in-the-loop approval, or error recovery.
 
-### Instructions
+## Instructions
 
 A system prompt that defines the agent's behavior and capabilities. Passed to the `Agent` constructor and automatically prepended to conversations.
 
-### Model
+## Model
 
 Encapsulates **local** LLM initialization and inference using [llama.cpp](https://github.com/ggml-org/llama.cpp). This is tightly coupled to llama.cpp and requires models in GGUF format.
 
-> **Architectural note:** The `Model` class is not backend-agnostic. It is built specifically for local inference with llama.cpp. There is no abstraction layer for swapping in cloud-based providers like OpenAI or Anthropic.
-
 Handles:
 
 - Loading GGUF model files (quantized models recommended for efficiency)
 - Chat template application and tokenization
 - Text generation with configurable sampling (temperature, top_p, top_k, etc.)
 - KV cache management for efficient prompt caching
 
-### Tools
+## Tools
 
 Tools extend the agent's capabilities beyond text generation. Each tool defines:
 
@@ -152,9 +80,11 @@ Tools extend the agent's capabilities beyond text generation. Each tool defines:
 
 When the model decides to use a tool, the agent parses the tool call, executes it, and feeds the result back into the conversation.
 
-## Usage
+# Usage
 
-### Option 1: FetchContent (Recommended)
+**C++ Standard:** Requires **C++17** or higher.
+
+## Option 1: FetchContent (Recommended)
 
 The easiest way to integrate agent.cpp into your CMake project:
 
@@ -171,7 +101,7 @@ add_executable(my_app main.cpp)
 target_link_libraries(my_app PRIVATE agent-cpp::agent)
 ```
 
-### Option 2: Installed Package
+## Option 2: Installed Package
 
 Build and install agent.cpp, then use `find_package`:
 
@@ -198,7 +128,7 @@ add_executable(my_app main.cpp)
 target_link_libraries(my_app PRIVATE agent-cpp::agent)
 ```
 
-### Option 3: Git Submodule
+## Option 3: Git Submodule
 
 Add agent.cpp as a submodule and include it directly:
 
@@ -225,35 +155,3 @@ cmake -B build -DGGML_BLAS=ON -DGGML_BLAS_VENDOR=OpenBLAS
 ```
 
 For a complete list of build options and backend-specific instructions, see the [llama.cpp build documentation](https://github.com/ggml-org/llama.cpp/blob/master/docs/build.md).
-
-### Technical Details
-
-**C++ Standard:** Requires **C++17** or higher.
-
-**Thread Safety:** The `Agent` and `Model` classes are **not thread-safe**. A single `Agent` or `Model` instance should not be accessed concurrently from multiple threads.
-
-**Resource Sharing:** The library separates model weights from inference context to enable efficient VRAM usage:
-
-- **`ModelWeights`**: Holds the immutable model weights (the heavy VRAM consumer). Create once with `ModelWeights::create(path)` and share via `std::shared_ptr` across multiple `Model` instances.
-- **`Model`**: Holds a context (KV cache, sampler state) and a reference to shared weights. Each `Model` has its own conversation state.
-
-This architecture enables multiple concurrent agents sharing the same weights without loading them multiple times:
-
-```cpp
-// Load weights once (heavy VRAM usage)
-auto weights = agent_cpp::ModelWeights::create("model.gguf");
-
-// Create multiple models sharing the same weights (lightweight)
-auto validator_model = agent_cpp::Model::create_with_weights(weights);
-auto generator_model = agent_cpp::Model::create_with_weights(weights);
-
-// Each agent has independent conversation state
-Agent validator(validator_model, validator_tools);
-Agent generator(generator_model, generator_tools);
-```
-
-For simple single-agent use cases, `Model::create(path)` still works and handles everything internally.
-
-**Exceptions:** All exceptions derive from `agent_cpp::Error` (which extends `std::runtime_error`), allowing you to catch all library errors with a single `catch (const agent_cpp::Error&)` block.
-
-To handle tool errors gracefully without exceptions propagating, check the [`error_recovery_callback`](./examples/shared) which converts tool errors into JSON results that the model can see and retry.

examples/memory/README.md

Lines changed: 4 additions & 0 deletions
@@ -29,6 +29,10 @@ This example implements a simple memory system (a single JSON file) with 3 tools
 
 ## Building
 
+> [!IMPORTANT]
+> Check the [llama.cpp build documentation](https://github.com/ggml-org/llama.cpp/blob/master/docs/build.md) to find
+> CMake flags you might want to pass depending on your available hardware.
+
 ```bash
 cd examples/memory
 

examples/multi-agent/README.md

Lines changed: 4 additions & 0 deletions
@@ -38,6 +38,10 @@ class DelegateMathTool : public agent_cpp::Tool {
 
 ## Building
 
+> [!IMPORTANT]
+> Check the [llama.cpp build documentation](https://github.com/ggml-org/llama.cpp/blob/master/docs/build.md) to find
+> CMake flags you might want to pass depending on your available hardware.
+
 ```bash
 cd examples/multi-agent
 

examples/shell/README.md

Lines changed: 4 additions & 0 deletions
@@ -20,6 +20,10 @@ This example uses two callbacks:
 
 ## Building
 
+> [!IMPORTANT]
+> Check the [llama.cpp build documentation](https://github.com/ggml-org/llama.cpp/blob/master/docs/build.md) to find
+> CMake flags you might want to pass depending on your available hardware.
+
 ```bash
 cd examples/shell
 
