This repository was archived by the owner on Nov 10, 2025. It is now read-only.

Commit 14bdbbd

docs: add BUILDING_TOOLS.md (#443)
1 parent 3418351 commit 14bdbbd

1 file changed

+335
-0
lines changed

BUILDING_TOOLS.md

Lines changed: 335 additions & 0 deletions
@@ -0,0 +1,335 @@
1+
## Building CrewAI Tools
2+
3+
This guide shows you how to build high‑quality CrewAI tools that match the patterns in this repository and are ready to be merged. It covers architecture, conventions, environment variables, dependencies, testing, and documentation, and ends with a complete example.
4+
5+
### Who this is for
6+
- Contributors creating new tools under `crewai_tools/tools/*`
7+
- Maintainers reviewing PRs for consistency and DX
8+
9+
---
10+
11+
## Quick‑start checklist
12+
1. Create a new folder under `crewai_tools/tools/<your_tool_name>/` with a `README.md` and a `<your_tool_name>.py`.
13+
2. Implement a class that ends with `Tool` and subclasses `BaseTool` (or `RagTool` when appropriate).
14+
3. Define a Pydantic `args_schema` with explicit field descriptions and validation.
15+
4. Declare `env_vars` and `package_dependencies` in the class when needed.
16+
5. Lazily initialize clients in `__init__` or `_run` and handle missing credentials with clear errors.
17+
6. Implement `_run(...) -> str | dict` and, if needed, `_arun(...)`.
18+
7. Add tests under `tests/tools/` (unit, no real network calls; mock or record safely).
19+
8. Add a concise tool `README.md` with usage and required env vars.
20+
9. If you add optional dependencies, register them in `pyproject.toml` under `[project.optional-dependencies]` and reference that extra in your tool docs.
21+
10. Run `uv run pytest` and `pre-commit run -a` locally and make sure both pass.
22+
23+
---
24+
25+
## Tool anatomy and conventions
26+
27+
### BaseTool pattern
28+
All tools follow this structure:
29+
30+
```python
import os
from typing import Any, List, Type

from pydantic import BaseModel, Field
from crewai.tools import BaseTool, EnvVar


class MyToolInput(BaseModel):
    """Input schema for MyTool."""

    query: str = Field(..., description="Your input description here")
    limit: int = Field(5, ge=1, le=50, description="Max items to return")


class MyTool(BaseTool):
    name: str = "My Tool"
    description: str = "Explain succinctly what this tool does and when to use it."
    args_schema: Type[BaseModel] = MyToolInput

    # Only include when applicable
    env_vars: List[EnvVar] = [
        EnvVar(name="MY_API_KEY", description="API key for My service", required=True),
    ]
    package_dependencies: List[str] = ["my-sdk"]

    def __init__(self, **kwargs: Any) -> None:
        super().__init__(**kwargs)
        # Lazy import to keep base install light
        try:
            import my_sdk  # noqa: F401
        except ImportError as exc:
            raise ImportError(
                "Missing optional dependency 'my-sdk'. Install with:\n"
                "  uv add crewai-tools --extra my-sdk\n"
                "or\n"
                "  pip install my-sdk\n"
            ) from exc

        # Fail early, with an actionable message, when credentials are missing
        if "MY_API_KEY" not in os.environ:
            raise ValueError("Environment variable MY_API_KEY is required for MyTool")

    def _run(self, query: str, limit: int = 5, **_: Any) -> str:
        """Synchronous execution. Return a concise string or JSON string."""
        # Implement your logic here; do not print. Return the content.
        # Handle errors gracefully, return clear messages.
        return f"Processed {query} with limit={limit}"

    async def _arun(self, *args: Any, **kwargs: Any) -> str:
        """Optional async counterpart if your client supports it."""
        # Prefer delegating to _run when the client is thread-safe
        return self._run(*args, **kwargs)
```
82+
83+
Key points:
84+
- Class name must end with `Tool` to be auto‑discovered by our tooling.
85+
- Use `args_schema` for inputs; always include `description` and validation.
86+
- Validate env vars early and fail with actionable errors.
87+
- Keep outputs deterministic and compact; favor `str` (possibly JSON‑encoded) or small dicts converted to strings.
88+
- Avoid printing; return the final string.
89+
90+
### Error handling
91+
- Wrap network and I/O with try/except and return a helpful message. See `BraveSearchTool` and others for patterns.
92+
- Validate required inputs and environment configuration with clear messages.
93+
- Keep exceptions user‑friendly; do not leak stack traces.
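
The bullets above could be applied with a small wrapper like the one below. This is a minimal, stdlib-only sketch, not the pattern `BraveSearchTool` actually uses; `run_safely` and its `service` parameter are hypothetical names introduced here for illustration.

```python
from typing import Callable


def run_safely(action: Callable[[], str], service: str = "upstream service") -> str:
    """Run `action` and turn failures into short, user-facing messages.

    Hypothetical helper: a tool's `_run` could wrap its network or I/O
    call with it instead of letting a traceback reach the agent.
    """
    try:
        return action()
    except TimeoutError:
        # Checked first because TimeoutError is a subclass of OSError.
        return f"The {service} timed out. Please try again later."
    except OSError as exc:
        # Covers ConnectionError and other I/O failures; only the type is shown.
        return f"Could not reach the {service} ({type(exc).__name__})."
    except ValueError as exc:
        return f"Invalid input: {exc}"
    except Exception as exc:  # last-resort guard, still no stack trace
        return f"Unexpected error from the {service}: {type(exc).__name__}"


if __name__ == "__main__":
    def times_out() -> str:
        raise TimeoutError()

    print(run_safely(times_out, service="weather API"))
```

Returning the message rather than raising keeps the agent transcript readable; whether a given failure should instead propagate is a per-tool judgment call.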
94+
95+
### Rate limiting and retries
96+
- If the upstream API enforces request pacing, implement minimal rate limiting (see `BraveSearchTool`).
97+
- Consider idempotency and backoff for transient errors where appropriate.
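
One way to combine request pacing with backoff is sketched below. `MinIntervalLimiter` and `retry_with_backoff` are hypothetical helpers written for this guide, not code taken from `BraveSearchTool`; the injectable `clock` and `sleep` arguments are an assumption made so the logic can be tested without actually waiting.

```python
import time
from typing import Callable, Optional, Tuple, Type, TypeVar

T = TypeVar("T")


class MinIntervalLimiter:
    """Enforce a minimum delay between consecutive calls."""

    def __init__(
        self,
        min_interval: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last_call: Optional[float] = None

    def wait(self) -> None:
        """Block until at least `min_interval` seconds passed since the last call."""
        if self._last_call is not None:
            remaining = self.min_interval - (self._clock() - self._last_call)
            if remaining > 0:
                self._sleep(remaining)
        self._last_call = self._clock()


def retry_with_backoff(
    func: Callable[[], T],
    retries: int = 3,
    base_delay: float = 0.5,
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `func`, retrying transient errors with exponential backoff."""
    for attempt in range(retries + 1):
        try:
            return func()
        except retry_on:
            if attempt == retries:
                raise  # out of attempts: surface the last error
            sleep(base_delay * (2 ** attempt))
    raise RuntimeError("retries must be >= 0")
```

Only retry operations that are safe to repeat (reads, or writes that carry an idempotency key); retrying a non-idempotent call can duplicate side effects.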
98+
99+
### Async support
100+
- Implement `_arun` only if your library has a true async client or your sync calls are thread‑safe.
101+
- Otherwise, delegate `_arun` to `_run` as in multiple existing tools.
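
For the thread‑safe case, one alternative to calling `_run` directly is to hand it to a worker thread so the event loop is not blocked. The sketch below assumes Python 3.9+ (for `asyncio.to_thread`) and uses a plain stand‑in class, `SlowLookup`, rather than a real `BaseTool` subclass; it is an illustration, not how existing tools in the repository are written.

```python
import asyncio
import time
from typing import List


class SlowLookup:
    """Stand-in for a tool; a real one would subclass BaseTool."""

    def _run(self, query: str) -> str:
        time.sleep(0.05)  # simulates a blocking SDK or HTTP call
        return f"result for {query}"

    async def _arun(self, query: str) -> str:
        # Run the blocking call in a worker thread so other coroutines
        # keep making progress. Only safe if `_run` is thread-safe.
        return await asyncio.to_thread(self._run, query)


async def main() -> List[str]:
    tool = SlowLookup()
    # Both lookups overlap instead of running back to back.
    return await asyncio.gather(tool._arun("a"), tool._arun("b"))


if __name__ == "__main__":
    print(asyncio.run(main()))
```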
102+
103+
### Returning values
104+
- Return a string (or JSON string) that’s ready to display in an agent transcript.
105+
- If returning structured data, keep it small and human‑readable. Use stable keys and ordering.
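
A small formatter along these lines would satisfy both bullets. It is a sketch under assumptions: `format_results` and the `title`/`url` keys are hypothetical, chosen only to show trimming, stable key order, and JSON encoding.

```python
import json
from typing import Any, Dict, List


def format_results(items: List[Dict[str, Any]], limit: int = 5) -> str:
    """Return a compact JSON string with a fixed, sorted set of keys."""
    trimmed = [
        # Keep only the fields an agent needs; drop scores, raw HTML, etc.
        {"title": item.get("title", ""), "url": item.get("url", "")}
        for item in items[:limit]
    ]
    # sort_keys gives stable ordering, so identical inputs produce identical output.
    return json.dumps(trimmed, sort_keys=True, ensure_ascii=False)


if __name__ == "__main__":
    raw = [{"url": "https://example.com", "title": "Example", "score": 0.9}]
    print(format_results(raw))
```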
106+
107+
---
108+
109+
## RAG tools and adapters
110+
111+
If your tool is a knowledge source, consider extending `RagTool` and/or creating an adapter.
112+
113+
- `RagTool` exposes `add(...)` and a `query(question: str) -> str` contract through an `Adapter`.
114+
- See `crewai_tools/tools/rag/rag_tool.py` and adapters like `embedchain_adapter.py` and `lancedb_adapter.py`.
115+
116+
Minimal adapter example:
117+
118+
```python
from typing import Any

from pydantic import Field
from crewai_tools.tools.rag.rag_tool import Adapter, RagTool


class MemoryAdapter(Adapter):
    # default_factory gives each adapter instance its own list
    store: list[str] = Field(default_factory=list)

    def add(self, text: str, **_: Any) -> None:
        self.store.append(text)

    def query(self, question: str) -> str:
        # naive demo: return all text containing any word from the question
        tokens = set(question.lower().split())
        hits = [t for t in self.store if tokens & set(t.lower().split())]
        return "\n".join(hits) if hits else "No relevant content found."


class MemoryRagTool(RagTool):
    name: str = "In‑memory RAG"
    description: str = "Toy RAG that stores text in memory and returns matches."
    adapter: Adapter = MemoryAdapter()
```
142+
143+
When using external vector DBs (MongoDB, Qdrant, Weaviate), study the existing tools to follow indexing, embedding, and query configuration patterns closely.
144+
145+
---
146+
147+
## Toolkits (multiple related tools)
148+
149+
Some integrations expose a toolkit (a group of tools) rather than a single class. See Bedrock `browser_toolkit.py` and `code_interpreter_toolkit.py`.
150+
151+
Guidelines:
152+
- Provide small, focused `BaseTool` classes for each operation (e.g., `navigate`, `click`, `extract_text`).
153+
- Offer a helper `create_<name>_toolkit(...) -> Tuple[ToolkitClass, List[BaseTool]]` to create tools and manage resources.
154+
- If you open external resources (browsers, interpreters), support cleanup methods and optionally context manager usage.
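
The guidelines above might translate into a shape like the following. Everything here is hypothetical and stdlib‑only: `FakeSession` stands in for a browser or interpreter, the tool classes are plain classes rather than `BaseTool` subclasses, and none of it mirrors the Bedrock toolkits' actual code.

```python
from typing import List, Tuple


class FakeSession:
    """Hypothetical shared resource, standing in for a browser or interpreter."""

    def __init__(self) -> None:
        self.is_open = True
        self.current_url = ""

    def close(self) -> None:
        self.is_open = False


class NavigateTool:
    """One focused operation; a real tool would subclass BaseTool."""

    name = "navigate"

    def __init__(self, session: FakeSession) -> None:
        self._session = session

    def _run(self, url: str) -> str:
        self._session.current_url = url
        return f"Navigated to {url}"


class CurrentPageTool:
    name = "current_page"

    def __init__(self, session: FakeSession) -> None:
        self._session = session

    def _run(self) -> str:
        return f"Current page: {self._session.current_url or 'none'}"


class DemoToolkit:
    """Owns the shared session and the tools that use it."""

    def __init__(self) -> None:
        self.session = FakeSession()
        self.tools: List[object] = [
            NavigateTool(self.session),
            CurrentPageTool(self.session),
        ]

    def cleanup(self) -> None:
        self.session.close()

    def __enter__(self) -> "DemoToolkit":
        return self

    def __exit__(self, *exc_info: object) -> None:
        self.cleanup()  # runs even if a tool raised inside the `with` block


def create_demo_toolkit() -> Tuple[DemoToolkit, List[object]]:
    """Helper in the `create_<name>_toolkit` style described above."""
    toolkit = DemoToolkit()
    return toolkit, toolkit.tools
```

Returning the toolkit alongside the tool list lets callers pass the tools to an agent while keeping a handle for `cleanup()` or a `with` block.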
155+
156+
---
157+
158+
## Environment variables and dependencies
159+
160+
### env_vars
161+
- Declare as `env_vars: List[EnvVar]` with `name`, `description`, `required`, and optional `default`.
162+
- Validate presence in `__init__` or on first `_run` call.
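
Validation on the first `_run` call, rather than in `__init__`, could be sketched as follows. `require_env` and `LazyKeyHolder` are hypothetical names, and the `environ` parameter is an assumption added so the check can be exercised without touching the real process environment.

```python
import os
from typing import Mapping, Optional


def require_env(
    name: str,
    default: Optional[str] = None,
    environ: Optional[Mapping[str, str]] = None,
) -> str:
    """Return the variable's value (or `default`), or fail with guidance."""
    env = os.environ if environ is None else environ
    value = env.get(name) or default
    if not value:
        raise ValueError(
            f"Environment variable {name} is required. "
            f"Set it before running, e.g. `export {name}=...`."
        )
    return value


class LazyKeyHolder:
    """Stand-in for a tool that validates credentials on the first `_run`."""

    def __init__(self, environ: Optional[Mapping[str, str]] = None) -> None:
        self._environ = environ
        self._api_key: Optional[str] = None

    def _run(self, query: str) -> str:
        if self._api_key is None:  # checked once, on first use
            self._api_key = require_env("MY_API_KEY", environ=self._environ)
        return f"Processed {query}"
```

Deferring the check lets users construct the tool (for example, to inspect its schema) without credentials, at the cost of surfacing the error later.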
163+
164+
### Dependencies
165+
- List runtime packages in `package_dependencies` on the class.
166+
- If they are genuinely optional, add an extra under `[project.optional-dependencies]` in `pyproject.toml` (e.g., `tavily-python`, `serpapi`, `scrapfly-sdk`).
167+
- Use lazy imports to avoid hard deps for users who don’t need the tool.
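
If several tools need the same lazy‑import logic, it could be factored into a helper like this. `import_optional` is a hypothetical function, not something the repository provides; the install hint reuses the wording from the BaseTool pattern.

```python
import importlib
from types import ModuleType


def import_optional(module_name: str, package: str, extra: str) -> ModuleType:
    """Import `module_name` on demand, or raise with install instructions.

    `package` is the distribution name on PyPI and `extra` the name of the
    optional-dependency group; both are supplied by the caller.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"Missing optional dependency '{package}'. Install with:\n"
            f"  uv add crewai-tools --extra {extra}\n"
            "or\n"
            f"  pip install {package}"
        ) from exc


if __name__ == "__main__":
    # `json` is always available, so this succeeds and shows the happy path.
    json_module = import_optional("json", package="json", extra="json")
    print(json_module.dumps({"ok": True}))
```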
168+
169+
---
170+
171+
## Testing
172+
173+
Place tests under `tests/tools/` and follow these rules:
174+
- Do not hit real external services in CI. Use mocks, fakes, or recorded fixtures where allowed.
175+
- Validate input validation, env var handling, error messages, and happy path output formatting.
176+
- Keep tests fast and deterministic.
177+
178+
Example skeleton (`tests/tools/my_tool_test.py`):
179+
180+
```python
import sys
import types

import pytest
from crewai_tools.tools.my_tool.my_tool import MyTool


@pytest.fixture(autouse=True)
def stub_my_sdk(monkeypatch):
    # MyTool imports `my_sdk` in __init__; a stub module keeps these tests
    # independent of the (hypothetical) SDK being installed.
    monkeypatch.setitem(sys.modules, "my_sdk", types.ModuleType("my_sdk"))


def test_requires_env_var(monkeypatch):
    monkeypatch.delenv("MY_API_KEY", raising=False)
    with pytest.raises(ValueError):
        MyTool()


def test_happy_path(monkeypatch):
    monkeypatch.setenv("MY_API_KEY", "test")
    tool = MyTool()
    result = tool.run(query="hello", limit=2)
    assert "hello" in result
```
198+
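
The rule about not hitting real services can be met with `unittest.mock`. The sketch below is self‑contained and stdlib‑only: `fetch_temperature` and the `api.example.com` URL are hypothetical stand‑ins for a tool's network logic, and the test replaces `urllib.request.urlopen` so no request leaves the machine.

```python
import json
import urllib.request
from unittest.mock import MagicMock, patch
from urllib.parse import quote


def fetch_temperature(city: str) -> str:
    """Hypothetical tool logic that performs one HTTP GET."""
    url = f"https://api.example.com/weather?q={quote(city)}"
    with urllib.request.urlopen(url, timeout=10) as resp:
        data = json.load(resp)
    return f"{city}: {data['temp']}°"


def test_fetch_temperature_uses_mocked_response() -> None:
    # The fake object plays the role of the HTTP response context manager.
    fake = MagicMock()
    fake.__enter__.return_value.read.return_value = b'{"temp": 21}'
    with patch("urllib.request.urlopen", return_value=fake) as mocked:
        assert fetch_temperature("Berlin") == "Berlin: 21°"
    mocked.assert_called_once()


if __name__ == "__main__":
    test_fetch_temperature_uses_mocked_response()
    print("ok")
```

For tools built on `requests`, the same idea applies by patching `requests.get` where the tool module looks it up.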
199+
Run locally:
200+
201+
```bash
uv run pytest
pre-commit run -a
```
205+
206+
---
207+
208+
## Documentation
209+
210+
Each tool must include a `README.md` in its folder with:
211+
- What it does and when to use it
212+
- Required env vars and optional extras (with install snippet)
213+
- Minimal usage example
214+
215+
Update the root `README.md` only if the tool introduces a new category or notable capability.
216+
217+
---
218+
219+
## Discovery and specs
220+
221+
Our internal tooling discovers classes whose names end with `Tool`. Keep your class exported from the module path under `crewai_tools/tools/...` to be picked up by scripts like `generate_tool_specs.py`.
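
To make the naming rule concrete, name‑based discovery can be approximated with `inspect`. This is only an illustration of the idea; it is not the code in `generate_tool_specs.py`, and `discover_tool_classes` is a hypothetical function.

```python
import inspect
import types
from typing import List


def discover_tool_classes(module: types.ModuleType) -> List[str]:
    """Return names of classes defined in `module` that end with 'Tool'."""
    return sorted(
        name
        for name, obj in inspect.getmembers(module, inspect.isclass)
        # Skip classes that were merely imported into the module.
        if name.endswith("Tool") and obj.__module__ == module.__name__
    )


if __name__ == "__main__":
    # Build a throwaway module so the example runs without any package installed.
    demo = types.ModuleType("demo_tools")
    exec("class WeatherTool: pass\nclass WeatherHelper: pass", demo.__dict__)
    print(discover_tool_classes(demo))
```

A class named `WeatherHelper` would be skipped by a scan like this, which is why the `Tool` suffix matters.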
222+
223+
---
224+
225+
## Full example: “Weather Search Tool”
226+
227+
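This example demonstrates `args_schema`, `env_vars`, `package_dependencies`, input validation, and robust error handling. It imports `requests` at module level, so it does not show the lazy‑import pattern; see the BaseTool pattern above for that.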
228+
229+
```python
# file: crewai_tools/tools/weather_tool/weather_tool.py
import os
from typing import Any, List, Optional, Type

import requests
from pydantic import BaseModel, Field
from crewai.tools import BaseTool, EnvVar


class WeatherToolInput(BaseModel):
    """Input schema for WeatherTool."""

    city: str = Field(..., description="City name, e.g., 'Berlin'")
    country: Optional[str] = Field(None, description="ISO country code, e.g., 'DE'")
    units: str = Field(
        default="metric",
        description="Units system: 'metric' or 'imperial'",
        pattern=r"^(metric|imperial)$",
    )


class WeatherTool(BaseTool):
    name: str = "Weather Search"
    description: str = (
        "Look up current weather for a city using a public weather API."
    )
    args_schema: Type[BaseModel] = WeatherToolInput

    env_vars: List[EnvVar] = [
        EnvVar(
            name="WEATHER_API_KEY",
            description="API key for the weather service",
            required=True,
        ),
    ]
    package_dependencies: List[str] = ["requests"]

    base_url: str = "https://api.openweathermap.org/data/2.5/weather"

    def __init__(self, **kwargs: Any) -> None:
        super().__init__(**kwargs)
        # Fail early when the API key is missing
        if "WEATHER_API_KEY" not in os.environ:
            raise ValueError("WEATHER_API_KEY is required for WeatherTool")

    def _run(self, city: str, country: Optional[str] = None, units: str = "metric") -> str:
        try:
            q = f"{city},{country}" if country else city
            params = {
                "q": q,
                "units": units,
                "appid": os.environ["WEATHER_API_KEY"],
            }
            resp = requests.get(self.base_url, params=params, timeout=10)
            resp.raise_for_status()
            data = resp.json()

            # `or [{}]` also covers a present-but-empty "weather" list
            weather = (data.get("weather") or [{}])[0]
            main = weather.get("main", "Unknown")
            desc = weather.get("description", "")
            temp = data.get("main", {}).get("temp")
            feels = data.get("main", {}).get("feels_like")
            city_name = data.get("name", city)

            return (
                f"Weather in {city_name}: {main} ({desc}). "
                f"Temperature: {temp}°, feels like {feels}°."
            )
        except requests.Timeout:
            return "Weather service timed out. Please try again later."
        except requests.HTTPError as e:
            # Truncate the body so the message stays short in the transcript
            return f"Weather service error: {e.response.status_code} {e.response.text[:120]}"
        except Exception as e:
            return f"Unexpected error fetching weather: {e}"
```
301+
302+
Folder layout:
303+
304+
```
crewai_tools/tools/weather_tool/
├─ weather_tool.py
└─ README.md
```
309+
310+
The tool's `README.md` should document the required env vars and show a usage example.
311+
312+
---
313+
314+
## PR checklist
315+
- [ ] Tool lives under `crewai_tools/tools/<name>/`
316+
- [ ] Class ends with `Tool` and subclasses `BaseTool` (or `RagTool`)
317+
- [ ] Precise `args_schema` with descriptions and validation
318+
- [ ] `env_vars` declared (if any) and validated
319+
- [ ] `package_dependencies` and optional extras added in `pyproject.toml` (if any)
320+
- [ ] Clear error handling; no prints
321+
- [ ] Unit tests added (`tests/tools/`), fast and deterministic
322+
- [ ] Tool `README.md` with usage and env vars
323+
- [ ] `pre-commit` and `pytest` pass locally
324+
325+
---
326+
327+
## Tips for great DX
328+
- Keep responses short and useful—agents quote your tool output directly.
329+
- Validate early; fail fast with actionable guidance.
330+
- Prefer lazy imports; minimize default install surface.
331+
- Mirror patterns from similar tools in this repo for a consistent developer experience.
332+
333+
Happy building!
334+
335+
