Commit 8981743

feat: agent&tool docs/tests/examples (#319)
* feat: init agent&tool architecture
* feat: agent&tool docs/tests/examples
* fix bugs
1 parent b413024 commit 8981743

10 files changed: +2191 -2 lines changed

.github/workflows/IntegrationTest.yml

Lines changed: 1 addition & 0 deletions
@@ -26,6 +26,7 @@ jobs:
 pip install pytest
 if [ -f requirements/runtime.txt ]; then pip install -r requirements/runtime.txt; fi
 pip install pyspark
+pip install tavily-python
 pip install -e .
 
 - name: Check Python syntax and imports

README.md

Lines changed: 66 additions & 1 deletion
@@ -69,7 +69,7 @@
 
 🤖 **RAG System Assessment** - Comprehensive evaluation of retrieval and generation quality with 5 academic-backed metrics
 
-🧠 **LLM & Rule Hybrid** - Combine fast heuristic rules (30+ built-in) with LLM-based deep assessment
+🧠 **LLM & Rule & Agent Hybrid** - Combine fast heuristic rules (30+ built-in) with LLM-based deep assessment
 
 🚀 **Flexible Execution** - Run locally for rapid iteration or scale with Spark for billion-scale datasets
 
@@ -384,6 +384,12 @@ input_data = {
 ✅ Vision-Language Models (InternVL, Gemini)
 ✅ Custom prompt registration
 
+**Agent-Based** - Multi-step reasoning with tools
+✅ Web search integration (Tavily)
+✅ Adaptive context gathering
+✅ Multi-source fact verification
+✅ Custom agent & tool registration
+
 **Extensible Architecture**
 ✅ Plugin-based rule/prompt/model registration
 ✅ Clean separation of concerns (agents, tools, orchestration)
@@ -492,6 +498,65 @@ class CustomEvaluator(BaseOpenAI):
 - [Custom Rules](examples/register/sdk_register_rule.py)
 - [Custom Models](examples/register/sdk_register_llm.py)
 
+### Agent-Based Evaluation with Tools
+
+Dingo supports agent-based evaluators that can use external tools for multi-step reasoning and adaptive context gathering:
+
+```python
+from dingo.io import Data
+from dingo.io.output.eval_detail import EvalDetail
+from dingo.model import Model
+from dingo.model.llm.agent.base_agent import BaseAgent
+
+@Model.llm_register('MyAgent')
+class MyAgent(BaseAgent):
+    """Custom agent with tool support"""
+
+    available_tools = ["tavily_search", "my_custom_tool"]
+    max_iterations = 5
+
+    @classmethod
+    def eval(cls, input_data: Data) -> EvalDetail:
+        # Use tools for fact-checking
+        search_result = cls.execute_tool('tavily_search', query=input_data.content)
+
+        # Multi-step reasoning with LLM
+        result = cls.send_messages([...])
+
+        return EvalDetail(...)
+```
+
+**Built-in Agent:**
+- `AgentHallucination`: Enhanced hallucination detection with web search fallback
+
+**Configuration Example:**
+```json
+{
+  "evaluator": [{
+    "evals": [{
+      "name": "AgentHallucination",
+      "config": {
+        "key": "openai-api-key",
+        "model": "gpt-4",
+        "parameters": {
+          "agent_config": {
+            "max_iterations": 5,
+            "tools": {
+              "tavily_search": {"api_key": "tavily-key"}
+            }
+          }
+        }
+      }
+    }]
+  }]
+}
+```
+
+**Learn More:**
+- [Agent Development Guide](docs/agent_development_guide.md) - Comprehensive guide for creating custom agents and tools
+- [AgentHallucination Example](examples/agent/agent_hallucination_example.py) - Production agent example
+- [AgentFactCheck Example](examples/agent/agent_executor_example.py) - LangChain agent example
+
 ## ⚙️ Execution Modes
 
 ### Local Executor (Development & Small-Scale)
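
Note on the README example in this hunk: `MyAgent` declares `available_tools` and `max_iterations`, then calls `cls.execute_tool(...)`. The sketch below illustrates that pattern in isolation, with no Dingo imports, so it runs offline. Everything in it is an assumption made for illustration: the `TOOLS` registry, the `register_tool` decorator, the `fake_search` tool, and the `SketchAgent` class are hypothetical stand-ins, and Dingo's actual `BaseAgent` and tool registry may behave differently.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for a tool registry; not Dingo's real registry.
TOOLS: Dict[str, Callable[..., dict]] = {}


def register_tool(name: str):
    """Hypothetical decorator that stores a callable under a tool name."""
    def decorator(func: Callable[..., dict]) -> Callable[..., dict]:
        TOOLS[name] = func
        return func
    return decorator


@register_tool("fake_search")
def fake_search(query: str) -> dict:
    """Offline stand-in for a web search tool; returns canned results."""
    corpus = {"capital of france": "Paris is the capital of France."}
    hit = corpus.get(query.lower())
    return {"success": hit is not None, "results": [hit] if hit else []}


class SketchAgent:
    """Illustrative agent: an allow-list of tools plus an iteration cap."""

    available_tools = ["fake_search"]
    max_iterations = 3

    @classmethod
    def execute_tool(cls, name: str, **kwargs) -> dict:
        # Refuse tools that the agent has not declared.
        if name not in cls.available_tools:
            raise ValueError(f"tool {name!r} is not available to this agent")
        return TOOLS[name](**kwargs)

    @classmethod
    def gather_context(cls, queries: List[str]) -> List[str]:
        """Try at most max_iterations queries, stopping at the first hit."""
        context: List[str] = []
        for query in queries[: cls.max_iterations]:
            result = cls.execute_tool("fake_search", query=query)
            if result["success"]:
                context.extend(result["results"])
                break
        return context
```

The allow-list check and the iteration cap are the two guard rails the README example implies; a real agent would pass the gathered context to the LLM before building its `EvalDetail`.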

README_zh-CN.md

Lines changed: 66 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -68,7 +68,7 @@
6868

6969
🤖 **RAG 系统评估** - 使用 5 个学术支持的指标全面评估检索和生成质量
7070

71-
🧠 **LLM 与规则混合** - 结合快速启发式规则(30+ 内置规则)和基于 LLM 的深度评估
71+
🧠 **LLM、规则和智能体混合** - 结合快速启发式规则(30+ 内置规则)和基于 LLM 的深度评估
7272

7373
🚀 **灵活执行** - 本地运行快速迭代,或使用 Spark 扩展到数十亿级数据集
7474

@@ -383,6 +383,12 @@ input_data = {
 ✅ Vision-Language Models (InternVL, Gemini)
 ✅ Custom prompt registration
 
+**Agent-Based** - Multi-step reasoning with tools
+✅ Web search integration (Tavily)
+✅ Adaptive context gathering
+✅ Multi-source fact verification
+✅ Custom agent & tool registration
+
 **Extensible Architecture**
 ✅ Plugin-based rule/prompt/model registration
 ✅ Clean separation of concerns (agents, tools, orchestration)
@@ -486,6 +492,65 @@ class MyCustomModel(BaseOpenAI):
 - [Register Rules](examples/register/sdk_register_rule.py)
 - [Register Models](examples/register/sdk_register_llm.py)
 
+### Agent Evaluation with Tools
+
+Dingo supports agent-based evaluators that can use external tools for multi-step reasoning and adaptive context gathering:
+
+```python
+from dingo.io import Data
+from dingo.io.output.eval_detail import EvalDetail
+from dingo.model import Model
+from dingo.model.llm.agent.base_agent import BaseAgent
+
+@Model.llm_register('MyAgent')
+class MyAgent(BaseAgent):
+    """Custom agent with tool support"""
+
+    available_tools = ["tavily_search", "my_custom_tool"]
+    max_iterations = 5
+
+    @classmethod
+    def eval(cls, input_data: Data) -> EvalDetail:
+        # Use tools for fact-checking
+        search_result = cls.execute_tool('tavily_search', query=input_data.content)
+
+        # Multi-step reasoning with the LLM
+        result = cls.send_messages([...])
+
+        return EvalDetail(...)
+```
+
+**Built-in Agent:**
+- `AgentHallucination`: Enhanced hallucination detection with web search fallback
+
+**Configuration Example:**
+```json
+{
+  "evaluator": [{
+    "evals": [{
+      "name": "AgentHallucination",
+      "config": {
+        "key": "openai-api-key",
+        "model": "gpt-4",
+        "parameters": {
+          "agent_config": {
+            "max_iterations": 5,
+            "tools": {
+              "tavily_search": {"api_key": "tavily-key"}
+            }
+          }
+        }
+      }
+    }]
+  }]
+}
+```
+
+**Learn More:**
+- [Agent Development Guide](docs/agent_development_guide.md)
+- [AgentHallucination Example](examples/agent/agent_hallucination_example.py)
+- [AgentFactCheck LangChain Example](examples/agent/agent_executor_example.py)
+
 ## Execution Engines
 
 ### Local Execution
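
Note on the configuration example added to both READMEs: tool settings are nested under `config.parameters.agent_config.tools`. As a rough sketch of how a consumer could read those values with only the standard library, the snippet below parses the same JSON layout. The helper name `extract_agent_config` and the fallback defaults are hypothetical and not part of Dingo; the key values are the placeholders from the README.

```python
import json

# JSON layout copied from the README example in this commit (placeholder keys).
CONFIG_TEXT = """
{
  "evaluator": [{
    "evals": [{
      "name": "AgentHallucination",
      "config": {
        "key": "openai-api-key",
        "model": "gpt-4",
        "parameters": {
          "agent_config": {
            "max_iterations": 5,
            "tools": {
              "tavily_search": {"api_key": "tavily-key"}
            }
          }
        }
      }
    }]
  }]
}
"""


def extract_agent_config(config: dict, eval_name: str) -> dict:
    """Hypothetical helper: return the agent_config dict of one eval entry."""
    for group in config.get("evaluator", []):
        for entry in group.get("evals", []):
            if entry.get("name") == eval_name:
                params = entry.get("config", {}).get("parameters", {})
                return params.get("agent_config", {})
    raise KeyError(f"no eval named {eval_name!r}")


agent_config = extract_agent_config(json.loads(CONFIG_TEXT), "AgentHallucination")
# Assumed fallbacks for illustration: one iteration and no tools if keys are absent.
max_iterations = agent_config.get("max_iterations", 1)
tool_names = sorted(agent_config.get("tools", {}))
```

Reading the nested keys with `.get` and explicit fallbacks keeps a missing `agent_config` from raising deep inside an evaluator; whether Dingo applies the same defaults is not stated in this commit.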
