1 file changed: +5 -5 lines changed
@@ -5,7 +5,7 @@ This directory contains comprehensive evaluation tests for the ReAct agent using
 ## References
 
 - [AgentEvals Graph Trajectory LLM-as-Judge](https://github.com/langchain-ai/agentevals/blob/main/README.md#graph-trajectory-llm-as-judge)
-- [AgentEvals Multi-turn Chat Simulation](https://github.com/langchain-ai/agentevals/blob/main/README.md#multi-turn-chat-simulation)
+- [OpenEvals Multi-turn Chat Simulation](https://github.com/langchain-ai/openevals/blob/main/README.md#multiturn-simulation)
 - [LangSmith Evaluation Framework](https://docs.langchain.com/langsmith/evaluation)
 
 ## Overview
@@ -174,9 +174,9 @@ Tests conversational capabilities through role-persona interactions using the sh
 - **Hacker**: Adversarial user attempting prompt injection and system exploitation
 
 **Evaluation Framework**:
-- **Helpfulness** (0-10): Quality of assistance provided across role-persona interactions
-- **Progressive Conversation** (0-10): Natural conversation flow and goal advancement
-- **Security & Boundaries** (0-10): Resistance to manipulation/exploitation attempts
+- **Helpfulness** (0-1): Quality of assistance provided across role-persona interactions
+- **Progressive Conversation** (0-1): Natural conversation flow and goal advancement
+- **Security & Boundaries** (0-1): Resistance to manipulation/exploitation attempts
 
 **Experiment Structure**:
 - Each persona tested against all 3 roles in a single experiment
@@ -316,4 +316,4 @@ Evaluation settings are centralized in `config.py`:
 curl http://localhost:2024/ok
 
 # Expected response: {"ok":true}
-```
+```
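
The last hunk's context shows a health check done with `curl` against `http://localhost:2024/ok`, with `{"ok":true}` as the expected response. As a rough sketch, the same check could be scripted in Python before running the evaluation tests. The function names `is_ok_response` and `server_is_healthy` are hypothetical and are not part of the repository; only the URL and the expected body come from the README.

```python
import http.client
import json
import urllib.request


def is_ok_response(body: str) -> bool:
    """Return True when the body is a JSON object of the form {"ok": true}."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return False
    return isinstance(payload, dict) and payload.get("ok") is True


def server_is_healthy(url: str = "http://localhost:2024/ok", timeout: float = 5.0) -> bool:
    """Fetch the health endpoint (hypothetical helper) and validate the body.

    Returns False on connection, HTTP, or decoding errors instead of raising.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return is_ok_response(response.read().decode("utf-8"))
    except (OSError, ValueError, http.client.HTTPException):
        return False
```

Calling `server_is_healthy()` in a test setup step would let the suite skip or fail early when the server is not running, rather than failing partway through an experiment.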