Commit 71f882f

chore: Update dashboard description
1 parent d027631 commit 71f882f

1 file changed: +2 -1 lines changed

agents_mcp_usage/multi_mcp/eval_multi_mcp/dashboard_config.py

Lines changed: 2 additions & 1 deletion
@@ -25,7 +25,8 @@
 "\n\nMerbench tests this ability by providing an LLM Agent access to an MCP server that both validates "
 "and provides error messages to guide correction of syntax. There are three different difficulty levels (test cases), "
 "and the LLM is given a fixed number of attempts to fix the diagram, if this is exceeded, the test case is considered failed. "
-"\n\nThis leaderboard shows the average success rate across all selected models and difficulty levels."
+"\n\n **Performance is a measure of both tool usage, and Mermaid syntax understanding.**"
+"\n\nThe leaderboard shows the average success rate across all selected models and difficulty levels over *n runs*."
 ),
 "icon": "🧜‍♀️", # Emoji for the browser tab
 # --- Primary Metric Configuration ---
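
The updated description says the leaderboard reports an average success rate across models and difficulty levels over n runs. As a rough illustration of one plausible reading of that metric, the sketch below averages pass/fail outcomes per model. The record layout, the model and difficulty names, and the `success_rates` helper are all hypothetical and are not taken from the repository.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical run records: (model, difficulty, run_index, succeeded).
# A run counts as failed when the attempt limit was exceeded.
runs = [
    ("model-a", "easy", 0, True),
    ("model-a", "easy", 1, True),
    ("model-a", "hard", 0, False),
    ("model-a", "hard", 1, True),
    ("model-b", "easy", 0, True),
    ("model-b", "easy", 1, False),
    ("model-b", "hard", 0, False),
    ("model-b", "hard", 1, False),
]


def success_rates(records):
    """Return {model: mean success rate over all difficulties and runs}."""
    outcomes = defaultdict(list)
    for model, _difficulty, _run, succeeded in records:
        outcomes[model].append(1.0 if succeeded else 0.0)
    return {model: mean(values) for model, values in outcomes.items()}


rates = success_rates(runs)
```

Pooling every run gives each difficulty level a weight proportional to its number of runs. Averaging per difficulty first and then across difficulties would be an equally reasonable choice, and the commit does not say which one the dashboard uses.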
