@@ -154,64 +154,3 @@ evaluator = DeepevalEvaluator(
     prometheus_config=prometheus_config,
 )
 ```
-
-## Complete Example
-
-The following is a complete example of using the DeepEval evaluator. It defines a [GEval](https://deepeval.com/docs/metrics-llm-evals) metric and a [ToolCorrectnessMetric](https://deepeval.com/docs/metrics-tool-correctness) metric, which assess overall output quality and tool-call correctness respectively, and reports the evaluation results to the VMP platform on Volcano Engine:
-
-```python
-import asyncio
-import os
-from builtin_tools.agent import agent
-
-from deepeval.metrics import GEval, ToolCorrectnessMetric
-from deepeval.test_case import LLMTestCaseParams
-from veadk.config import getenv
-from veadk.evaluation.deepeval_evaluator import DeepevalEvaluator
-from veadk.evaluation.utils.prometheus import PrometheusPushgatewayConfig
-from veadk.prompts.prompt_evaluator import eval_principle_prompt
-
-prometheus_config = PrometheusPushgatewayConfig()
-
-# 1. Roll out the agent and generate an eval set file
-# await agent.run(
-#     prompt,
-#     collect_runtime_data=True,
-#     eval_set_id=f"eval_demo_set_{get_current_time()}",
-# )
-# # Get the expected output
-# dump_path = agent._dump_path
-# assert dump_path != "", "Failed to dump the eval set file! Please check the runtime logs."
-
-# 2. Evaluate against the eval set file
-evaluator = DeepevalEvaluator(
-    agent=agent,
-    judge_model_name=getenv("MODEL_JUDGE_NAME"),
-    judge_model_api_base=getenv("MODEL_JUDGE_API_BASE"),
-    judge_model_api_key=getenv("MODEL_JUDGE_API_KEY"),
-    prometheus_config=prometheus_config,
-)
-
-# 3. Define evaluation metrics
-metrics = [
-    GEval(
-        threshold=0.8,
-        name="Base Evaluation",
-        criteria=eval_principle_prompt,
-        evaluation_params=[
-            LLMTestCaseParams.INPUT,
-            LLMTestCaseParams.ACTUAL_OUTPUT,
-            LLMTestCaseParams.EXPECTED_OUTPUT,
-        ],
-    ),
-    ToolCorrectnessMetric(
-        threshold=0.5,
-    ),
-]
-
-# 4. Run evaluation
-eval_set_file_path = os.path.join(
-    os.path.dirname(__file__), "builtin_tools", "evalsetf0aef1.evalset.json"
-)
-await evaluator.eval(eval_set_file_path=eval_set_file_path, metrics=metrics)
-```
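
Note that the removed example imports `asyncio` and uses `await` at module level, which only works inside an already-running event loop (for example, a notebook). To run it as a plain script, the awaited calls need an async entry point. Below is a minimal sketch, assuming the `agent`, `evaluator`, and `metrics` objects defined in the example above; the prompt and `eval_set_id` values are purely illustrative:

```python
import asyncio
import os

# Assumes `agent`, `evaluator`, and `metrics` are already defined
# as in the example above.


async def main() -> None:
    # Optional rollout step, mirroring the commented-out section of the example
    # (the prompt and eval_set_id below are illustrative placeholders):
    # await agent.run("your prompt", collect_runtime_data=True, eval_set_id="eval_demo_set")

    # Evaluate against a previously dumped eval set file.
    eval_set_file_path = os.path.join(
        os.path.dirname(__file__), "builtin_tools", "evalsetf0aef1.evalset.json"
    )
    await evaluator.eval(eval_set_file_path=eval_set_file_path, metrics=metrics)


if __name__ == "__main__":
    asyncio.run(main())
```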