
Commit f466768

chbradsh and changliu2 authored
Update scenarios/evaluate/Supported_Evaluation_Metrics/RAG_Evaluation/README.md
Co-authored-by: changliu2 <99364750+changliu2@users.noreply.github.com>
1 parent 357509e commit f466768

File tree

  • scenarios/evaluate/Supported_Evaluation_Metrics/RAG_Evaluation

1 file changed: +1 −1 lines changed

scenarios/evaluate/Supported_Evaluation_Metrics/RAG_Evaluation/README.md

Lines changed: 1 addition & 1 deletion
@@ -26,7 +26,7 @@ This tutorial includes two notebooks as best practices to cover these important

  - [Evaluate and Optimize a RAG retrieval system end to end](https://aka.ms/knowledge-agent-eval-sample): Complex queries are a common scenario for advanced RAG retrieval systems. In both principle and practice, [agentic RAG](https://aka.ms/agentRAG) is a more advanced pattern than traditional RAG in agentic scenarios. By using the Agentic Retrieval API of Azure AI Search in Azure AI Foundry, we observe [up to 40% better relevance for complex queries than our baselines](https://techcommunity.microsoft.com/blog/Azure-AI-Services-blog/up-to-40-better-relevance-for-complex-queries-with-new-agentic-retrieval-engine/4413832/). After onboarding to agentic retrieval, it's a best practice to evaluate the end-to-end response of the RAG system with the [Groundedness](http://aka.ms/groundedness-doc) and [Relevance](http://aka.ms/relevance-doc) evaluators. With the ability to assess end-to-end quality for one set of RAG parameters, you can perform a "parameter sweep" over another set to fine-tune and optimize the parameters of the agentic retrieval pipeline (see the evaluator sketch after the diff).

- - [Parameter Sweep: evaluating and optimizing RAG document retrieval quality](https://aka.ms/doc-retrieval-sample): Document retrieval quality is a common bottleneck in RAG workflows. To address this, one best practice is to optimize your RAG search parameters according to your enterprise data. For advanced scenarios where you can curate ground-truth relevance labels for document retrieval results (commonly called qrels), it’s a best practice to "sweep" and optimize the parameters by evaluating the document retrieval quality using golden metrics such as [NDCG](https://en.wikipedia.org/wiki/Discounted_cumulative_gain).
+ - [Evaluate and Optimize RAG document retrieval quality](https://aka.ms/doc-retrieval-sample): Document retrieval quality is a common bottleneck in RAG workflows. To address this, one best practice is to optimize your RAG search retrieval parameters according to your enterprise data using golden metrics such as [NDCG](https://en.wikipedia.org/wiki/Discounted_cumulative_gain). This is an advanced scenario where you can curate ground-truth relevance labels for document retrieval results (commonly called qrels) through human subject matter experts, AI-assisted tools such as GitHub Copilot, or the [Relevance](http://aka.ms/relevance-doc) evaluator applied to each document. After curating such input data, you can perform a "parameter sweep" to fine-tune and optimize the parameters by evaluating document retrieval quality with golden metrics and per-document labels for more precise measurement (see the NDCG sketch after the diff).

### Objective
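
For the end-to-end bullet above, here is a minimal sketch of scoring a single RAG turn with the Groundedness and Relevance evaluators. It assumes the `azure-ai-evaluation` Python package and an Azure OpenAI judge deployment; the endpoint, key, and deployment name are placeholders, and exact evaluator signatures may vary across SDK versions, so treat this as illustrative rather than the notebook's code.

```python
# Illustrative sketch (not the notebook's exact code): evaluate one RAG turn
# with the Groundedness and Relevance evaluators from azure-ai-evaluation.
# The endpoint, API key, and deployment name below are placeholders.
from azure.ai.evaluation import GroundednessEvaluator, RelevanceEvaluator

model_config = {
    "azure_endpoint": "https://<your-aoai-resource>.openai.azure.com",
    "api_key": "<your-api-key>",
    "azure_deployment": "<your-judge-model-deployment>",
}

groundedness = GroundednessEvaluator(model_config)
relevance = RelevanceEvaluator(model_config)

query = "What is the refund window for online orders?"
context = "Online orders can be refunded within 30 days of delivery."
response = "You can request a refund within 30 days of delivery."

# Each evaluator returns a dict containing a score and accompanying reasoning.
print(groundedness(query=query, response=response, context=context))
print(relevance(query=query, response=response))
```

In a parameter sweep, you would run these evaluators over the same query set for each retrieval configuration and compare the aggregate scores.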
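For the document retrieval bullet, the core metric can be illustrated with a small, self-contained NDCG@k computation over curated qrels. This is a sketch, not the notebook's implementation; the document ids and relevance labels are made up.

```python
import math

def dcg(relevances, k):
    """Discounted cumulative gain over the top-k graded relevance labels."""
    return sum((2 ** rel - 1) / math.log2(rank + 2)
               for rank, rel in enumerate(relevances[:k]))

def ndcg(retrieved_ids, qrels, k=10):
    """NDCG@k for one query.

    retrieved_ids: document ids in the order returned by the retrieval system.
    qrels: dict mapping document id -> graded relevance label (0 = not relevant).
    """
    gains = [qrels.get(doc_id, 0) for doc_id in retrieved_ids]
    ideal = sorted(qrels.values(), reverse=True)
    idcg = dcg(ideal, k)
    return dcg(gains, k) / idcg if idcg > 0 else 0.0

# Hypothetical qrels curated by SMEs or an AI-assisted labeler.
qrels = {"doc_3": 3, "doc_7": 2, "doc_1": 1}
print(ndcg(["doc_3", "doc_1", "doc_9", "doc_7"], qrels, k=4))
```

Averaging this score across the query set for each retrieval configuration is what drives the "parameter sweep" described in the updated bullet.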

Comments (0)