## What is RAG Evaluation?
RAG (Retrieval-Augmented Generation) is a way of building AI models that enhances their ability to generate accurate and contextually relevant responses by combining two main steps: **retrieval** and **generation**.

1. **Retrieval**: The model first searches through a large set of documents or pieces of information to "retrieve" the most relevant ones based on the user query.
2. **Generation**: It then uses these retrieved documents as context to generate a response, which is typically more accurate and aligned with the question than if it had generated text from scratch without specific guidance.
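
To make these two steps concrete, below is a minimal sketch of a RAG loop. The keyword-overlap retriever and the placeholder `generate` function are illustrative assumptions; a real system would use a vector store and an LLM API instead.

```python
def retrieve(query: str, documents: list[str], top_k: int = 2) -> list[str]:
    """Toy retriever: rank documents by word overlap with the query."""
    query_words = set(query.lower().split())
    ranked = sorted(
        documents,
        key=lambda doc: len(query_words & set(doc.lower().split())),
        reverse=True,
    )
    return ranked[:top_k]

def generate(query: str, context: list[str]) -> str:
    """Placeholder for the LLM call that answers the query from the context."""
    return f"Answer to {query!r}, grounded in: {' '.join(context)}"

documents = [
    "The capital of Italy is Rome.",
    "Rome was the capital of the Roman Empire.",
    "Paris is the capital of France.",
]
query = "What is the capital of Italy?"
context = retrieve(query, documents)  # step 1: retrieval
response = generate(query, context)   # step 2: generation
print(response)
```
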
Evaluating RAG involves assessing how well the model does in both retrieval and generation.

Our RAG evaluation module analyzes the three main components of a RAG framework:

- **User Input**: The query or question posed by the user.
- **Context**: The retrieved documents or information that the model uses to generate a response. A context can consist of one or more chunks of text.
- **Response**: The generated answer or output provided by the model.
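
Taken together, a single evaluation sample might look as follows; the field names are illustrative assumptions, not the module's actual schema.

```python
# Hypothetical shape of one RAG evaluation sample
# (field names are illustrative, not the module's actual schema).
sample = {
    "user_input": "What is the capital of Italy?",
    "context": [  # one or more retrieved chunks of text
        "The capital of Italy is Rome.",
        "Rome was the capital of the Roman Empire.",
    ],
    "response": "The capital of Italy is Rome.",
}
```
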
In particular, the analysis is performed on the relationships between these components: **User Input - Context**, **Context - Response**, and **User Input - Response**.

The evaluation is performed through an LLM-as-a-Judge approach, where a Language Model (LM) acts as a judge to evaluate the quality of a RAG model.

Below are the metrics computed by the RAG evaluation module, divided into the three relationships described above.

### Context - Response

- **Faithfulness**: Measures how much the response contradicts the retrieved context. A higher faithfulness score indicates that the response is more aligned with the context. The score ranges from 1 to 5, with 5 being the highest faithfulness.
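
As an illustration of the LLM-as-a-Judge approach applied to this metric, a faithfulness judgment could be elicited with a prompt along the following lines; the wording and names are a hypothetical sketch, not the module's actual prompt.

```python
# Hypothetical judge prompt for the Faithfulness metric (1-5 scale);
# illustrative only, not the module's actual prompt.
FAITHFULNESS_PROMPT = """\
You are a judge evaluating a RAG system.

Context:
{context}

Response:
{response}

Rate the faithfulness of the response on a scale from 1 to 5, where 5
means it is fully supported by the context and 1 means it contradicts
the context. Reply with a single integer.
"""

prompt = FAITHFULNESS_PROMPT.format(
    context="The capital of Italy is Rome.",
    response="Rome is the capital of Italy.",
)
# `prompt` is then sent to the judge LLM and its integer reply is parsed
# as the faithfulness score.
```
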
### User Input - Response

- **Satisfaction**: Evaluates how satisfied the user would be with the generated response. The score ranges from 1 to 5, where 1 indicates a response that does not address the user query and 5 a response that fully addresses and answers it.

## What is the required data?
The RAG evaluation module computes the metrics based on the data available for each sample.
If a sample lacks one of the three components (User Input, Context, or Response), only the applicable metrics are computed.
For instance, if a sample does not have a response, only the **User Input - Context** metrics are computed.
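
A sketch of this selection logic, assuming each metric declares which components it requires; the metric names and field names here are illustrative, not the module's actual identifiers.

```python
# Illustrative mapping from each metric to the components it requires
# (metric and field names are assumptions, not the module's identifiers).
METRIC_REQUIREMENTS = {
    "faithfulness": {"context", "response"},
    "satisfaction": {"user_input", "response"},
    "context_relevance": {"user_input", "context"},
}

def applicable_metrics(sample: dict) -> list[str]:
    """Return the metrics whose required components are present in the sample."""
    present = {key for key, value in sample.items() if value}
    return [name for name, needed in METRIC_REQUIREMENTS.items() if needed <= present]

# A sample without a response: only the User Input - Context metric applies.
sample = {
    "user_input": "What is the capital of Italy?",
    "context": "The capital of Italy is Rome.",
    "response": None,
}
print(applicable_metrics(sample))  # ['context_relevance']
```
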
If data added to a [Task] contains contexts with multiple chunks of text, a [context separator](../task.md#retrieval-augmented-generation) must be provided.

On the left side of the web app page, the Task menu is present, with links to the above-mentioned modules and the Task settings.

## Task Type

Moreover, in this Task, the Prediction is a text as well, and the input is composed of the user query and the retrieved context.

RAG Tasks have an additional attribute, *context separator*, which is a string used to separate the retrieved context into chunks. Context data is sent as a single string; however, in RAG settings multiple documents can be retrieved, and the context separator is used to distinguish them. It is optional, since a single context can be provided.

!!! example
    Context separator: <<sep\>\>

    Context data: The capital of Italy is Rome.<<sep\>\>Rome is the capital of Italy.<<sep\>\>Rome was the capital of the Roman Empire.
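
Presumably, the chunks are recovered by splitting the context string on the separator, along these lines (a sketch of the assumed behavior, not the actual implementation):

```python
# Split a context string into its chunks using the context separator
# (a sketch of the assumed behavior, not the actual implementation).
separator = "<<sep>>"
context_data = (
    "The capital of Italy is Rome."
    "<<sep>>Rome is the capital of Italy."
    "<<sep>>Rome was the capital of the Roman Empire."
)
chunks = context_data.split(separator)
# chunks == ['The capital of Italy is Rome.',
#            'Rome is the capital of Italy.',
#            'Rome was the capital of the Roman Empire.']
```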