Docs: fix confusion regarding getting started (#815)
fixes: #290
- [x] added data preparation notebook
- [x] changed the order and introduced a flowchart: thanks to https://github.com/mtharrison
- [x] moved contexts from the test set to metadata
docs/getstarted/evaluation.md: 12 additions & 8 deletions
@@ -3,6 +3,10 @@
Once your test set is ready (whether you've created your own or used the [synthetic test set generation module](get-started-testset-generation)), it's time to evaluate your RAG pipeline. This guide helps you set up Ragas as quickly as possible, so you can focus on improving your Retrieval Augmented Generation pipeline while the library checks that your changes actually improve it end to end.
This guide utilizes OpenAI for running some metrics, so ensure you have your OpenAI key ready and available in your environment.
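Since the OpenAI-backed metrics read the key from the environment, it helps to set it before running anything. A minimal sketch of one way to do this in Python, assuming the standard `OPENAI_API_KEY` environment variable (the value shown is a placeholder, not a real key):

```python
import os

# Make the key visible to any library that reads the standard
# OPENAI_API_KEY environment variable; replace the placeholder.
os.environ["OPENAI_API_KEY"] = "sk-..."
```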
@@ -17,26 +21,26 @@ Let's begin with the data.
## The Data

-For this tutorial, we'll use an example dataset from one of the baselines we created for the [Amnesty QA](https://huggingface.co/datasets/explodinggradients/amnesty_qa) dataset. The dataset contains the following columns:
+For this tutorial, we'll use an example dataset that we created using the example in [data preparation](./prepare_data.ipynb). The dataset contains the following columns:

- question: `list[str]` - These are the questions your RAG pipeline will be evaluated on.
- context: `list[list[str]]` - The contexts which were passed into the LLM to answer the question.
-- ground_truth: `list[str]` - The ground truth answer to the questions.
+- ground_truth: `str` - The ground truth answer to the question.
+- answer: `str` - The answer generated by the RAG pipeline.

An ideal test data set should contain samples that closely mirror your real-world use case.
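To make the expected shape concrete, here is a minimal sketch that assembles a single-row toy dataset with these four columns and scores it. It assumes a Hugging Face `datasets.Dataset`, the `ragas.evaluate` entry point, and two illustrative metrics; the column names simply mirror the list above, so check them against the version of Ragas you run:

```python
from datasets import Dataset
from ragas import evaluate
from ragas.metrics import answer_relevancy, faithfulness

# One toy row shaped like the columns described above (illustrative values only).
data = {
    "question": ["What is the capital of France?"],
    "context": [["Paris is the capital and largest city of France."]],  # retrieved chunks per question
    "ground_truth": ["Paris"],
    "answer": ["The capital of France is Paris."],
}

dataset = Dataset.from_dict(data)

# Runs the selected metrics; requires OPENAI_API_KEY in the environment.
result = evaluate(dataset, metrics=[faithfulness, answer_relevancy])
print(result)
```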