Docs: fix confusion regarding getting started (#815)
fixes: #290
- [x] added data preparation notebook
- [x] changed the order and introduced a flowchart: thanks to https://github.com/mtharrison
- [x] moved contexts from the test set to metadata
docs/getstarted/evaluation.md: 12 additions & 8 deletions
@@ -3,6 +3,10 @@
Once your test set is ready (whether you've created your own or used the [synthetic test set generation module](get-started-testset-generation)), it's time to evaluate your RAG pipeline. This guide helps you set up Ragas as quickly as possible, so you can focus on improving your Retrieval Augmented Generation pipeline while the library checks that your changes actually improve it end to end.
This guide utilizes OpenAI for running some metrics, so ensure you have your OpenAI key ready and available in your environment.
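Since the OpenAI-backed metrics read the key from the environment, it helps to set it before running anything. A minimal sketch of one way to do this in Python, assuming the standard `OPENAI_API_KEY` environment variable (the value shown is a placeholder, not a real key):

```python
import os

# Make the key visible to any library that reads the standard
# OPENAI_API_KEY environment variable; replace the placeholder.
os.environ["OPENAI_API_KEY"] = "sk-..."
```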
@@ -17,26 +21,26 @@ Let's begin with the data.
## The Data

-For this tutorial, we'll use an example dataset from one of the baselines we created for the [Amnesty QA](https://huggingface.co/datasets/explodinggradients/amnesty_qa) dataset. The dataset contains the following columns:
+For this tutorial, we'll use an example dataset that we created using the example in [data preparation](./prepare_data.ipynb). The dataset contains the following columns:

- question: `list[str]` - These are the questions your RAG pipeline will be evaluated on.
- context: `list[list[str]]` - The contexts which were passed into the LLM to answer the question.
-- ground_truth: `list[str]` - The ground truth answer to the questions.
+- ground_truth: `str` - The ground truth answer to the question.
+- answer: `str` - The answer generated by the RAG pipeline.

An ideal test data set should contain samples that closely mirror your real-world use case.
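To make the expected shape concrete, here is a minimal sketch that assembles a single-row toy dataset with these four columns and scores it. It assumes a Hugging Face `datasets.Dataset`, the `ragas.evaluate` entry point, and two illustrative metrics; the column names simply mirror the list above, so check them against the version of Ragas you run:

```python
from datasets import Dataset
from ragas import evaluate
from ragas.metrics import answer_relevancy, faithfulness

# One toy row shaped like the columns described above (illustrative values only).
data = {
    "question": ["What is the capital of France?"],
    "context": [["Paris is the capital and largest city of France."]],  # retrieved chunks per question
    "ground_truth": ["Paris"],
    "answer": ["The capital of France is Paris."],
}

dataset = Dataset.from_dict(data)

# Runs the selected metrics; requires OPENAI_API_KEY in the environment.
result = evaluate(dataset, metrics=[faithfulness, answer_relevancy])
print(result)
```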