Remove experimental from docs and fix examples in docs (#2270)

sanjeed5 · web-flow · commit 83d25e3ce07a · 2025-09-16T19:33:38.000-07:00
### Summary
Remove “Experimental” docs and migrate the hello-world content into a
core Get Started tutorial.

### Changes
- Added `docs/getstarted/experiments_quickstart.md` (tutorial: Run your
first experiment).
- Moved assets to `docs/_static/imgs/experiments_quickstart/` and
updated image links to `/_static/...`.
- Updated `mkdocs.yml`: removed the `🧪 Experimental` section; added an
`Experiments` group under Get Started containing “Run your first
experiment” and the prompt, RAG, workflow, and agent tutorials.
- Linked from `docs/getstarted/index.md` to the new tutorial.
- Cross-linked from `docs/concepts/experimentation.md` to the new
tutorial.
- Removed the entire `docs/experimental/` directory.


### Verification
- Building the docs locally succeeds.
- New tutorial renders; Experimental tab is gone; links and assets
resolve.
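
The "links and assets resolve" check above could be automated with something like the following. This is a minimal sketch, not part of this PR: `find_missing_assets` is a hypothetical helper, and it assumes that site-absolute image links such as `/_static/...` map to paths under the `docs/` directory.

```python
import re
from pathlib import Path

# Matches Markdown image syntax: ![alt](target)
IMAGE_LINK = re.compile(r"!\[[^\]]*\]\(([^)\s]+)\)")


def find_missing_assets(docs_root: Path) -> list[tuple[Path, str]]:
    """Return (markdown file, link target) pairs whose image file is not on disk.

    Hypothetical helper: assumes links starting with "/" are relative to
    docs_root, and all other local links are relative to the Markdown file.
    """
    missing = []
    for md_file in sorted(docs_root.rglob("*.md")):
        text = md_file.read_text(encoding="utf-8")
        for target in IMAGE_LINK.findall(text):
            if target.startswith(("http://", "https://")):
                continue  # remote images are out of scope for this check
            if target.startswith("/"):
                # Site-absolute link, e.g. /_static/imgs/...: resolve against docs root
                resolved = docs_root / target.lstrip("/")
            else:
                resolved = md_file.parent / target
            if not resolved.exists():
                missing.append((md_file, target))
    return missing
```

Under those assumptions, calling `find_missing_assets(Path("docs"))` from the repository root should return an empty list when every image referenced by the tutorials is present.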
diff --git a/docs/_static/imgs/experiments_quickstart/hello_world.gif b/docs/_static/imgs/experiments_quickstart/hello_world.gif
diff --git a/docs/_static/imgs/experiments_quickstart/output_first_experiment.png b/docs/_static/imgs/experiments_quickstart/output_first_experiment.png
diff --git a/docs/concepts/experimentation.md b/docs/concepts/experimentation.md
@@ -36,7 +36,7 @@ graph LR
 
 ## Creating Experiments with Ragas
 
-Ragas provides an `@experiment` decorator to streamline the experiment creation process:
+Ragas provides an `@experiment` decorator to streamline the experiment creation process. If you prefer a hands-on intro first, see [Run your first experiment](../getstarted/experiments_quickstart.md).
 
 ### Basic Experiment Structure
 
@@ -149,17 +149,6 @@ return {
 }
 ```
 
-### 4. Comparative Analysis
-
-Use the built-in comparison tools to analyze results:
-
-```python
-# Compare two experiments
-comparison_results = compare_experiments(
-    ["experiments/exp1.csv", "experiments/exp2.csv"]
-)
-```
-
 ## Advanced Experiment Patterns
 
 ### A/B Testing
diff --git a/docs/experimental/core_concepts/index.md b/docs/experimental/core_concepts/index.md
diff --git a/docs/experimental/howtos/index.md b/docs/experimental/howtos/index.md
diff --git a/docs/getstarted/experiments_quickstart.md b/docs/getstarted/experiments_quickstart.md
@@ -1,74 +1,55 @@
-# Ragas Experimental
+# Run your first experiment
 
-A framework for applying Evaluation-Driven Development (EDD) to AI applications.
+This tutorial walks you through running your first experiment with Ragas using the `@experiment` decorator and a local CSV backend.
 
-The goal of Ragas Experimental is to evolve Ragas into a general-purpose evaluation framework for AI applications. It helps teams design, run, and reason about evaluations across any AI workflow. Beyond tooling, it provides a mental model for thinking about evaluations not just as a diagnostic tool, but as the backbone of iterative improvement.
-
-# ✨ Introduction
-
-
-<div class="grid cards" markdown>
-
-- 🚀 **Tutorials**
-
-    Step-by-step guides to help you get started with Ragas Experimental. Learn how to evaluate AI applications like RAGs and agents with practical examples.
-
-    [:octicons-arrow-right-24: Tutorials](tutorials/index.md)
-
-- 📚 **Core Concepts**
-
-    A deeper dive into the principles of evaluation and how Ragas Experimental supports evaluation-driven development for AI applications.
-
-    [:octicons-arrow-right-24: Core Concepts](core_concepts/index.md)
-
-- 🛠️ **How-to Guides**
-
-    Step-by-step guides for specific tasks and goals using Ragas experimental features.
-
-    [:octicons-arrow-right-24: How-to Guides](howtos/index.md)
-
-</div>
+## Prerequisites
 
+- Python 3.9+
+- Ragas installed (see [Installation](./install.md))
 
 ## Hello World 👋
 
-![](hello_world.gif)
+![](/_static/imgs/experiments_quickstart/hello_world.gif)
 
-1\. Install Ragas Experimental with local backend
+### 1. Install (if you haven’t already)
 
 ```bash
-pip install ragas-experimental && pip install "ragas-experimental[local]"
+pip install ragas
 ```
 
-2\. Copy this snippet to a file named `hello_world.py` and run `python hello_world.py` 
+### 2. Create `hello_world.py`
 
+Copy this into a new file and save as `hello_world.py`:
 
 ```python
 import numpy as np
-from ragas_experimental import Dataset
-from ragas import experiment
-from ragas.metrics import MetricResult, discrete_metric  
+from ragas import Dataset, experiment
+from ragas.metrics import MetricResult, discrete_metric
+
 
 # Define a custom metric for accuracy
 @discrete_metric(name="accuracy_score", allowed_values=["pass", "fail"])
 def accuracy_score(response: str, expected: str):
     result = "pass" if expected.lower().strip() == response.lower().strip() else "fail"
     return MetricResult(value=result, reason=f"Match: {result == 'pass'}")
 
+
 # Mock application endpoint that simulates an AI application response
 def mock_app_endpoint(**kwargs) -> str:
     return np.random.choice(["Paris", "4", "Blue Whale", "Einstein", "Python"])
 
+
 # Create an experiment that uses the mock application endpoint and the accuracy metric
 @experiment()
 async def run_experiment(row):
     response = mock_app_endpoint(query=row.get("query"))
     accuracy = accuracy_score.score(response=response, expected=row.get("expected_output"))
     return {**row, "response": response, "accuracy": accuracy.value}
 
+
 if __name__ == "__main__":
     import asyncio
-    
+
     # Create dataset inline
     dataset = Dataset(name="test_dataset", backend="local/csv", root_dir=".")
     test_data = [
@@ -78,22 +59,22 @@ if __name__ == "__main__":
         {"query": "Who developed the theory of relativity?", "expected_output": "Einstein"},
         {"query": "What programming language is named after a snake?", "expected_output": "Python"},
     ]
-    
+
     for sample in test_data:
         dataset.append(sample)
     dataset.save()
-    
+
     # Run experiment
-    results = asyncio.run(run_experiment.arun(dataset, name="first_experiment"))
+    _ = asyncio.run(run_experiment.arun(dataset, name="first_experiment"))
 ```
 
-3\. Check your current directory structure to see the created dataset and experiment results.
+### 3. Inspect the generated files
 
 ```bash
 tree .
 ```
 
-Output:
+You should see:
 
 ```
 ├── datasets
@@ -102,12 +83,18 @@ Output:
     └── first_experiment.csv
 ```
 
-4\. View the results of your first experiment
+### 4. View the results of your first experiment
 
 ```bash
 open experiments/first_experiment.csv
 ```
 
-Output:
+Output preview:
+
+![](/_static/imgs/experiments_quickstart/output_first_experiment.png)
+
+## Next steps
+
+- Learn the concepts behind experiments in [Experiments (Concepts)](../concepts/experimentation.md)
+- Explore evaluation metrics in [Metrics](../concepts/metrics/index.md)
 
-![](output_first_experiment.png)
diff --git a/docs/getstarted/index.md b/docs/getstarted/index.md
@@ -1,6 +1,6 @@
 # 🚀 Get Started
 
-Welcome to the Ragas tutorials! If you're new to Ragas, the Get Started guides will walk you through the fundamentals of working with Ragas. These tutorials assume basic knowledge of Python and building LLM application pipelines. 
+Welcome to Ragas! If you're new to Ragas, the Get Started guides will walk you through the fundamentals of working with Ragas. These tutorials assume basic knowledge of Python and building LLM application pipelines. 
 
 Before you proceed further, ensure that you have [Ragas installed](./install.md)!
 
@@ -13,4 +13,5 @@ Let's get started!
 
 - [Evaluate your first AI app](./evals.md)
 - [Run ragas metrics for evaluating RAG](rag_eval.md)
-- [Generate test data for evaluating RAG](rag_testset_generation.md)
+- [Generate test data for evaluating RAG](rag_testset_generation.md)
+- [Run your first experiment](experiments_quickstart.md)
diff --git a/examples/iterate_prompt/evals.py b/examples/iterate_prompt/evals.py
@@ -10,8 +10,7 @@
 from run_prompt import run_prompt
 
 from ragas import Dataset, experiment
-from ragas.experimental.metrics.discrete import discrete_metric
-from ragas.experimental.metrics.result import MetricResult
+from ragas.metrics import MetricResult, discrete_metric
 
 
 @discrete_metric(name="labels_exact_match", allowed_values=["correct", "incorrect"])
diff --git a/examples/ragas_examples/agent_evals/agent.py b/examples/ragas_examples/agent_evals/agent.py
@@ -303,7 +303,7 @@ def solve(
                 )
 
                 message = response.choices[0].message
-                messages.append(message.dict())
+                messages.append(message.model_dump())
 
                 self.traces.append(
                     TraceEvent(
diff --git a/examples/ragas_examples/agent_evals/evals.py b/examples/ragas_examples/agent_evals/evals.py
@@ -1,4 +1,4 @@
-from agent import get_default_agent
+from .agent import get_default_agent
 
 from ragas import Dataset, experiment
 from ragas.metrics.numeric import numeric_metric
diff --git a/examples/ragas_examples/benchmark_llm/evals.py b/examples/ragas_examples/benchmark_llm/evals.py
@@ -7,7 +7,6 @@
 from typing import List, Optional
 
 import pandas as pd
-from prompt import DEFAULT_MODEL, run_prompt
 
 from ragas import experiment
 from ragas.dataset import Dataset
diff --git a/mkdocs.yml b/mkdocs.yml
@@ -15,6 +15,12 @@ nav:
       - Evaluate your first LLM App: getstarted/evals.md
       - Evaluate a simple RAG: getstarted/rag_eval.md
       - Generate Synthetic Testset for RAG: getstarted/rag_testset_generation.md
+      - Experiments:
+          - Run your first experiment: getstarted/experiments_quickstart.md
+          - Evaluate a prompt: tutorials/prompt.md
+          - Evaluate a simple RAG system: tutorials/rag.md
+          - Evaluate an AI Workflow: tutorials/workflow.md
+          - Evaluate an AI Agent: tutorials/agent.md
   - 📚 Core Concepts:
       - concepts/index.md
       - Components:
@@ -79,17 +85,7 @@ nav:
           - concepts/feedback/index.md
       - Datasets: concepts/datasets.md
       - Experimentation: concepts/experimentation.md
-  - 🧪 Experimental:
-    - Overview: experimental/index.md
-    - Tutorials:
-        - experimental/tutorials/index.md
-        - Prompt: experimental/tutorials/prompt.md
-        - RAG: experimental/tutorials/rag.md
-        - Workflow: experimental/tutorials/workflow.md
-        - Agent: experimental/tutorials/agent.md
-
-    - How-to Guides:
-        - experimental/howtos/index.md
+  
   - 🛠️ How-to Guides:
       - howtos/index.md
       - Customizations:
