invariantlabs-ai
diff --git a/docs/testing/Writing_Tests/Matchers.md b/docs/testing/Writing_Tests/Matchers.md
Lines changed: 4 additions & 2 deletions
diff --git a/docs/testing/Writing_Tests/parameterized-tests.md b/docs/testing/Writing_Tests/parameterized-tests.md
Lines changed: 47 additions & 0 deletions
diff --git a/docs/testing/Writing_Tests/2_Tests.md b/docs/testing/Writing_Tests/tests.md
docs/testing/Writing_Tests/2_Tests.md renamed to docs/testing/Writing_Tests/tests.md
diff --git a/docs/testing/Writing_Tests/1_Traces.ipynb b/docs/testing/Writing_Tests/traces.ipynb
docs/testing/Writing_Tests/1_Traces.ipynb renamed to docs/testing/Writing_Tests/traces.ipynb
diff --git a/docs/testing/assets/parameterized_tests.png b/docs/testing/assets/parameterized_tests.png
108 KB
diff --git a/docs/testing/index.md b/docs/testing/index.md
Lines changed: 2 additions & 2 deletions
@@ -1,11 +1,13 @@
 # Matchers
 
-<div class='subtitle'>Use matchers for fuzzy and LLM-based checks</div>
+<div class='subtitle'>Test with custom checkers and LLM-based evaluation</div>
 
-Not all agentic behavior can be specified with precise, traditional checking methods. Instead, more often than not, we expect AI models to generalize and thus respond slightly differently to different inputs.
+Not all agentic behavior can be specified with precise, traditional checking methods. Instead, more often than not, we expect AI models to generalize and thus respond slightly differently every time we invoke them.
 
 To accommodate this, `testing` includes several different `Matcher` implementations, that allow you to write tests that rely on fuzzy, similarity-based or property-based conditions.
 
+Beyond that, `Matcher` is also a simple base class that lets you write your own custom matchers when the provided ones are not sufficient for your needs (e.g. to check custom properties).
+
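To illustrate the idea of a custom matcher, here is a minimal, self-contained sketch. It does not import the real library: the `Matcher` class below is a stand-in, and the assumption that subclasses override a `matches(actual_value)` method should be checked against the actual base class. `MentionsTemperature` is a hypothetical example name.

```python
import re
from typing import Any


class Matcher:
    """Stand-in for the library's `Matcher` base class (assumed interface)."""

    def matches(self, actual_value: Any) -> bool:
        raise NotImplementedError


class MentionsTemperature(Matcher):
    """Hypothetical matcher: passes if the text states a temperature in range."""

    def __init__(self, low: float, high: float):
        self.low = low
        self.high = high

    def matches(self, actual_value: Any) -> bool:
        # look for the first number followed by a degree sign, e.g. "75°F"
        found = re.search(r"(-?\d+(?:\.\d+)?)\s*°", str(actual_value))
        if found is None:
            return False
        return self.low <= float(found.group(1)) <= self.high


matcher = MentionsTemperature(low=-50, high=130)
ok = matcher.matches("The weather in Paris is 75°F and sunny.")
missing = matcher.matches("I cannot check the weather right now.")
print(ok, missing)
```

Keeping the check inside a single `matches` method means the property is defined once and can be reused across tests.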
 ## `IsSimilar`
 
 TODO
 
@@ -0,0 +1,47 @@
+# Parameterized Tests
+
+<div class='subtitle'>Use parameterized tests to test multiple scenarios</div>
+
+In some cases, a certain agent functionality should generalize to multiple scenarios. For example, a weather agent should be able to answer questions about the weather in different cities.
+
+In `testing`, instead of writing a separate test for each city, you can use parameterized tests to cover multiple scenarios with a single test function. This helps you verify that your agent's behavior is robust and generalizes across inputs.
+
+```python
+from invariant.testing import Trace, assert_equals
+import pytest
+
+@pytest.mark.parametrize(
+    ("city",),
+    [
+        ("Paris",),
+        ("London",),
+        ("New York",),
+    ]
+)
+def test_check_weather_in(city: str):
+    # create a Trace object from your agent trajectory
+    trace = Trace(
+        trace=[
+            {"role": "user", "content": f"What is the weather like in {city}?"},
+            {"role": "agent", "content": f"The weather in {city} is 75°F and sunny."},
+        ]
+    )
+
+    # make assertions about the agent's behavior
+    with trace.as_context():
+        # extract the locations mentioned in the agent's response
+        locations = trace.messages()[-1]["content"].extract("locations")
+
+        # assert that the agent responded about the given city
+        assert_equals(
+            1, len(locations), "The agent should respond about one location only"
+        )
+
+        assert_equals(city, locations[0], "The agent should respond about " + city)
+```
+
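Conceptually, parameterization runs the same test body once per parameter tuple and reports each run as its own test instance. The following plain-Python sketch mimics that expansion without depending on `pytest` or `invariant`; `fake_agent` and `check_weather_in` are hypothetical stand-ins, not part of the library.

```python
def fake_agent(city: str) -> str:
    # hypothetical stand-in for a real agent call
    return f"The weather in {city} is 75°F and sunny."


def check_weather_in(city: str) -> None:
    # the shared test body: the response must mention the requested city
    response = fake_agent(city)
    assert city in response, f"The agent should respond about {city}"


# one tuple per scenario, mirroring the parametrize argument list
CASES = [("Paris",), ("London",), ("New York",)]

results = {}
for (city,) in CASES:
    try:
        check_weather_in(city)
        results[city] = "passed"
    except AssertionError as error:
        results[city] = f"failed: {error}"

# each scenario is reported as a separate instance
for city, outcome in results.items():
    print(f"test_check_weather_in[{city}]: {outcome}")
```

Because each scenario is recorded separately, a failure for one city does not hide the outcome for the others.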
+### Visualization
+
+When pushing the parameterized test results to Explorer (`invariant test --push`), the resulting test instances will be listed separately:
+
+<img src="../../assets/parameterized_tests.png"/>
@@ -73,7 +73,7 @@ ________________________________________________________________________________
 #     },
 #  ]
 ```
-The test result provides information about which assertion failed but also [localizes the assertion failure precisely](Writing_Tests/1_Traces.ipynb) in the provided list of agent messages.
+The test result provides information about which assertion failed but also [localizes the assertion failure precisely](./Writing_Tests/tests.md) in the provided list of agent messages.
 
 **Visual Test Viewer (Explorer):**
 
@@ -92,7 +92,7 @@ Like the terminal output, the Explorer highlights the relevant ranges, but does
 * Comprehensive [`Trace` API](Writing_Tests/1_Traces.ipynb) for easily navigating and checking agent traces.
 * [Assertions library](Writing_Tests/2_Assertions.md) to check agent behavior, including fuzzy checkers such as _Levenshtein distance_, _semantic similarity_ and _LLM-as-a-judge_ pipelines.
 * Full [`pytest` compatibility](Running_Tests/PyTest_Compatibility.md) for easy integration with existing test and CI/CD pipelines.
-* Parameterized tests for [testing multiple scenarios](Writing_Tests/3_Parameterized_Tests.md) with a single test function.
+* Parameterized tests for [testing multiple scenarios](Writing_Tests/parameterized-tests) with a single test function.
 * [Visual test viewer](Writing_Tests/4_Visual_Test_Viewer.md) for exploring large traces and debugging test failures.
 
 ## Next Steps