voice and chat testing (#280)

abizzaar · web-flow · commit d934ec8fed40 · 2025-03-27T21:50:54.000-07:00
diff --git a/fern/docs.yml b/fern/docs.yml
@@ -349,6 +349,10 @@ navigation:
         contents:
           - page: Test Suites
             path: test/test-suites.mdx
+          - page: Chat Testing
+            path: test/chat-testing.mdx
+          - page: Voice Testing
+            path: test/voice-testing.mdx
 
       - section: Deploy
         collapsed: true
diff --git a/fern/test/chat-testing.mdx b/fern/test/chat-testing.mdx
@@ -0,0 +1,49 @@
+---
+title: Chat Testing
+subtitle: Automated text-based testing for AI agents
+slug: /test/chat-testing
+---
+
+## Overview
+
+Chat Test Suites let you evaluate your AI agents through simulated text conversations. This is our recommended approach to testing: it is much faster than voice testing and lets you test your agent's behavior in isolation.
+
+## How Chat Testing Works
+
+1. **Simulation:** Our AI tester engages with your agent in a text-based conversation.
+2. **Scripted Interaction:** The testing agent follows your predefined script to simulate specific customer scenarios.
+3. **Transcript Capture:** The conversation is captured as a transcript.
+4. **Evaluation:** A large language model (LLM) assesses the transcript against your success criteria.
+
+## Designing Your Tests
+
+Good test design is critical to evaluating your agent. You'll want to consider testing:
+
+1. Tool calls. Write a script that leads the agent to schedule an appointment or call a transfer tool. At the evaluation step, the tool call history is available as context, so your rubric can check whether the expected tools were called.
+2. Knowledge base integrations. Ask a range of questions and verify that your agent answers as expected.
+3. Legal and compliance issues. Ask the agent questions it is not supposed to answer, and verify that it refuses.
+4. Personality. Simulate an angry, frustrated, or manipulative customer, and make sure your agent handles the situation well.
+
+## Benefits of Chat Testing
+
+- **Speed:** Chat tests execute faster than voice tests, allowing for rapid iteration.
+- **Cost-Effective:** No text-to-speech (TTS) or speech-to-text (STT) models are used during chat testing.
+- **Focused Assessment:** Evaluate pure conversational ability without audio-related variables.
+- **Higher Test Volume:** Run more tests in less time to ensure comprehensive coverage.
+
+## Creating Chat Tests
+
+You can create chat tests as part of a Test Suite:
+
+1. Navigate to the **Test** tab and select **Test Suites**.
+2. Create a new Test Suite or edit an existing one.
+3. When adding tests, select **Chat** as the test type.
+4. Define your script and success criteria as detailed in the [Test Suites](./test-suites) documentation.
+
+## Best Practices for Chat Testing
+
+- Use chat tests for rapid iteration during development.
+- Create variations of the same scenario to test different user inputs.
+- Test edge cases and potential misunderstandings.
+
+For comprehensive instructions on creating and managing test suites that include chat tests, refer to the [Test Suites](./test-suites) documentation.
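
The four steps under "How Chat Testing Works" can be pictured as a small loop. The sketch below is illustrative only: `ChatTest`, `Transcript`, `run_chat_test`, `stub_agent`, and `stub_evaluator` are hypothetical names, not part of any platform API. It also simplifies two things: the tester sends fixed messages rather than improvising from a script, and a keyword check stands in for the LLM evaluation.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


# Every name in this block is hypothetical and exists only for illustration.
@dataclass
class ChatTest:
    script: List[str]  # messages the tester sends, in order
    rubric: str        # success criteria, written in plain language


@dataclass
class Transcript:
    # Each turn is a (speaker, text) pair.
    turns: List[Tuple[str, str]] = field(default_factory=list)


def run_chat_test(
    test: ChatTest,
    agent: Callable[[str], str],
    evaluate: Callable[[Transcript, str], bool],
) -> Tuple[bool, Transcript]:
    """Simulate the conversation, capture a transcript, then evaluate it."""
    transcript = Transcript()
    for message in test.script:
        transcript.turns.append(("tester", message))
        transcript.turns.append(("agent", agent(message)))
    return evaluate(transcript, test.rubric), transcript


def stub_agent(message: str) -> str:
    # Stand-in for the agent under test.
    if "appointment" in message.lower():
        return "Sure, I can schedule that appointment for you."
    return "Could you tell me more?"


def stub_evaluator(transcript: Transcript, rubric: str) -> bool:
    # Stand-in for the LLM evaluation step; it ignores the rubric text and
    # passes when any agent turn mentions scheduling.
    return any(
        speaker == "agent" and "schedule" in text.lower()
        for speaker, text in transcript.turns
    )


test = ChatTest(
    script=["Hi, I need an appointment."],
    rubric="The agent offers to schedule an appointment.",
)
passed, transcript = run_chat_test(test, stub_agent, stub_evaluator)
```

Keeping the agent and the evaluator as plain callables makes it easy to swap in variations of the same scenario, which is the practice recommended under "Best Practices for Chat Testing".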
diff --git a/fern/test/voice-testing.mdx b/fern/test/voice-testing.mdx
@@ -0,0 +1,39 @@
+---
+title: Voice Testing
+subtitle: Automated voice call testing for AI voice agents
+slug: /test/voice-testing
+---
+
+## Overview
+
+Voice Test Suites let you test your AI voice agents through simulated phone conversations. Our platform connects two AI agents (your voice agent and our testing agent) on a real phone call and follows your predefined scripts to evaluate performance across scenarios.
+
+## How Voice Testing Works
+
+1. **Simulation:** Our AI tester calls your voice agent and follows a script that simulates real customer behavior.
+2. **Conversation:** Both AIs engage in a natural voice conversation, with the tester following your script guidelines.
+3. **Recording:** The entire call is recorded and transcribed for evaluation.
+4. **Assessment:** After the call, a large language model (LLM) evaluates the transcript against your rubric.
+
+## Benefits of Voice Testing
+
+- **Natural Interaction:** Test your voice agent in the most realistic scenario: actual phone calls.
+- **Audio Quality Assessment:** Evaluate not just responses but also voice clarity, tone, and cadence.
+- **End-to-End Verification:** Confirm that your entire voice pipeline works correctly from telephony to response.
+
+## Creating Voice Tests
+
+You can create voice tests as part of a Test Suite:
+
+1. Navigate to the **Test** tab and select **Test Suites**.
+2. Create a new Test Suite or edit an existing one.
+3. When adding tests, select **Voice** as the test type.
+4. Define your script and success criteria as detailed in the [Test Suites](./test-suites) documentation.
+
+## Voice Test Limitations
+
+- Voice tests take longer to run than chat tests.
+- Each test consumes calling minutes from your account.
+- Call duration is limited to 15 minutes per test.
+
+For detailed instructions on creating and managing test suites that include voice tests, see the [Test Suites](./test-suites) documentation.
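
Because each voice test consumes calling minutes and is capped at 15 minutes, the worst-case cost of a suite is easy to bound. The sketch below is a back-of-the-envelope estimate, not a billing formula: `worst_case_minutes` and `runs_per_test` are hypothetical names, and it assumes a call never consumes more minutes than its capped duration.

```python
MAX_CALL_MINUTES = 15  # per-test cap stated under "Voice Test Limitations"


def worst_case_minutes(num_voice_tests: int, runs_per_test: int = 1) -> int:
    """Upper bound on calling minutes if every call runs to the cap.

    Hypothetical helper for planning; it is not part of any platform API.
    """
    if num_voice_tests < 0 or runs_per_test < 0:
        raise ValueError("counts must be non-negative")
    return num_voice_tests * runs_per_test * MAX_CALL_MINUTES


# A suite of 8 voice tests, each run once, uses at most 8 * 15 = 120 minutes.
suite_budget = worst_case_minutes(8)
print(suite_budget)  # prints 120
```

Most calls will likely end well before the cap, so the actual usage should be lower than this bound.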