formatting

vmarshall · vmarshall · commit e8ad253ad10d · 2025-01-06T20:26:57.000-08:00
diff --git a/fern/customization/speech-configuration.mdx b/fern/customization/speech-configuration.mdx
@@ -6,40 +6,6 @@ slug: customization/speech-configuration
 
 ### Introduction
 
-Conversation Analysis (CA) examines the structure and organization of human interactions, focusing on how participants manage conversations in real-time. We mimic this natural behavior in our API.
-
-Key concepts include:
-
-<AccordionGroup>
-  
-  <Accordion title="Turn-Taking Organization">
-  Conversations are structured into turns, where typically one person speaks at a time. Speakers use Turn Construction Units (TCUs)—such as words, phrases, or clauses—that listeners recognize, allowing them to anticipate when a turn will end and when it's appropriate to speak. Transition Relevance Places (TRPs) are points where a change of speaker can occur. Turn allocation follows specific rules:
-
-  - **Current speaker selects next**: The current speaker designates who speaks next.
-  - **Self-selection**: If not selected, another participant may self-select to speak.
-  - **Continuation**: If no one else speaks, the current speaker may continue.
-
-  Silences are categorized as pauses (within a turn), gaps (between turns), or lapses (when no one speaks).
-  </Accordion>
-  <Accordion title="Sequence Organization">
-  Conversations often involve sequences like adjacency pairs, where an initial utterance (e.g., a question) prompts a related response (e.g., an answer). These pairs can be expanded with pre-sequences (preparing for the main action), insert expansions (occurring between the initial and responsive actions), and post-expansions (following the main action).
-  </Accordion>
-  <Accordion title="Preference Organization">
-  Certain responses are socially preferred. For example, agreements or acceptances are typically delivered promptly and directly, while disagreements or refusals may be delayed or mitigated to maintain social harmony.
-  </Accordion>
-  <Accordion title="Repair Mechanisms">
-  Participants address problems in speaking, hearing, or understanding through repair strategies. Self-repair (the speaker corrects themselves) is generally preferred over other-repair (another person corrects the speaker), helping to maintain conversational flow and mutual understanding.
-  </Accordion>
-  <Accordion title="Action Formation">
-  Speakers perform actions (e.g., questioning, requesting, asserting) through their utterances. Understanding how these actions are constructed and interpreted is central to CA, as it reveals how participants achieve social objectives through conversation.
-  </Accordion>
-  <Accordion title="Adjacency Pair">
-  An adjacency pair is a fundamental unit of conversation consisting of two related utterances. The first part (e.g., a question) typically elicits a specific response (e.g., an answer). These pairs are essential for structuring conversations and ensuring coherence.
-  </Accordion>
-</AccordionGroup>
-
-These foundational structures illustrate how individuals collaboratively produce and interpret talk in interaction, ensuring coherent and meaningful communication.
-
 ### Speech Configuration in VAPI
 
 Speech configuration is a crucial aspect of designing a voice assistant that delivers a seamless and engaging user experience. By customizing the assistant's speech settings, you can optimize its responsiveness, naturalness, and timing during interactions with users.
@@ -50,6 +16,8 @@ These plans ensure that the assistant does not interrupt the customer and also p
 
 Adjusting these parameters helps tailor the assistant's responsiveness to different conversational dynamics.
 
+For more information on the anatomy of conversation and how it relates to speech recognition, see the [Conversational Analysis](/customization/conversational-analysis) guide.
+
 <CardGroup cols={2}>
 
 <Card 
@@ -58,10 +26,9 @@ Adjusting these parameters helps tailor the assistant's responsiveness to differ
   href='#transcriber-settings'
 >
   Specify the provider, language, and model for speech transcription.
-
   <Tip 
     title='API Endpoint'>
-    [rest](https://docs.vapi.ai/api-reference/assistants/create#request.body.speech)
+    [REST](https://docs.vapi.ai/api-reference/assistants/create#request.body.speech)
   </Tip>
 </Card>
 
@@ -74,7 +41,7 @@ Adjusting these parameters helps tailor the assistant's responsiveness to differ
 
  <Tip 
     title='API Endpoint'>
-    [rest](https://docs.vapi.ai/api-reference/assistants/create#request.body.speech)
+    [REST](https://docs.vapi.ai/api-reference/assistants/create#request.body.speech)
   </Tip>
   
  </Card>
@@ -88,7 +55,7 @@ Adjusting these parameters helps tailor the assistant's responsiveness to differ
 
 <Tip 
     title='API Endpoint'>
-    [rest](https://docs.vapi.ai/api-reference/assistants/create#request.body.speech)
+    [REST](https://docs.vapi.ai/api-reference/assistants/create#request.body.speech)
   </Tip>
  </Card>
 
@@ -117,9 +84,28 @@ Adjusting these parameters helps tailor the assistant's responsiveness to differ
 <Tip 
   title='API Endpoint'>
   
-  [rest](https://docs.vapi.ai/api-reference/assistants/create#request.body.speech)
+  [REST](https://docs.vapi.ai/api-reference/assistants/create#request.body.speech)
+
   </Tip>
 
+</Card>
+<Card  
+  title='Best Practices' 
+  icon='solid wand-magic-sparkles' 
+  href='#best-practices'>
+
+  Here are some best practices for configuring speech settings to enhance the conversational experience and optimize user engagement.
+
+
+</Card>
+<Card  
+  title='Custom Endpoints' 
+  icon='solid wand-magic-sparkles' 
+  href='#custom-endpoints'>
+
+The custom endpointing rules in Vapi's speech configuration are particularly useful in several scenarios, such as non-standard speech environments.
+
+
 </Card>
 </CardGroup>
 
@@ -215,15 +201,14 @@ Use transcription-based endpointing, with specific timeouts after punctuation, n
 **Example**: In insurance claims, enabling `smartEndpointingEnabled` helps avoid interruptions while customers think through and formulate responses.
 
 
-### STOP SPEAKING PLAN
+### Stop Speaking Plan
 
 - **Words to Stop Speaking**: Specify the number of words a user must say before the assistant stops talking, preventing interruptions from brief interjections.
 
 - Voice Activity Detection: Set the duration of user speech required to trigger the assistant to stop speaking, minimizing overlaps.
 
 - Pause Before Resuming: Control the delay before the assistant resumes speaking after being interrupted, ensuring a natural conversational flow.
 
-
 The stopSpeakingPlan allows you to configure how the assistant stops speaking, preventing interruptions and ensuring a smooth conversation. Here's an example:
 
 ```json
@@ -250,15 +235,18 @@ This enhanced explanation provides concrete examples and clear descriptions of t
   - **Silence Timeout**: Define the duration the assistant waits during user silence before responding or prompting, balancing responsiveness with user comfort.
   - **Max Duration**: Set limits on interaction lengths to manage session times effectively. This parameter helps prevent overly long interactions that may lead to user fatigue or disengagement.
 
-## BEST PRACTICES
-Best Practices
-
-Adapt to User Style: Configure settings based on conversational dynamics, such as enabling smart endpointing for mid-thought pauses.
-
-Minimize Noise Interference: Adjust parameters to handle noisy environments effectively.
-
-Optimize Conversational Flow: Balance responsiveness and non-intrusiveness by testing different configurations.
-
-Tailor for Use Cases: Customize settings for specific scenarios, such as tech support or healthcare applications.
-
-Iterate and Improve: Continuously test configurations with real users and refine based on feedback.
+### Custom Endpoints
+- **Complex Conversations**: In situations where users might pause mid-thought or have varying speech patterns, the `BothCustomEndpointingRule` can help create a more natural flow. This is especially valuable in customer service or healthcare applications where conversations can be nuanced and unpredictable.
+- **Technical Discussions**: For calls involving technical details or numbers, the `TranscriptionEndpointingPlan`'s `onNumberSeconds` parameter can be adjusted to allow more time after number sequences. This is useful in financial services, tech support, or any scenario where numerical information is frequently exchanged.
+- **Multilingual Support**: The `AssistantCustomEndpointingRule` can be tailored to account for different speech patterns and pauses typical in various languages, improving the assistant's responsiveness in multilingual environments.
+- **Emotional or Sensitive Conversations**: In counseling or mental health applications, the `CustomerCustomEndpointingRule` can be fine-tuned to allow for longer pauses, giving users more time to process and respond without interruption.
+- **High-Noise Environments**: For calls from locations with significant background noise, like factories or busy streets, these rules can be adjusted to better distinguish between speech and ambient sounds, improving the overall conversation quality.
+- **Elderly or Speech-Impaired Users**: The endpointing rules can be customized to accommodate slower speech patterns or frequent pauses, ensuring the assistant doesn't interrupt prematurely.
+
+
+### Best Practices
+  - **Adapt to User Style**: Configure settings based on conversational dynamics, such as enabling smart endpointing for mid-thought pauses.
+  - **Minimize Noise Interference**: Adjust parameters to handle noisy environments effectively.
+  - **Optimize Conversational Flow**: Balance responsiveness and non-intrusiveness by testing different configurations.
+  - **Tailor for Use Cases**: Customize settings for specific scenarios, such as tech support or healthcare applications.
+  - **Iterate and Improve**: Continuously test configurations with real users and refine based on feedback.