docs: BYOK trieve

rumbleFTW · rumbleFTW · commit e486bda1448c · 2025-01-24T22:34:34.000+05:30
diff --git a/fern/customization/bring-your-own-vectors/trieve.mdx b/fern/customization/bring-your-own-vectors/trieve.mdx
@@ -1,62 +1,168 @@
 ---
-title: Bring your own chunks/vectors from Trieve
-subtitle: Use existing chunks/vectors from [Trieve](https://trieve.ai)
+title: Using Trieve with Vapi
+subtitle: Leverage Trieve's powerful vector database capabilities with Vapi
 slug: customization/bring-your-own-vectors/trieve
 ---
 
-Vapi supports Trieve as a knowledgebase provider, allowing you to leverage your existing document embeddings and chunks. While Vapi maintains its own storage of documents and vectors, you can seamlessly integrate with your Trieve datasets.
+# Using Trieve with Vapi
 
-## Use Cases
+Vapi offers two ways to integrate with [Trieve](https://trieve.ai):
+
+1. **Direct Integration**: Create and manage Trieve datasets directly through Vapi
+2. **BYOK (Bring Your Own Knowledge)**: Import your existing Trieve datasets into Vapi
+
+## Direct Integration with Trieve
+
+When you use Trieve directly through Vapi, you can create and manage datasets, but they are tied to Vapi's account. This approach has the following characteristics:
+
+- Quick setup with minimal configuration
+- Basic dataset management through Vapi's API
+- Limited customization options
+
+### Setting up Direct Integration
+
+1. Navigate to the [Vapi dashboard credentials page](https://dashboard.vapi.ai/keys)
+2. Add your Trieve API key from [Trieve's dashboard](https://dashboard.trieve.ai/org/keys)
+3. Create a new knowledge base with Trieve as the provider:
+
+```json
+{
+  "name": "my-trieve-kb",
+  "provider": "trieve",
+  "searchPlan": {
+    "scoreThreshold": 0.2,
+    "searchType": "semantic"
+  },
+  "createPlan": {
+    "type": "create",
+    "chunkPlans": [
+      {
+        "fileIds": ["file-123", "file-456"],
+        "websites": ["https://example.com"],
+        "targetSplitsPerChunk": 50,
+        "rebalanceChunks": true
+      }
+    ]
+  }
+}
+```
+
+## BYOK with Trieve (Recommended)
+
+The BYOK approach offers more flexibility and control over your datasets. You can:
+
+- Fully manage your datasets in Trieve's native interface
+- Use Trieve's advanced features like:
+  - Custom chunking rules
+  - Search playground testing
+  - Manual chunk editing
+  - Website crawling
+  - Dataset visualization
+
+### Step 1: Set Up Trieve Dataset
+
+1. Create an account at [Trieve](https://trieve.ai)
+2. Create a new dataset using Trieve's dashboard
+3. Add content using one or both of the methods below:
+
+#### Document Upload
+
+- Upload documents directly through Trieve's interface
+- Supported formats: PDF, DOCX, TXT, MD
+- Configure chunking parameters:
+  - Chunk size
+  - Overlap
+  - Split delimiters
+
+#### Website Crawling
+
+Trieve can also crawl a website and add its pages to your dataset. An example crawl configuration:
+
+```json
+{
+  "url": "https://yourdomain.com",
+  "configuration": {
+    "maxPages": 100,
+    "allowedDomains": ["yourdomain.com"],
+    "excludePatterns": ["/admin/*", "/login"],
+    "includePatterns": ["/docs/*", "/blog/*"]
+  }
+}
+```
+
+### Step 2: Test and Refine
+
+Use Trieve's search playground to:
+
+- Test semantic search queries
+- Adjust chunk sizes
+- Edit chunks manually
+- Visualize vector embeddings
+- Fine-tune relevance scores
+
+### Step 3: Import to Vapi
+
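+Once your dataset is optimized in Trieve, import it into Vapi by creating a knowledge base whose `createPlan` has `type` set to `import` and `providerId` set to your Trieve dataset ID: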
+
+```json
+{
+  "name": "trieve-byok",
+  "provider": "trieve",
+  "searchPlan": {
+    "scoreThreshold": 0.2,
+    "searchType": "semantic"
+  },
+  "createPlan": {
+    "type": "import",
+    "providerId": "<Your Trieve Dataset ID>"
+  }
+}
+```
 
-### Existing Knowledge Base Migration
+## Best Practices
 
-If you've already invested time in building and organizing your knowledge base in Trieve, you can continue using those vectors without having to reprocess your documents. This is particularly useful for:
+1. **Dataset Organization**
 
-- Large document collections that took significant time to process
-- Carefully curated and cleaned datasets
-- Custom-chunked documents with specific segmentation rules
+   - Keep datasets focused on specific topics
+   - Use meaningful dataset names
+   - Document your chunking configurations
 
-### Parallel Systems
+2. **Content Quality**
 
-You might want to use both Trieve's native interface and Vapi simultaneously:
+   - Clean and preprocess documents before uploading
+   - Review and edit chunks in Trieve's interface
+   - Test search relevance before importing to Vapi
 
-- Use Trieve's UI for content management and organization
-- Leverage Vapi's chat interface and API capabilities
-- Maintain consistency across both platforms
+3. **Performance Optimization**
 
-## Integration Steps
+   - Monitor chunk sizes (recommended: 200-1000 tokens)
+   - Use appropriate search types for your use case
+   - Adjust score thresholds based on testing
 
-1. **Configure Trieve Credentials**
+4. **Maintenance**
+   - Regularly update content in Trieve
+   - Monitor search performance
+   - Keep API keys secure and up to date
 
-   - Navigate to the credentials page in your [Vapi dashboard](https://dashboard.vapi.ai/keys)
-   - Add your Trieve API key for authentication from [Trieve](https://dashboard.trieve.ai/org/keys)
+## Troubleshooting
 
-2. **Create a New Knowledge Base**
+Common issues and solutions:
 
-   - When setting up a new knowledge base, provide:
-     - Your Trieve datasetId as the providerId.
-     - Appropriate search configuration parameters.
-     - Vapi will then automatically use your Trieve dataset as the knowledge base.
+1. **Poor Search Results**
 
-   Example configuration:
+   - Adjust the `scoreThreshold` in your `searchPlan`
+   - Try different search types (semantic, hybrid, BM25)
+   - Review chunk sizes and content quality
 
-   ```json
-   {
-     "name": "trieve-byok",
-     "provider": "trieve",
-     "searchPlan": {
-       "scoreThreshold": 0.2,
-       "searchType": "semantic"
-     },
-     "createPlan": {
-       "type": "import",
-       "providerId": "<Your datasetId from Trieve>"
-     }
-   }
-   ```
+2. **Integration Issues**
 
-## Best Practices
+   - Verify API keys are correct
+   - Ensure dataset IDs are valid
+   - Check network connectivity
+
+3. **Performance Problems**
+   - Reduce chunk sizes
+   - Tune `searchPlan` settings such as `searchType` and `scoreThreshold`
+   - Consider splitting large datasets
 
-- Ensure your Trieve API key has appropriate permissions
-- Keep track of which datasetIds correspond to which knowledge bases
-- Monitor vector synchronization to ensure consistency
+Need help? Contact [support@vapi.ai](mailto:support@vapi.ai) for assistance.