
Commit e486bda

docs: BYOK trieve
1 parent 4a794e0 commit e486bda

File tree

1 file changed

+148
-42
lines changed
  • fern/customization/bring-your-own-vectors

@@ -1,62 +1,168 @@
11
---
2-
title: Bring your own chunks/vectors from Trieve
3-
subtitle: Use existing chunks/vectors from [Trieve](https://trieve.ai)
2+
title: Using Trieve with Vapi
3+
subtitle: Leverage Trieve's powerful vector database capabilities with Vapi
44
slug: customization/bring-your-own-vectors/trieve
55
---
66

7-
Vapi supports Trieve as a knowledgebase provider, allowing you to leverage your existing document embeddings and chunks. While Vapi maintains its own storage of documents and vectors, you can seamlessly integrate with your Trieve datasets.
7+
# Using Trieve with Vapi
88

9-
## Use Cases
9+
Vapi offers two ways to integrate with [Trieve](https://trieve.ai):
10+
11+
1. **Direct Integration**: Create and manage Trieve datasets directly through Vapi
12+
2. **BYOK (Bring Your Own Knowledge)**: Import your existing Trieve datasets into Vapi
13+
14+
## Direct Integration with Trieve
15+
16+
When you use Trieve directly through Vapi, you can create and manage datasets, but they are tied to Vapi's account rather than your own. This approach offers:
17+
18+
- Quick setup with minimal configuration
19+
- Basic dataset management through Vapi's API
20+
- Limited customization options
21+
22+
### Setting up Direct Integration
23+
24+
1. Navigate to the [Vapi dashboard credentials page](https://dashboard.vapi.ai/keys)
25+
2. Add your Trieve API key from [Trieve's dashboard](https://dashboard.trieve.ai/org/keys)
26+
3. Create a new knowledge base with Trieve as the provider:
27+
28+
```json
29+
{
30+
  "name": "my-trieve-kb",
31+
  "provider": "trieve",
32+
  "searchPlan": {
33+
    "scoreThreshold": 0.2,
34+
    "searchType": "semantic"
35+
  },
36+
  "createPlan": {
37+
    "type": "create",
38+
    "chunkPlans": [
39+
      {
40+
        "fileIds": ["file-123", "file-456"],
41+
        "websites": ["https://example.com"],
42+
        "targetSplitsPerChunk": 50,
43+
        "rebalanceChunks": true
44+
      }
45+
    ]
46+
  }
47+
}
48+
```
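As a rough sketch, the same request body could be assembled in Python before it is sent. The helper name `build_create_payload` and the assumption that `scoreThreshold` lies between 0 and 1 are illustrative and not part of Vapi's documented API; sending the request is left out because this page does not give the endpoint.

```python
import json


def build_create_payload(name, file_ids, websites, score_threshold=0.2,
                         search_type="semantic"):
    """Assemble a body shaped like the JSON example above (hypothetical helper)."""
    # Assumption: thresholds are similarity scores in [0, 1].
    if not 0.0 <= score_threshold <= 1.0:
        raise ValueError("score_threshold is assumed to lie in [0, 1]")
    return {
        "name": name,
        "provider": "trieve",
        "searchPlan": {
            "scoreThreshold": score_threshold,
            "searchType": search_type,
        },
        "createPlan": {
            "type": "create",
            "chunkPlans": [
                {
                    "fileIds": list(file_ids),
                    "websites": list(websites),
                    # Values copied from the example above.
                    "targetSplitsPerChunk": 50,
                    "rebalanceChunks": True,
                }
            ],
        },
    }


payload = build_create_payload(
    "my-trieve-kb", ["file-123", "file-456"], ["https://example.com"]
)
print(json.dumps(payload, indent=2))
```

Building the body in code rather than by hand makes it easier to reuse the same search settings across several knowledge bases.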
49+
50+
## BYOK with Trieve (Recommended)
51+
52+
The BYOK approach offers more flexibility and control over your datasets. You can:
53+
54+
- Fully manage your datasets in Trieve's native interface
55+
- Use Trieve's advanced features like:
56+
  - Custom chunking rules
57+
  - Search playground testing
58+
  - Manual chunk editing
59+
  - Website crawling
60+
  - Dataset visualization
61+
62+
### Step 1: Set Up Trieve Dataset
63+
64+
1. Create an account at [Trieve](https://trieve.ai)
65+
2. Create a new dataset using Trieve's dashboard
66+
3. Add content through various methods:
67+
68+
#### Document Upload
69+
70+
- Upload documents directly through Trieve's interface
71+
- Supported formats: PDF, DOCX, TXT, MD
72+
- Configure chunking parameters:
73+
  - Chunk size
74+
  - Overlap
75+
  - Split delimiters
76+
77+
#### Website Crawling
78+
79+
Trieve can crawl a website and index its pages. An example crawl configuration:
80+
81+
```json
82+
{
83+
  "url": "https://yourdomain.com",
84+
  "configuration": {
85+
    "maxPages": 100,
86+
    "allowedDomains": ["yourdomain.com"],
87+
    "excludePatterns": ["/admin/*", "/login"],
88+
    "includePatterns": ["/docs/*", "/blog/*"]
89+
  }
90+
}
91+
```
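To make the include and exclude patterns concrete, here is a minimal sketch of how such a configuration might decide whether a URL gets crawled. It assumes glob-style matching on the URL path and that exclusions take precedence over inclusions; Trieve's actual matching rules may differ, and `should_crawl` is a hypothetical helper.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlparse

# Mirrors the "configuration" object from the example above.
CRAWL_CONFIG = {
    "allowedDomains": ["yourdomain.com"],
    "excludePatterns": ["/admin/*", "/login"],
    "includePatterns": ["/docs/*", "/blog/*"],
}


def should_crawl(url, config):
    """Return True if the URL passes the domain, exclude, and include checks."""
    parsed = urlparse(url)
    if parsed.hostname not in config["allowedDomains"]:
        return False
    path = parsed.path or "/"
    # Assumption: an exclude match always wins over an include match.
    if any(fnmatchcase(path, pattern) for pattern in config["excludePatterns"]):
        return False
    return any(fnmatchcase(path, pattern) for pattern in config["includePatterns"])


for candidate in (
    "https://yourdomain.com/docs/intro",
    "https://yourdomain.com/admin/users",
    "https://other.com/docs/intro",
):
    print(candidate, should_crawl(candidate, CRAWL_CONFIG))
```

Running a handful of representative URLs through a check like this before starting a crawl is a cheap way to catch patterns that are too broad or too narrow.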
92+
93+
### Step 2: Test and Refine
94+
95+
Use Trieve's search playground to:
96+
97+
- Test semantic search queries
98+
- Adjust chunk sizes
99+
- Edit chunks manually
100+
- Visualize vector embeddings
101+
- Fine-tune relevance scores
102+
103+
### Step 3: Import to Vapi
104+
105+
Once your dataset is optimized in Trieve, import it to Vapi:
106+
107+
```json
108+
{
109+
  "name": "trieve-byok",
110+
  "provider": "trieve",
111+
  "searchPlan": {
112+
    "scoreThreshold": 0.2,
113+
    "searchType": "semantic"
114+
  },
115+
  "createPlan": {
116+
    "type": "import",
117+
    "providerId": "<Your Trieve Dataset ID>"
118+
  }
119+
}
120+
```
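A common slip with the import body is leaving the placeholder dataset ID in place. The sketch below builds the same body and rejects an empty or placeholder ID; `build_import_payload` is a hypothetical helper, and the check for a leading `<` is only a heuristic for spotting the placeholder.

```python
def build_import_payload(name, dataset_id, score_threshold=0.2,
                         search_type="semantic"):
    """Build an import body shaped like the JSON example above (hypothetical helper)."""
    dataset_id = dataset_id.strip()
    # Heuristic: the documentation placeholder starts with "<".
    if not dataset_id or dataset_id.startswith("<"):
        raise ValueError("providerId must be a real Trieve dataset ID")
    return {
        "name": name,
        "provider": "trieve",
        "searchPlan": {
            "scoreThreshold": score_threshold,
            "searchType": search_type,
        },
        "createPlan": {
            "type": "import",
            "providerId": dataset_id,
        },
    }


# "ds-123" is a made-up ID used only for illustration.
payload = build_import_payload("trieve-byok", " ds-123 ")
print(payload["createPlan"])
```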
10121

11-
### Existing Knowledge Base Migration
122+
## Best Practices
12123

13-
If you've already invested time in building and organizing your knowledge base in Trieve, you can continue using those vectors without having to reprocess your documents. This is particularly useful for:
124+
1. **Dataset Organization**
14125

15-
- Large document collections that took significant time to process
16-
- Carefully curated and cleaned datasets
17-
- Custom-chunked documents with specific segmentation rules
126+
- Keep datasets focused on specific topics
127+
- Use meaningful dataset names
128+
- Document your chunking configurations
18129

19-
### Parallel Systems
130+
2. **Content Quality**
20131

21-
You might want to use both Trieve's native interface and Vapi simultaneously:
132+
- Clean and preprocess documents before uploading
133+
- Review and edit chunks in Trieve's interface
134+
- Test search relevance before importing to Vapi
22135

23-
- Use Trieve's UI for content management and organization
24-
- Leverage Vapi's chat interface and API capabilities
25-
- Maintain consistency across both platforms
136+
3. **Performance Optimization**
26137

27-
## Integration Steps
138+
- Monitor chunk sizes (recommended: 200-1000 tokens)
139+
- Use appropriate search types for your use case
140+
- Adjust score thresholds based on testing
28141

29-
1. **Configure Trieve Credentials**
142+
4. **Maintenance**
143+
- Regularly update content in Trieve
144+
- Monitor search performance
145+
- Keep API keys secure and updated
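The chunk-size guidance above (200-1000 tokens) can be checked with a short script before content is uploaded. This is a sketch only: it approximates tokens by whitespace-separated words, which will not match the tokenizer Trieve or the embedding model actually uses, and `chunk_size_report` is a hypothetical helper.

```python
def chunk_size_report(chunks, min_tokens=200, max_tokens=1000):
    """Group chunk indices by whether their approximate size is in range."""
    report = {"too_small": [], "ok": [], "too_large": []}
    for index, text in enumerate(chunks):
        # Assumption: one whitespace-separated word is roughly one token.
        approx_tokens = len(text.split())
        if approx_tokens < min_tokens:
            report["too_small"].append(index)
        elif approx_tokens > max_tokens:
            report["too_large"].append(index)
        else:
            report["ok"].append(index)
    return report


# Synthetic chunks of 50, 500, and 1500 words, for illustration only.
sample_chunks = ["word " * 50, "word " * 500, "word " * 1500]
print(chunk_size_report(sample_chunks))
```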
30146

31-
- Navigate to the credentials page in your [Vapi dashboard](https://dashboard.vapi.ai/keys)
32-
- Add your Trieve API key for authentication from [Trieve](https://dashboard.trieve.ai/org/keys)
147+
## Troubleshooting
33148

34-
2. **Create a New Knowledge Base**
149+
Common issues and solutions:
35150

36-
- When setting up a new knowledge base, provide:
37-
  - Your Trieve datasetId as the providerId.
38-
  - Appropriate search configuration parameters.
39-
- Vapi will then automatically use your Trieve dataset as the knowledge base.
151+
1. **Poor Search Results**
40152

41-
Example configuration:
153+
- Adjust score threshold
154+
- Try different search types (semantic, hybrid, BM25)
155+
- Review chunk sizes and content quality
42156

43-
```json
44-
{
45-
  "name": "trieve-byok",
46-
  "provider": "trieve",
47-
  "searchPlan": {
48-
    "scoreThreshold": 0.2,
49-
    "searchType": "semantic"
50-
  },
51-
  "createPlan": {
52-
    "type": "import",
53-
    "providerId": "<Your datasetId from Trieve>"
54-
  }
55-
}
56-
```
157+
2. **Integration Issues**
57158

58-
## Best Practices
159+
- Verify API keys are correct
160+
- Ensure dataset IDs are valid
161+
- Check network connectivity
162+
163+
3. **Performance Problems**
164+
- Reduce chunk sizes
165+
- Optimize search configurations
166+
- Consider splitting large datasets
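For the "adjust score threshold" advice, one way to choose a value is to sweep a few candidates over search results that have been labeled by hand. The sketch below assumes that higher scores mean closer matches and that results below the threshold are dropped; the sample scores and labels are made up for illustration, and `sweep_thresholds` is a hypothetical helper.

```python
def sweep_thresholds(results, thresholds):
    """For each threshold, return (threshold, kept, precision, recall).

    `results` is a list of (score, is_relevant) pairs labeled by hand.
    """
    total_relevant = sum(1 for _, relevant in results if relevant)
    rows = []
    for threshold in thresholds:
        kept = [(score, relevant) for score, relevant in results if score >= threshold]
        hits = sum(1 for _, relevant in kept if relevant)
        precision = hits / len(kept) if kept else 0.0
        recall = hits / total_relevant if total_relevant else 0.0
        rows.append((threshold, len(kept), precision, recall))
    return rows


# Made-up labeled results: (similarity score, was the chunk actually relevant?)
labeled = [(0.9, True), (0.6, True), (0.3, False), (0.15, True), (0.1, False)]
for row in sweep_thresholds(labeled, [0.1, 0.2, 0.5]):
    print(row)
```

A higher threshold generally trades recall for precision, so the right value depends on whether missing a relevant chunk or returning an irrelevant one is the costlier error for your assistant.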
59167

60-
- Ensure your Trieve API key has appropriate permissions
61-
- Keep track of which datasetIds correspond to which knowledge bases
62-
- Monitor vector synchronization to ensure consistency
168+
Need help? Contact Vapi support for assistance.
