Commit d35065c

Feature: baidu vector db integration (mem0ai#2929)
1 parent cdee6a4 commit d35065c

7 files changed: +683 -2 lines changed

Makefile

Lines changed: 1 addition & 1 deletion
@@ -13,7 +13,7 @@ install:
 install_all:
 	pip install ruff==0.6.9 groq together boto3 litellm ollama chromadb weaviate weaviate-client sentence_transformers vertexai \
 	            google-generativeai elasticsearch opensearch-py vecs "pinecone<7.0.0" pinecone-text faiss-cpu langchain-community \
-	            upstash-vector azure-search-documents langchain-memgraph langchain-neo4j rank-bm25
+	            upstash-vector azure-search-documents langchain-memgraph langchain-neo4j rank-bm25 pymochow
 
 # Format code with ruff
 format:
Lines changed: 67 additions & 0 deletions
@@ -0,0 +1,67 @@
+---
+title: Baidu VectorDB (Mochow)
+---
+
+[Baidu VectorDB](https://cloud.baidu.com/doc/VDB/index.html) is an enterprise-level distributed vector database service developed by Baidu Intelligent Cloud. It is powered by Baidu's proprietary "Mochow" vector database kernel, providing high performance, availability, and security for vector search.
+
+### Usage
+
+```python
+import os
+from mem0 import Memory
+
+config = {
+    "vector_store": {
+        "provider": "baidu",
+        "config": {
+            "endpoint": "http://your-mochow-endpoint:8287",
+            "account": "root",
+            "api_key": "your-api-key",
+            "database_name": "mem0",
+            "table_name": "mem0_table",
+            "embedding_model_dims": 1536,
+            "metric_type": "COSINE"
+        }
+    }
+}
+
+m = Memory.from_config(config)
+messages = [
+    {"role": "user", "content": "I'm planning to watch a movie tonight. Any recommendations?"},
+    {"role": "assistant", "content": "How about a thriller movie? They can be quite engaging."},
+    {"role": "user", "content": "I'm not a big fan of thriller movies but I love sci-fi movies."},
+    {"role": "assistant", "content": "Got it! I'll avoid thriller recommendations and suggest sci-fi movies in the future."}
+]
+m.add(messages, user_id="alice", metadata={"category": "movies"})
+```
+
+### Config
+
+Here are the available parameters for the `baidu` config:
+
+| Parameter | Description | Default Value |
+| --- | --- | --- |
+| `endpoint` | Endpoint URL for your Baidu VectorDB instance | `http://localhost:8287` |
+| `account` | Baidu VectorDB account name | `root` |
+| `api_key` | API key for accessing Baidu VectorDB | Required |
+| `database_name` | Name of the database | `mem0` |
+| `table_name` | Name of the table | `mem0` |
+| `embedding_model_dims` | Dimensions of the embedding model | `1536` |
+| `metric_type` | Distance metric for similarity search | `L2` |
+
+### Distance Metrics
+
+The following distance metrics are supported:
+
+- `L2`: Euclidean distance (default)
+- `IP`: Inner product
+- `COSINE`: Cosine similarity
+
+### Index Configuration
+
+The vector index is automatically configured with the following HNSW parameters:
+
+- `m`: 16 (number of connections per element)
+- `efconstruction`: 200 (size of the dynamic candidate list)
+- `auto_build`: true (automatically build index)
+- `auto_build_index_policy`: Incremental build with 10000 rows increment
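
Review note: the three `metric_type` values documented above correspond to standard vector comparisons. The sketch below uses the textbook definitions in plain Python to show how they differ on the same pair of vectors. It is only an illustration: the helper names (`l2_distance`, `inner_product`, `cosine_similarity`, `score`) are hypothetical, and the commit does not show how Baidu VectorDB computes or normalizes scores internally.

```python
import math
from typing import Callable, Dict, Sequence

Vector = Sequence[float]


def l2_distance(a: Vector, b: Vector) -> float:
    """Euclidean distance: smaller means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def inner_product(a: Vector, b: Vector) -> float:
    """Dot product: larger means more similar, and vector length matters."""
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Dot product of the normalized vectors: depends only on direction."""
    norm_a = math.sqrt(inner_product(a, a))
    norm_b = math.sqrt(inner_product(b, b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return inner_product(a, b) / (norm_a * norm_b)


# Keys mirror the metric_type strings from the docs; the mapping itself is illustrative.
METRICS: Dict[str, Callable[[Vector, Vector], float]] = {
    "L2": l2_distance,
    "IP": inner_product,
    "COSINE": cosine_similarity,
}


def score(metric_type: str, a: Vector, b: Vector) -> float:
    """Compare two equal-length vectors with the named metric."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    try:
        metric = METRICS[metric_type.upper()]
    except KeyError:
        raise ValueError(f"unsupported metric_type: {metric_type!r}") from None
    return metric(a, b)


if __name__ == "__main__":
    query, stored = [1.0, 0.0], [2.0, 0.0]
    for name in METRICS:
        print(name, score(name, query, stored))
```

With `L2`, lower values rank first, while `IP` and `COSINE` rank higher values first, so scores produced under different `metric_type` settings are not directly comparable.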

docs/docs.json

Lines changed: 2 additions & 1 deletion
@@ -146,7 +146,8 @@
               "components/vectordbs/dbs/vertex_ai",
               "components/vectordbs/dbs/weaviate",
               "components/vectordbs/dbs/faiss",
-              "components/vectordbs/dbs/langchain"
+              "components/vectordbs/dbs/langchain",
+              "components/vectordbs/dbs/baidu"
             ]
           }
         ]
Lines changed: 30 additions & 0 deletions
@@ -0,0 +1,30 @@
+from enum import Enum
+from typing import Any, Dict, Optional
+
+from pydantic import BaseModel, Field, model_validator
+
+
+class BaiduDBConfig(BaseModel):
+    endpoint: str = Field("http://localhost:8287", description="Endpoint URL for Baidu VectorDB")
+    account: str = Field("root", description="Account for Baidu VectorDB")
+    api_key: Optional[str] = Field(None, description="API Key for Baidu VectorDB")
+    database_name: str = Field("mem0", description="Name of the database")
+    table_name: str = Field("mem0", description="Name of the table")
+    embedding_model_dims: int = Field(1536, description="Dimensions of the embedding model")
+    metric_type: str = Field("L2", description="Metric type for similarity search")
+
+    @model_validator(mode="before")
+    @classmethod
+    def validate_extra_fields(cls, values: Dict[str, Any]) -> Dict[str, Any]:
+        allowed_fields = set(cls.model_fields.keys())
+        input_fields = set(values.keys())
+        extra_fields = input_fields - allowed_fields
+        if extra_fields:
+            raise ValueError(
+                f"Extra fields not allowed: {', '.join(extra_fields)}. Please input only the following fields: {', '.join(allowed_fields)}"
+            )
+        return values
+
+    model_config = {
+        "arbitrary_types_allowed": True,
+    }
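
Review note: the `validate_extra_fields` check above is a set difference between the supplied keys and the model's declared fields. The stdlib-only sketch below reproduces that logic outside pydantic so the behavior on a mistyped key is easy to see. `ALLOWED_FIELDS` and `reject_extra_fields` are hypothetical names, the field list is copied from `BaiduDBConfig`, and sorting the names in the message is an addition here (the committed code joins unordered sets, so its message order can vary between runs).

```python
from typing import Any, Dict, Iterable

# Field names copied from BaiduDBConfig in the diff; kept here only for illustration.
ALLOWED_FIELDS = frozenset(
    {
        "endpoint",
        "account",
        "api_key",
        "database_name",
        "table_name",
        "embedding_model_dims",
        "metric_type",
    }
)


def reject_extra_fields(
    values: Dict[str, Any], allowed: Iterable[str] = ALLOWED_FIELDS
) -> Dict[str, Any]:
    """Return values unchanged, or raise ValueError naming any unknown keys."""
    allowed_set = set(allowed)
    extra_fields = set(values) - allowed_set
    if extra_fields:
        # Sorted for a stable, testable message (an assumption, not the committed behavior).
        raise ValueError(
            f"Extra fields not allowed: {', '.join(sorted(extra_fields))}. "
            f"Please input only the following fields: {', '.join(sorted(allowed_set))}"
        )
    return values


if __name__ == "__main__":
    try:
        # "tabel_name" is a deliberate typo for "table_name".
        reject_extra_fields({"endpoint": "http://localhost:8287", "tabel_name": "mem0"})
    except ValueError as exc:
        print(exc)
```

Rejecting unknown keys up front means a misspelled option fails loudly instead of silently falling back to its default.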
