
Commit 57644e7

Merge pull request #16 from zc277584121/add-memsearch
Add semantic memory search use case
2 parents 7b874a8 + 5f18af5

File tree

2 files changed: +71 −0 lines

README.md

Lines changed: 1 addition & 0 deletions
@@ -73,6 +73,7 @@ Solving the bottleneck of OpenClaw adaptation: Not ~~skills~~, but finding **way
 | [AI Earnings Tracker](usecases/earnings-tracker.md) | Track tech/AI earnings reports with automated previews, alerts, and detailed summaries. |
 | [Personal Knowledge Base (RAG)](usecases/knowledge-base-rag.md) | Build a searchable knowledge base by dropping URLs, tweets, and articles into chat. |
 | [Market Research & Product Factory](usecases/market-research-product-factory.md) | Mine Reddit and X for real pain points using the Last 30 Days skill, then have OpenClaw build MVPs that solve them. |
+| [Semantic Memory Search](usecases/semantic-memory-search.md) | Add vector-powered semantic search to your OpenClaw markdown memory files with hybrid retrieval and auto-sync. |

 ## Finance & Trading

usecases/semantic-memory-search.md

Lines changed: 70 additions & 0 deletions
@@ -0,0 +1,70 @@
# Semantic Memory Search

OpenClaw's built-in memory system stores everything as markdown files, but as memories grow over weeks and months, finding that one decision from last Tuesday becomes nearly impossible. There is no search, just scrolling through files.

This use case adds **vector-powered semantic search** on top of OpenClaw's existing markdown memory files using [memsearch](https://github.com/zilliztech/memsearch), so you can instantly find any past memory by meaning, not just keywords.

## What It Does
8+
9+
- Index all your OpenClaw markdown memory files into a vector database (Milvus) with a single command
10+
- Search by meaning: "what caching solution did we pick?" finds the relevant memory even if the word "caching" does not appear
11+
- Hybrid search (dense vectors + BM25 full-text) with RRF reranking for best results
12+
- SHA-256 content hashing means unchanged files are never re-embedded — zero wasted API calls
13+
- File watcher auto-reindexes when memory files change, so the index is always up to date
14+
- Works with any embedding provider: OpenAI, Google, Voyage, Ollama, or fully local (no API key needed)
15+
16+
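The indexing idea is easy to picture with a sketch. This assumes heading-based chunking of markdown files, which is an illustrative choice, not memsearch's documented chunker; only the per-chunk SHA-256 hashing mirrors what the doc describes:

```python
import hashlib


def chunk_markdown(text: str) -> list[dict]:
    """Split a markdown memory file into heading-delimited chunks.

    Each chunk carries a SHA-256 content hash so an indexer can skip
    chunks it has already embedded. A sketch, not memsearch's code.
    """
    chunks, current = [], []
    for line in text.splitlines():
        # Start a new chunk at every markdown heading.
        if line.startswith("#") and current:
            chunks.append("\n".join(current))
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current))
    return [
        {"text": c, "sha256": hashlib.sha256(c.encode()).hexdigest()}
        for c in chunks
        if c.strip()
    ]
```

On re-index, any chunk whose hash is already in the vector store can be skipped; only new hashes trigger an embedding call.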
## Pain Point

OpenClaw's memory is stored as plain markdown files. This is great for portability and human readability, but there is no search. As your memory grows, you either grep through files (keyword-only, missing semantic matches) or load entire files into context (wasting tokens on irrelevant content). You need a way to ask "what did I decide about X?" and get the exact relevant chunk, regardless of phrasing.

## Skills You Need

- No OpenClaw skills required; memsearch is a standalone Python CLI/library
- Python 3.10+ with pip or uv
## How to Set It Up

1. Install memsearch:

   ```bash
   pip install memsearch
   ```

2. Run the interactive config wizard:

   ```bash
   memsearch config init
   ```

3. Index your OpenClaw memory directory:

   ```bash
   memsearch index ~/path/to/your/memory/
   ```

4. Search your memories by meaning:

   ```bash
   memsearch search "what caching solution did we pick?"
   ```

5. For live sync, start the file watcher, which auto-indexes on every file change:

   ```bash
   memsearch watch ~/path/to/your/memory/
   ```

6. For a fully local setup (no API keys), install the local embedding provider:

   ```bash
   pip install "memsearch[local]"
   memsearch config set embedding.provider local
   memsearch index ~/path/to/your/memory/
   ```
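The incremental behavior behind `index` and `watch` can be approximated in a few lines. This is a sketch of the change-detection idea only, not memsearch's actual implementation: the real tool hashes per chunk, and its watcher reacts to filesystem events rather than rescanning.

```python
import hashlib
from pathlib import Path


def changed_files(memory_dir: str, seen: dict[str, str]) -> list[Path]:
    """Return markdown files whose content hash differs from `seen`.

    Hash every file, report only what is new or changed, and update
    `seen` in place so the next scan skips unchanged files.
    """
    changed = []
    for path in sorted(Path(memory_dir).rglob("*.md")):
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        if seen.get(str(path)) != digest:
            changed.append(path)
            seen[str(path)] = digest
    return changed
```

A polling watcher is then just this scan in a loop; only the files it returns need to be re-chunked and re-embedded.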
## Key Insights

- **Markdown stays the source of truth.** The vector index is just a derived cache; you can rebuild it anytime with `memsearch index`, and your memory files are never modified.
- **Smart dedup saves money.** Each chunk is identified by a SHA-256 content hash, so re-running `index` only embeds new or changed content and you can run it as often as you like without wasting embedding API calls.
- **Hybrid search beats pure vector search.** Combining semantic similarity (dense vectors) with keyword matching (BM25) via Reciprocal Rank Fusion catches both meaning-based and exact-match queries.
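The RRF step is simple enough to sketch. This illustrates Reciprocal Rank Fusion in general, not memsearch's internals; `k = 60` is the constant conventionally used in the RRF literature:

```python
def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge ranked result lists from different retrievers via RRF.

    Each document scores sum(1 / (k + rank)) across the lists it
    appears in, so items ranked well by both the dense and the BM25
    retriever rise to the top of the fused ranking.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

Because RRF only uses ranks, it needs no score normalization between the dense and BM25 retrievers, which use incomparable score scales.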
## Related Links

- [memsearch GitHub](https://github.com/zilliztech/memsearch): the library powering this use case
- [memsearch Documentation](https://zilliztech.github.io/memsearch/): full CLI reference, Python API, and architecture
- [OpenClaw](https://github.com/openclaw/openclaw): the memory architecture that inspired memsearch
- [Milvus](https://milvus.io/): the vector database backend
