This guide explains how to integrate rlm-rs with AI coding assistants through plugins, skills, and commands. While the examples focus on Claude Code, the patterns apply to any AI assistant that can execute shell commands.
rlm-cli is designed as a CLI-first tool that AI assistants invoke via shell execution. This architecture enables:
- Universal Compatibility: Any assistant with shell access can use rlm-cli
- No Custom APIs: Standard stdin/stdout/stderr communication
- JSON Output: Machine-readable format for programmatic integration
- Stateless Commands: Each invocation is independent (state lives in SQLite)
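The entire integration surface is argv in, stdout/stderr out, and an exit code back. The sketch below illustrates that contract with a hypothetical `run_tool` wrapper; `printf` stands in for an actual `rlm-cli` invocation so the snippet runs anywhere:

```shell
# Hypothetical wrapper: run any CLI tool, capture stdout, surface failures.
run_tool() {
  out=$("$@" 2>&1)   # capture combined output
  status=$?
  if [ "$status" -ne 0 ]; then
    echo "tool failed (exit $status): $out" >&2
  else
    printf '%s\n' "$out"
  fi
  return "$status"
}

# `printf` stands in here for something like `rlm-cli --format json status`
run_tool printf '{"ok":true}'
```

Because each invocation is stateless, the wrapper needs no session handling; persistent state lives in SQLite between calls.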
The rlm-rs Claude Code plugin implements the RLM pattern:
User-invocable shortcuts for common operations:
| Command | Description | Maps To |
|---|---|---|
| `/rlm-load` | Load file into RLM | `rlm-cli load <file>` |
| `/rlm-search` | Search loaded content | `rlm-cli search <query>` |
| `/rlm-status` | Show RLM state | `rlm-cli status` |
| `/rlm-analyze` | Full RLM analysis workflow | Orchestrated multi-step |
**Example Skill Definition** (`.claude/skills/rlm-load.md`):

```markdown
---
name: rlm-load
description: Load a file or directory into RLM for analysis
arguments:
  - name: path
    description: File or directory to load
    required: true
  - name: name
    description: Buffer name (defaults to filename)
    required: false
---
Load content into RLM for semantic search and chunk-based analysis.

## Workflow

1. Check if rlm-cli is installed: `which rlm-cli`
2. Initialize if needed: `rlm-cli init`
3. Load the content: `rlm-cli load {{path}} --name {{name}} --chunker semantic`
4. Report status: `rlm-cli status --format json`

## Output

Report the number of chunks created and confirm embeddings were generated.
```

Specialized agents for chunk-level processing:
**rlm-subcall Agent** (`.claude/agents/rlm-subcall.md`):
````markdown
---
name: rlm-subcall
model: haiku
description: Efficient chunk-level analysis for RLM workflow
tools:
  - Bash
  - Read
---
You are a focused analysis agent processing individual chunks from large documents.

## Instructions

1. Retrieve the chunk: `rlm-cli chunk get <chunk_id>`
2. Analyze according to the prompt
3. Return structured JSON findings:

```json
{
  "chunk_id": <id>,
  "findings": [...],
  "relevance": "high|medium|low",
  "summary": "Brief summary"
}
```

Keep responses concise. You're part of a larger workflow.
````
**rlm-synthesizer Agent** (`.claude/agents/rlm-synthesizer.md`):
```markdown
---
name: rlm-synthesizer
model: sonnet
description: Synthesize findings from multiple chunk analyses
tools:
- Read
- Bash
---
You aggregate results from multiple rlm-subcall analyses.
## Instructions
1. Review all chunk findings
2. Identify patterns and connections
3. Synthesize into coherent narrative
4. Highlight key insights and recommendations
```

Automated triggers for RLM operations:
**Auto-load on large files** (`.claude/hooks/large-file-rlm.md`):
```markdown
---
event: PostToolUse
tool: Read
---
If the file read was larger than 50KB, suggest loading it into RLM:
"This is a large file. Consider using `/rlm-load {{file_path}}` for semantic search."
```

Any AI assistant can integrate with rlm-rs using these patterns:
```bash
# 1. Load content (one-time setup)
rlm-cli load large-document.md --name docs

# 2. Search for relevant chunks
RESULTS=$(rlm-cli --format json search "your query" --top-k 5)

# 3. Extract chunk IDs
CHUNK_IDS=$(echo "$RESULTS" | jq -r '.results[].chunk_id')

# 4. Retrieve and process each chunk
for ID in $CHUNK_IDS; do
  CONTENT=$(rlm-cli chunk get "$ID")
  # Process $CONTENT...
done
```

```bash
# Find specific patterns
rlm-cli grep docs "TODO|FIXME|HACK" --format json --max-matches 50

# Get context around matches
rlm-cli grep docs "error.*handling" --window 200
```

```bash
# Broad search first
rlm-cli search "authentication" --top-k 20

# Narrow down
rlm-cli search "JWT token validation" --top-k 5 --mode semantic

# Exact match
rlm-cli search "validateToken function" --mode bm25
```

All commands with `--format json` return structured data:
**Search Results**:

```json
{
  "count": 3,
  "mode": "hybrid",
  "query": "authentication",
  "results": [
    {
      "chunk_id": 42,
      "buffer_id": 1,
      "buffer_name": "auth.rs",
      "score": 0.0328,
      "semantic_score": 0.0499,
      "bm25_score": 0.0000016
    }
  ]
}
```

**Status**:

```json
{
  "initialized": true,
  "db_path": ".rlm/rlm-state.db",
  "db_size_bytes": 245760,
  "buffer_count": 3,
  "chunk_count": 42,
  "total_content_bytes": 125000,
  "embeddings_count": 42
}
```

**Chunk**:

```json
{
  "id": 42,
  "buffer_id": 1,
  "buffer_name": "auth.rs",
  "index": 3,
  "byte_range": [12000, 15000],
  "size": 3000,
  "content": "...",
  "has_embedding": true
}
```

Copilot can invoke rlm-rs through its terminal integration:
```
@terminal rlm-cli load src/ --name code
@terminal rlm-cli search "error handling"
```

Codex can execute rlm-rs commands directly:

```bash
codex "Load the documentation and find sections about API authentication"
# Codex runs: rlm-cli load docs/ && rlm-cli search "API authentication"
```

These tools can use rlm-cli as an external helper:

```yaml
# In .aider.conf.yml or similar
tools:
  - name: rlm-search
    command: rlm-cli --format json search "$QUERY"
```

Extensions should use `execFile` instead of `exec` for security (avoids shell injection):
```typescript
import { execFile } from 'child_process';
import { promisify } from 'util';

const execFileAsync = promisify(execFile);

interface SearchResult {
  chunk_id: number;
  score: number;
}

interface SearchResponse {
  results: SearchResult[];
}

async function searchRLM(query: string): Promise<SearchResult[]> {
  // Using execFile (not exec) prevents shell injection
  const { stdout } = await execFileAsync('rlm-cli', [
    '--format', 'json',
    'search', query
  ]);
  const response: SearchResponse = JSON.parse(stdout);
  return response.results;
}
```

```bash
rlm-cli load src/ --chunker semantic --chunk-size 3000
```

Semantic chunking respects function and class boundaries.
```bash
rlm-cli load src/auth/ --name auth-module
rlm-cli load src/api/ --name api-handlers
rlm-cli load docs/ --name documentation
```

This makes search results more interpretable.

```bash
rlm-cli search "query" --mode hybrid
```

Hybrid combines semantic understanding with keyword matching.
Instead of sequential calls, use parallel Task invocations:
```
# Good: Parallel
Task(rlm-subcall, chunk 12) || Task(rlm-subcall, chunk 27) || Task(rlm-subcall, chunk 33)

# Avoid: Sequential
Task(rlm-subcall, chunk 12)
Task(rlm-subcall, chunk 27)
Task(rlm-subcall, chunk 33)
```

```bash
# After subcall analysis
rlm-cli add-buffer auth-analysis "$(cat subcall-results.json)"

# Later retrieval
rlm-cli show auth-analysis
```

When integrating rlm-rs into AI workflows, proper error handling ensures graceful recovery and a good user experience. This section provides structured patterns for handling common errors.
All rlm-cli commands return:
- Exit code 0: Success
- Exit code 1: Error (details in stderr)
With JSON format, errors are structured:
```json
{
  "error": "storage error: RLM not initialized. Run: rlm-cli init",
  "code": "NOT_INITIALIZED"
}
```

| Error Message | Cause | Recovery Strategy |
|---|---|---|
| `RLM not initialized` | Database not created | Run `rlm-cli init` |
| `buffer not found: <name>` | Buffer doesn't exist | Run `rlm-cli list` to verify |
| `chunk not found: <id>` | Invalid chunk ID | Re-run search to get valid IDs |
| `No results found` | Query too specific | Broaden query or lower threshold |
| `embedding error` | Model loading issue | Check disk space, retry once |
| `file not found` | Invalid path | Verify path exists before load |
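When the structured error payload includes a `code` field, branching on it is more robust than substring-matching the human-readable message. A minimal sketch using only `sed` (the payload is a hand-written sample in the shape shown above, not live output; `jq -r '.code'` would work equally well):

```shell
# Sample error payload (assumed shape; not captured from a real run)
ERR='{"error":"storage error: RLM not initialized. Run: rlm-cli init","code":"NOT_INITIALIZED"}'

# Extract the machine-readable code without jq
CODE=$(printf '%s' "$ERR" | sed -n 's/.*"code" *: *"\([^"]*\)".*/\1/p')

case "$CODE" in
  NOT_INITIALIZED) echo "recover: rlm-cli init" ;;
  *)               echo "unhandled code: $CODE" ;;
esac
```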
```bash
# Robust error handling for AI assistants
RESULT=$(rlm-cli --format json search "$QUERY" 2>&1)
EXIT_CODE=$?

if [ $EXIT_CODE -ne 0 ]; then
  # Parse error
  ERROR=$(echo "$RESULT" | jq -r '.error // empty')

  case "$ERROR" in
    *"not initialized"*)
      rlm-cli init
      # Retry original command
      RESULT=$(rlm-cli --format json search "$QUERY")
      ;;
    *"buffer not found"*)
      echo "Buffer not found. Available buffers:"
      rlm-cli list
      ;;
    *"No results"*)
      echo "No results. Try broader query or: --threshold 0.1"
      ;;
    *)
      echo "Error: $ERROR"
      ;;
  esac
fi
```

For transient errors (embedding model loading, database locks):
```bash
MAX_RETRIES=3
RETRY_DELAY=1

for i in $(seq 1 $MAX_RETRIES); do
  RESULT=$(rlm-cli --format json chunk embed "$BUFFER" 2>&1)
  if [ $? -eq 0 ]; then
    break
  fi
  if [ $i -lt $MAX_RETRIES ]; then
    sleep $RETRY_DELAY
    RETRY_DELAY=$((RETRY_DELAY * 2)) # Exponential backoff
  fi
done
```

Before complex workflows, verify prerequisites:
```bash
# Check 1: rlm-cli is installed
if ! command -v rlm-cli &> /dev/null; then
  echo "rlm-cli not found. Install with: cargo install rlm-cli"
  exit 1
fi

# Check 2: Database is initialized
if ! rlm-cli status &> /dev/null; then
  rlm-cli init
fi

# Check 3: Content is loaded
BUFFER_COUNT=$(rlm-cli --format json status | jq '.buffer_count')
if [ "$BUFFER_COUNT" -eq 0 ]; then
  echo "No content loaded. Use: rlm-cli load <file>"
  exit 1
fi

# Check 4: Embeddings exist for semantic search
EMBED_COUNT=$(rlm-cli --format json chunk status | jq '.embedded_chunks')
if [ "$EMBED_COUNT" -eq 0 ]; then
  echo "No embeddings. Generating..."
  rlm-cli chunk embed --all
fi
```

When semantic search fails, fall back to BM25:
```bash
# Try semantic first
RESULT=$(rlm-cli --format json search "$QUERY" --mode semantic 2>&1)

if echo "$RESULT" | jq -e '.error' > /dev/null 2>&1; then
  # Fall back to BM25 (keyword search, no embeddings required)
  RESULT=$(rlm-cli --format json search "$QUERY" --mode bm25)
fi
```

When reporting errors to users, provide actionable guidance:
**Good**: "Buffer 'config' not found. Available buffers: main, auth. Did you mean one of these?"

**Bad**: "Error: buffer not found: config"

**`rlm-cli` not found**:

```bash
# Check installation
which rlm-cli

# Install if missing
cargo install rlm-cli
# or
brew install zircote/tap/rlm-rs
```

**RLM not initialized**:

```bash
rlm-cli init
```

**No search results**:

- Check if content is loaded: `rlm-cli list`
- Verify embeddings exist: `rlm-cli chunk status`
- Try broader query or lower threshold: `--threshold 0.1`

**Output is not JSON**: ensure you're using `--format json`:

```bash
rlm-cli --format json search "query"   # Correct
rlm-cli search "query" --format json   # Also correct
```

Ready-to-use system prompts for AI assistants integrating with rlm-rs are available in the `prompts/` directory:
| Template | Purpose | Recommended Model |
|---|---|---|
| rlm-orchestrator.md | Coordinates search, dispatch, and synthesis | sonnet |
| rlm-analyst.md | Analyzes individual chunks | haiku |
| rlm-synthesizer.md | Aggregates analyst findings | sonnet |
- Orchestrator receives user request and searches for relevant chunks
- Analysts (parallel) process individual chunks and return structured findings
- Synthesizer aggregates findings into a coherent report
```
User Request
     │
     ▼
┌──────────────┐
│ Orchestrator │──▶ rlm-cli search "query"
└──────────────┘
     │
     ▼ dispatch
┌───────────────────────────────────────┐
│ ┌─────────┐  ┌─────────┐  ┌─────────┐ │
│ │Analyst 1│  │Analyst 2│  │Analyst N│ │  (parallel)
│ └─────────┘  └─────────┘  └─────────┘ │
└───────────────────────────────────────┘
     │ collect
     ▼
┌─────────────┐
│ Synthesizer │──▶ Final Report
└─────────────┘
```
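The fan-out/fan-in shape of this pipeline can be mimicked with plain shell job control. In the sketch below, `analyze_chunk` is a stub standing in for dispatching an analyst subagent (the plugin itself uses parallel Task invocations, not shell jobs), and the chunk IDs are assumed to come from a prior search:

```shell
# Stub: in a real workflow this would dispatch an analyst model call.
analyze_chunk() { echo "chunk $1: analyzed"; }

tmp=$(mktemp)
pids=""
for id in 12 27 33; do              # chunk IDs from a prior search (assumed)
  analyze_chunk "$id" >>"$tmp" &    # fan out: analysts run in parallel
  pids="$pids $!"
done
for p in $pids; do wait "$p"; done  # fan in: collect before synthesizing
sort "$tmp"                         # findings, ready for the synthesizer
rm -f "$tmp"
```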
- RLM-Inspired Design - Architectural philosophy
- CLI Reference - Complete command documentation
- Architecture - Internal implementation details
- Prompt Templates - System prompts for AI integration

