Commit 3103c70

Update context-engine SKILL docs; emphasize search
Revise context-engine skill documentation across .codex/ and skills/ to make `search` the recommended/primary entrypoint, add guidance to always use `search` first, and show example auto-routing. Expand and clarify tool docs: symbol_graph query types (callers, callees, definition, importers), optional graph_query capabilities and fallbacks, new admin/diagnostics commands, error fallbacks, and session/workspace recommendations. Replace many repo_search examples with search, streamline index/workspace tooling sections (remove some qdrant index instructions), and add best practices (TOON format, multi-query patterns, commit-history predictions). Minor formatting and example updates throughout for clarity and consistency.
1 parent d707dd2 commit 3103c70

2 files changed: +80 -66 lines changed


.codex/skills/context-engine/SKILL.md

Lines changed: 46 additions & 18 deletions
@@ -7,49 +7,64 @@ description: Hybrid semantic/lexical code search with neural reranking via MCP t

 Hybrid vector search (semantic + lexical) with neural reranking for codebase retrieval.

+> **IMPORTANT: Always use `search` as your FIRST tool for ANY code exploration, lookup, or question.** It auto-detects intent and routes to the best specialized tool. Only use `repo_search`, `symbol_graph`, or other tools directly when you need specific parameters or features that `search` does not expose (cross-repo, memory, admin). When in doubt, use `search`.
+
 ## Core Decision Tree

 ```
 Need to find code?
 ├── UNSURE / GENERAL QUERY → search (RECOMMENDED DEFAULT)
-│ └── Auto-routes to the best tool based on query intent
+│ └── Auto-routes to best tool based on query intent
+│ └── Handles: code search, Q&A, tests, config, symbols, imports
 ├── Simple lookup → search OR info_request
 ├── Need filters/control → search OR repo_search
 ├── Search across multiple repos → cross_repo_search
 ├── Want LLM explanation → search OR context_answer
 ├── Find similar patterns → pattern_search (if enabled)
-├── Find relationships → search OR symbol_graph (DEFAULT, always available)
+├── Find relationships
+│ ├── Who calls / who imports / where defined → symbol_graph (DEFAULT, always available)
+│ ├── What does this call → symbol_graph (query_type="callees")
+│ ├── Multi-hop (callers of callers) → symbol_graph (depth=2+)
+│ └── Impact analysis / cycles → graph_query (ONLY if NEO4J/MEMGRAPH enabled)
+├── Git history
+│ ├── Find commits → search_commits_for
+│ └── Predict co-changing files → search_commits_for (predict_related=true)
+├── Blend code + notes → context_search (include_memories=true)
 └── Store/recall knowledge → memory_store, memory_find
 ```

 ## Primary Tools

-**search** - Unified entry point (RECOMMENDED DEFAULT):
+**search** - ALWAYS USE FIRST (unified entry point, auto-routes):
 ```json
 {"query": "authentication middleware"}
+{"query": "how does caching work?"} // → routes to context_answer
+{"query": "who calls authenticate()"} // → routes to symbol_graph
+{"query": "tests for payment processing"} // → routes to search_tests_for
 ```
-Auto-detects intent and routes to the best tool. Returns:
-```json
-{
-"ok": true, "intent": "search", "confidence": 0.92,
-"tool": "repo_search", "result": {...}, "execution_time_ms": 245
-}
-```
-Handles: code search, Q&A, tests, config, symbols, imports. Use specialized tools only for cross-repo, memory, or admin operations.
+Auto-detects intent and routes to the best tool. Returns `{ok, intent, confidence, tool, result, execution_time_ms}`.
+
+Optional params: `query`, `collection`, `limit`, `language`, `under`, `include_snippet`, `compact`, `context_lines`, `ext`, `not_glob`, `path_glob`, `output_format`, `rerank_enabled`.
+
+Use specialized tools directly only for: cross-repo search, memory, admin, or when you need params `search` doesn't expose.

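As a rough illustration of the response envelope named in the added lines above, the following Python sketch unwraps a `search` response. The six field names come from the doc text; the `unwrap_search_response` helper, the 0.5 confidence threshold, and the `{"hits": []}` result shape are hypothetical and not part of the context-engine API.

```python
from typing import Any

# Envelope fields the SKILL doc lists for a `search` response.
ENVELOPE_FIELDS = ("ok", "intent", "confidence", "tool", "result", "execution_time_ms")


def unwrap_search_response(response: dict[str, Any], min_confidence: float = 0.5) -> dict[str, Any]:
    """Return the routed tool, its result, and a low-confidence flag.

    `min_confidence` is an illustrative threshold, not a documented default.
    """
    missing = [field for field in ENVELOPE_FIELDS if field not in response]
    if missing:
        raise ValueError(f"response is missing envelope fields: {missing}")
    if not response["ok"]:
        raise RuntimeError("search reported ok=false")
    return {
        "tool": response["tool"],
        "result": response["result"],
        "low_confidence": response["confidence"] < min_confidence,
    }


# Sample payload modelled on the example this commit removes; the
# contents of "result" are an assumption made for the sketch.
sample = {
    "ok": True, "intent": "search", "confidence": 0.92,
    "tool": "repo_search", "result": {"hits": []}, "execution_time_ms": 245,
}
unwrapped = unwrap_search_response(sample)
print(unwrapped["tool"], unwrapped["low_confidence"])
```

A caller might treat a low-confidence route as a cue to call a specialized tool directly, which matches the doc's advice to fall back only when `search` is not enough.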
 **repo_search** - Direct code search (full control):
 ```json
 {"query": "authentication middleware", "limit": 10, "include_snippet": true}
 ```
 Multi-query: `{"query": ["auth handler", "login validation"]}`

-**symbol_graph** - Find callers, definitions, importers (ALWAYS available):
+**symbol_graph** - Find callers, callees, definitions, importers (ALWAYS available):
 ```json
 {"symbol": "authenticate", "query_type": "callers", "limit": 10}
+{"symbol": "authenticate", "query_type": "callees", "limit": 10}
 {"symbol": "UserService", "query_type": "definition"}
 {"symbol": "utils", "query_type": "importers"}
 ```
-Use `depth=2` for multi-hop (callers of callers).
+Query types: `callers`, `callees`, `definition`, `importers`. Use `depth=2` for multi-hop. Falls back to semantic search if no graph hits. Results include ~500-char source snippets.
+
+**graph_query** (OPTIONAL -- only if NEO4J_GRAPH=1 or MEMGRAPH_GRAPH=1):
+Extra query types: `transitive_callers`, `transitive_callees`, `impact`, `dependencies`, `cycles`. If not in your tool list, use `symbol_graph` instead.

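The "use `symbol_graph` if `graph_query` is not in your tool list" rule above could be sketched in Python as follows. The query-type names are taken from the doc; `plan_graph_call` and the `MULTI_HOP_FALLBACK` mapping are hypothetical, and approximating `impact` with multi-hop `callers` is an assumption rather than documented behavior.

```python
from typing import Any, Iterable

# Query types the doc lists for each tool.
SYMBOL_GRAPH_TYPES = {"callers", "callees", "definition", "importers"}
GRAPH_QUERY_EXTRA_TYPES = {
    "transitive_callers", "transitive_callees", "impact", "dependencies", "cycles",
}
# Hypothetical approximations used when graph_query is unavailable:
# a multi-hop symbol_graph query stands in for the transitive variants.
MULTI_HOP_FALLBACK = {
    "transitive_callers": "callers",
    "transitive_callees": "callees",
    "impact": "callers",
}


def plan_graph_call(
    symbol: str, query_type: str, available_tools: Iterable[str], depth: int = 2
) -> tuple[str, dict[str, Any]]:
    """Pick a tool name and request payload for a relationship query."""
    tools = set(available_tools)
    if query_type in SYMBOL_GRAPH_TYPES:
        return "symbol_graph", {"symbol": symbol, "query_type": query_type}
    if query_type not in GRAPH_QUERY_EXTRA_TYPES:
        raise ValueError(f"unknown query_type: {query_type!r}")
    if "graph_query" in tools:
        return "graph_query", {"symbol": symbol, "query_type": query_type, "depth": depth}
    fallback = MULTI_HOP_FALLBACK.get(query_type)
    if fallback is None:
        # No symbol_graph approximation is assumed for this query type.
        raise LookupError(f"no symbol_graph approximation for {query_type!r}")
    return "symbol_graph", {"symbol": symbol, "query_type": fallback, "depth": depth}


tool, payload = plan_graph_call("authenticate", "impact", ["search", "symbol_graph"])
print(tool, payload)
```

Checking the tool list first keeps the caller from issuing a `graph_query` request that the server never registered.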
 **context_answer** - LLM-generated explanation with citations:
 ```json
@@ -82,7 +97,10 @@ Use `depth=2` for multi-hop (callers of callers).
 | `search_config_for` | Find config | `{"query": "database connection"}` |
 | `search_callers_for` | Quick caller search | `{"query": "processPayment"}` |
 | `search_commits_for` | Git history | `{"query": "fixed auth bug"}` |
-| `pattern_search` | Similar code patterns | `{"query": "retry with backoff"}` |
+| `search_commits_for` | Predict co-changing files | `{"path": "src/auth.py", "predict_related": true}` |
+| `change_history_for_path` | File change summary | `{"path": "src/auth.py", "include_commits": true}` |
+| `pattern_search` | Similar code patterns (if enabled) | `{"query": "retry with backoff"}` |
+| `search_importers_for` | Find importers | `{"query": "utils/helpers"}` |

 ## Index Management

@@ -92,13 +110,23 @@ Use `depth=2` for multi-hop (callers of callers).

 ## Best Practices

-1. **Use `search` as your default tool** - Auto-routes to the best specialized tool
-2. **NEVER use grep/cat/find for code exploration** - Use MCP tools instead
-3. **Start with `symbol_graph`** for all relationship queries
-4. **Use multi-query** for complex searches: pass 2-3 variations
+1. **ALWAYS start with `search`** - It is your PRIMARY tool. Auto-routes to the best specialized tool. Only fall back to specific tools when you need params `search` doesn't expose.
+2. **NEVER use grep/cat/find for code exploration** - Use MCP tools instead. Only acceptable use: confirming exact literal strings.
+3. **Start with `symbol_graph`** for all relationship queries - always available, no Neo4j needed
+4. **Use multi-query** for complex searches: pass 2-3 variations as a list
 5. **Two-phase search**: Discovery (`limit=3, compact=true`) → Deep dive (`limit=8, include_snippet=true`)
 6. **Fire parallel calls** - Multiple independent `search`, `repo_search`, `symbol_graph` in one message
 7. **Set session defaults early**: `set_session_defaults(output_format="toon", compact=true)`
+8. **Use TOON format** - `output_format: "toon"` for 60-80% token reduction on exploratory queries
+9. **Use `cross_repo_search`** for multi-repo scenarios instead of manual collection switching
+10. **Predict co-changing files** - `search_commits_for(path=..., predict_related=true)` finds historically coupled files
+
+## Error Fallbacks
+
+- `context_answer` timeout → `search` + `info_request(include_explanation=true)`
+- `pattern_search` unavailable → `search` with structural query terms
+- `graph_query` unavailable → `symbol_graph` (always available)
+- grep/Read File → use `search`, `symbol_graph`, `info_request` instead

 ## Filters (for repo_search)

skills/context-engine/SKILL.md

Lines changed: 34 additions & 48 deletions
@@ -300,7 +300,17 @@ The `query_signature` encodes control flow: `L` (loops), `B` (branches), `T` (tr
 {"query": "utils/helpers", "limit": 10}
 ```

-**symbol_graph** - Symbol graph navigation (callers / definition / importers):
+**symbol_graph** - Symbol graph navigation (callers / callees / definition / importers):
+
+**Query types:**
+| Type | Description |
+|------|-------------|
+| `callers` | Who calls this symbol? |
+| `callees` | What does this symbol call? |
+| `definition` | Where is this symbol defined? |
+| `importers` | Who imports this module/symbol? |
+
+**Examples:**
 ```json
 {"symbol": "ASTAnalyzer", "query_type": "definition", "limit": 10}
 ```
@@ -310,14 +320,17 @@ The `query_signature` encodes control flow: `L` (loops), `B` (branches), `T` (tr
 ```json
 {"symbol": "qdrant_client", "query_type": "importers", "limit": 10}
 ```
+```json
+{"symbol": "authenticate", "query_type": "callees", "limit": 10}
+```
 - Supports `language`, `under`, `depth`, and `output_format` like other tools.
 - Use `depth=2` or `depth=3` for multi-hop traversals (callers of callers).
 - If there are no graph hits, it falls back to semantic search.
 - **Note**: Results are "hydrated" with ~500-char source snippets for immediate context.

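The `symbol_graph` options listed above can be combined into a request payload along these lines. This is a minimal sketch: `build_symbol_graph_request` is a hypothetical client-side helper, and its validation rules (rejecting unknown filters, omitting `depth` when it is 1) are assumptions, not server behavior.

```python
import json
from typing import Any

# Query types and optional filters named in the SKILL doc.
VALID_QUERY_TYPES = ("callers", "callees", "definition", "importers")
ALLOWED_FILTERS = {"language", "under", "output_format"}


def build_symbol_graph_request(
    symbol: str, query_type: str, depth: int = 1, limit: int = 10, **filters: Any
) -> dict[str, Any]:
    """Build a symbol_graph payload, validating it on the client side."""
    if query_type not in VALID_QUERY_TYPES:
        raise ValueError(f"query_type must be one of {VALID_QUERY_TYPES}")
    if depth < 1:
        raise ValueError("depth must be at least 1")
    unknown = set(filters) - ALLOWED_FILTERS
    if unknown:
        raise ValueError(f"unsupported filters: {sorted(unknown)}")
    request: dict[str, Any] = {"symbol": symbol, "query_type": query_type, "limit": limit}
    if depth > 1:
        # Only send depth for multi-hop traversals (callers of callers).
        request["depth"] = depth
    request.update(filters)
    return request


payload = build_symbol_graph_request("authenticate", "callers", depth=2, language="python")
print(json.dumps(payload))
```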
 **graph_query** - Advanced graph traversals (OPTIONAL — ONLY available when NEO4J_GRAPH=1 or MEMGRAPH_GRAPH=1):

-> **If `graph_query` is not in your MCP tool list, it is NOT enabled. Use `symbol_graph` for all graph queries instead. Do NOT error or warn about missing Neo4j.**
+> **If `graph_query` is not in your MCP tool list, it is NOT enabled. Use `symbol_graph` for all graph queries instead. Do NOT error or warn about missing Neo4j/Memgraph.**

 ```json
 {"symbol": "normalize_path", "query_type": "impact", "depth": 2}
@@ -334,12 +347,20 @@ The `query_signature` encodes control flow: `L` (loops), `B` (branches), `T` (tr
 |------|-------------|
 | `callers` | Who calls this symbol? (depth 1) |
 | `callees` | What does this symbol call? (depth 1) |
+| `definition` | Where is this symbol defined? |
 | `transitive_callers` | Multi-hop callers (up to depth) |
 | `transitive_callees` | Multi-hop callees (up to depth) |
 | `impact` | What breaks if I change this? (reverse transitive) |
 | `dependencies` | What does this depend on? (calls + imports) |
 | `cycles` | Detect circular dependencies |

+**Parameters:**
+- `symbol` - Symbol name to query
+- `query_type` - One of the types above
+- `depth` - Maximum traversal depth (default 1)
+- `limit` - Max results (default 10)
+- `include_paths` - Include file paths in results (bool, optional)
+


 **search_commits_for** - Search git history:
@@ -382,26 +403,7 @@ Use `context_search` to blend code results with stored memories:
 }
 ```

-## Index Management
-
-**qdrant_index_root** - First-time setup or full reindex:
-```json
-{}
-```
-With recreate (drops existing data):
-```json
-{"recreate": true}
-```
-
-**qdrant_index** - Index only a subdirectory:
-```json
-{"subdir": "src/"}
-```
-
-**qdrant_prune** - Remove deleted files from index:
-```json
-{}
-```
+## Admin and Diagnostics

 **qdrant_status** - Check index health:
 ```json
@@ -413,23 +415,11 @@ With recreate (drops existing data):
 {}
 ```

-## Workspace Tools
-
-**workspace_info** - Get current workspace and collection:
-```json
-{}
-```
-
-**list_workspaces** - List all indexed workspaces:
+**embedding_pipeline_stats** - Get cache efficiency, bloom filter stats, pipeline performance:
 ```json
 {}
 ```

-**collection_map** - View collection-to-repo mappings:
-```json
-{"include_samples": true}
-```
-
 **set_session_defaults** - Set defaults for session:
 ```json
 {"collection": "my-project", "language": "python"}
@@ -446,8 +436,6 @@ Don't discover at every session start. Trigger when: search returns no/irrelevan
 ```json
 // qdrant_list — discover available collections
 {}
-// collection_map — map repos to collections with sample files
-{"include_samples": true}
 ```

 ### Context Switching (Session Defaults = `cd`)
@@ -459,7 +447,7 @@ Treat `set_session_defaults` like `cd` — it scopes ALL subsequent searches:
 {"collection": "backend-api-abc123"}

 // One-off peek at another repo (does NOT change session default)
-// repo_search
+// search (or repo_search)
 {"query": "login form", "collection": "frontend-app-def456"}
 ```

@@ -472,12 +460,12 @@ NEVER search both repos with the same vague query. Find the **interface boundary
 **Pattern 1 — Interface Handshake (API/RPC):**
 ```json
 // 1. Find client call in frontend
-// repo_search
+// search
 {"query": "login API call", "collection": "frontend-col"}
 // → Found: axios.post('/auth/v1/login', ...)

 // 2. Search backend for that exact route
-// repo_search
+// search
 {"query": "'/auth/v1/login'", "collection": "backend-col"}
 ```

@@ -488,19 +476,19 @@ NEVER search both repos with the same vague query. Find the **interface boundary
 {"symbol": "UserProfile", "query_type": "importers", "collection": "frontend-col"}

 // 2. Find definition in source
-// repo_search
+// search
 {"query": "interface UserProfile", "collection": "shared-lib-col"}
 ```

 **Pattern 3 — Event Relay (Pub/Sub):**
 ```json
 // 1. Find producer → extract event name
-// repo_search
+// search
 {"query": "publish event", "collection": "service-a-col"}
 // → Found: bus.publish("USER_CREATED", payload)

 // 2. Find consumer with exact event name
-// repo_search
+// search
 {"query": "'USER_CREATED'", "collection": "service-b-col"}
 ```

@@ -533,11 +521,11 @@ NEVER search both repos with the same vague query. Find the **interface boundary
 // cross_repo_search
 {"boundary_key": "/api/auth/login", "collection": "backend-col"}
 ```
-Use `cross_repo_search` when you need breadth across repos. Use `repo_search` with explicit `collection` when you need depth in one repo.
+Use `cross_repo_search` when you need breadth across repos. Use `search` (or `repo_search`) with explicit `collection` when you need depth in one repo.

 ### Multi-Repo Anti-Patterns
 - **DON'T** search both repos with the same vague query (noisy, confusing)
-- **DON'T** assume the default collection is correct — verify with `collection_map`
+- **DON'T** assume the default collection is correct — verify with `qdrant_list`
 - **DON'T** forget to "cd back" after cross-referencing another repo
 - **DO** extract exact strings (route paths, event names, type names) as search anchors

@@ -578,7 +566,7 @@ Tools return structured errors, typically via `error` field and sometimes `ok: f
 ```

 Common issues:
-- **Collection not found** - Run `qdrant_index_root` to create the index
+- **Collection not found** - Verify collection with `qdrant_list` or check that the codebase has been indexed
 - **Empty results** - Broaden query, check filters, verify index exists
 - **Timeout on rerank** - Set `rerank_enabled: false` or reduce `limit`

@@ -592,8 +580,6 @@ Common issues:
 6. **Include snippets** - Set `include_snippet: true` to see code context in results
 7. **Store decisions** - Use `memory_store` to save architectural decisions and context for later
 8. **Check index health** - Run `qdrant_status` if searches return unexpected results
-9. **Prune after refactors** - Run `qdrant_prune` after moving/deleting files
-10. **Index before search** - Always run `qdrant_index_root` on first use or after cloning a repo
 11. **Use pattern_search for structural matching** - When looking for code with similar control flow (retry loops, error handling), use `pattern_search` instead of `repo_search` (if enabled)
 12. **Describe patterns in natural language** - `pattern_search` understands "retry with backoff" just as well as actual code examples (if enabled)
 13. **Fire independent searches in parallel** - Call multiple `search`, `repo_search`, `symbol_graph`, etc. in the same message block for 2-3x speedup
