Commit 484e3ef

Add batch tools and graph_query docs
Add documentation and examples for batch_search, batch_symbol_graph, and batch_graph_query (max 10 queries per batch, ~75% token savings) and expand symbol_graph to include subclasses/base_classes. Clarify graph_query availability (Memgraph-backed, available to SaaS users) and its query types (impact, transitive callers/callees, cycles, dependencies, etc.) and add include_paths/depth options and examples. Update index management to state SaaS indexing is automatic (qdrant_index*, qdrant_prune not available in SaaS), add qdrant_list and embedding_pipeline_stats, and mark qdrant_index/root/prune as self-hosted only. Update best-practice guidance and MCP tool-selection to recommend search as the default, document batch usage and parallel call patterns, and adjust various examples and reference tables accordingly.
1 parent 7493182 commit 484e3ef

4 files changed (+364 -68 lines changed)
.codex/skills/context-engine/SKILL.md

Lines changed: 84 additions & 17 deletions
@@ -24,13 +24,20 @@ Need to find code?
 ├── Find relationships
 │   ├── Who calls / who imports / where defined → symbol_graph (DEFAULT, always available)
 │   ├── What does this call → symbol_graph (query_type="callees")
+│   ├── Find subclasses / base classes → symbol_graph (query_type="subclasses" or "base_classes")
 │   ├── Multi-hop (callers of callers) → symbol_graph (depth=2+)
-│   └── Impact analysis / cycles → graph_query (ONLY if NEO4J/MEMGRAPH enabled)
+│   ├── Impact analysis (what breaks if I change X) → graph_query (ONLY if available)
+│   ├── Dependency graph → graph_query (ONLY if available)
+│   └── Circular dependency detection → graph_query (ONLY if available)
 ├── Git history
 │   ├── Find commits → search_commits_for
 │   └── Predict co-changing files → search_commits_for (predict_related=true)
 ├── Blend code + notes → context_search (include_memories=true)
-└── Store/recall knowledge → memory_store, memory_find
+├── Store/recall knowledge → memory_store, memory_find
+└── Multiple independent queries at once
+    ├── batch_search (runs N repo_search calls in one invocation, ~75% token savings)
+    ├── batch_symbol_graph (runs N symbol_graph queries in one invocation)
+    └── batch_graph_query (runs N graph_query queries in one invocation)
 ```
 
 ## Primary Tools
@@ -46,31 +53,78 @@ Auto-detects intent and routes to the best tool. Returns `{ok, intent, confidenc
 
 Optional params: `query`, `collection`, `limit`, `language`, `under`, `include_snippet`, `compact`, `context_lines`, `ext`, `not_glob`, `path_glob`, `output_format`, `rerank_enabled`.
 
-Use specialized tools directly only for: cross-repo search, memory, admin, or when you need params `search` doesn't expose.
+Use specialized tools directly only for: cross-repo search, batch search, memory, admin, or when you need params `search` doesn't expose.
+
+**batch_search** - Run N independent searches in one call (~75% token savings):
+```json
+{
+  "searches": [
+    {"query": "authentication middleware", "limit": 5},
+    {"query": "rate limiting implementation", "limit": 5},
+    {"query": "error handling patterns"}
+  ],
+  "compact": true,
+  "output_format": "toon"
+}
+```
+Returns `{ok, batch_results: [result_set_0, ...], count, elapsed_ms}`. Max 10 searches per batch. Shared params (`collection`, `limit`, `language`, etc.) apply to all searches unless overridden per-search. Use when you have 2+ independent code searches; use individual `search` calls when you need intent routing or searches depend on each other.
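
Reviewer sketch: the shared-versus-per-search override rule described for `batch_search` can be illustrated client-side. This is a minimal sketch, not the server's implementation; `build_batch_search` and `effective_params` are hypothetical helper names, and the merge order (shared params first, per-search values win) is an assumption taken from the "unless overridden per-search" wording.

```python
from typing import Any

MAX_BATCH = 10  # documented cap: max 10 searches per batch


def build_batch_search(searches: list[dict[str, Any]], **shared: Any) -> dict[str, Any]:
    """Assemble a batch_search payload (hypothetical client-side helper)."""
    if len(searches) > MAX_BATCH:
        raise ValueError(f"batch_search accepts at most {MAX_BATCH} searches, got {len(searches)}")
    for spec in searches:
        if "query" not in spec:
            raise ValueError(f"search spec is missing 'query': {spec!r}")
    # Shared params sit at the top level, next to the list of search specs.
    return {"searches": searches, **shared}


def effective_params(payload: dict[str, Any]) -> list[dict[str, Any]]:
    """Resolve each search: shared params first, per-search values override them."""
    shared = {k: v for k, v in payload.items() if k != "searches"}
    return [{**shared, **spec} for spec in payload["searches"]]


payload = build_batch_search(
    [
        {"query": "authentication middleware", "limit": 5},  # overrides shared limit
        {"query": "error handling patterns"},                # inherits shared limit
    ],
    limit=3,
    compact=True,
)
resolved = effective_params(payload)
```

Here the first search keeps its own `limit` of 5 while the second inherits the shared `limit` of 3; both inherit `compact`.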
 
 **repo_search** - Direct code search (full control):
 ```json
 {"query": "authentication middleware", "limit": 10, "include_snippet": true}
 ```
 Multi-query: `{"query": ["auth handler", "login validation"]}`
 
-**symbol_graph** - Find callers, callees, definitions, importers (ALWAYS available):
+**symbol_graph** - Find callers, callees, definitions, importers, subclasses, base classes (ALWAYS available):
 ```json
 {"symbol": "authenticate", "query_type": "callers", "limit": 10}
 {"symbol": "authenticate", "query_type": "callees", "limit": 10}
 {"symbol": "UserService", "query_type": "definition"}
 {"symbol": "utils", "query_type": "importers"}
+{"symbol": "BaseModel", "query_type": "subclasses"}
+{"symbol": "MyClass", "query_type": "base_classes"}
 ```
-Query types: `callers`, `callees`, `definition`, `importers`. Use `depth=2` for multi-hop. Falls back to semantic search if no graph hits. Results include ~500-char source snippets.
-
-**graph_query** (OPTIONAL -- only if NEO4J_GRAPH=1 or MEMGRAPH_GRAPH=1):
-Extra query types: `transitive_callers`, `transitive_callees`, `impact`, `dependencies`, `cycles`. If not in your tool list, use `symbol_graph` instead.
+Query types: `callers`, `callees`, `definition`, `importers`, `subclasses`, `base_classes`. Use `depth=2` for multi-hop. Falls back to semantic search if no graph hits. Results include ~500-char source snippets.
 
 **context_answer** - LLM-generated explanation with citations:
 ```json
 {"query": "How does the caching layer work?", "budget_tokens": 2000}
 ```
 
+**graph_query** - Advanced graph traversals and impact analysis (available to all SaaS users):
+```json
+{"symbol": "UserService", "query_type": "impact", "depth": 3}
+{"symbol": "auth_module", "query_type": "cycles"}
+{"symbol": "processPayment", "query_type": "transitive_callers", "depth": 2}
+```
+Query types: `callers`, `callees`, `transitive_callers`, `transitive_callees`, `impact`, `dependencies`, `definition`, `cycles`. Use `include_paths=true` for full traversal paths. Memgraph-backed; `symbol_graph` (Qdrant-backed) is always available as fallback.
+
+**batch_symbol_graph** - Run N independent symbol_graph queries in one call (~75% token savings):
+```json
+{
+  "queries": [
+    {"symbol": "authenticate", "query_type": "callers"},
+    {"symbol": "CacheManager", "query_type": "definition"},
+    {"symbol": "BaseModel", "query_type": "subclasses"}
+  ],
+  "limit": 10
+}
+```
+Returns `{ok, batch_results: [result_set_0, ...], count, elapsed_ms}`. Max 10 queries per batch. Each query must have a `symbol` key. Shared params (`collection`, `language`, `under`, `repo`, `limit`, `depth`) apply to all unless overridden per-query.
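
Reviewer sketch: because each batch is capped at 10 queries and every query needs a `symbol` key, a caller with a longer list has to validate and split it first. The helper below is a hypothetical illustration of that step (`chunk_symbol_queries` is not part of the documented tool set); it only builds payloads and does not call any tool.

```python
from typing import Any

MAX_BATCH = 10  # documented cap: max 10 queries per batch


def chunk_symbol_queries(queries: list[dict[str, Any]], **shared: Any) -> list[dict[str, Any]]:
    """Validate queries and split them into batch_symbol_graph payloads of <= 10."""
    for index, query in enumerate(queries):
        if not query.get("symbol"):
            raise ValueError(f"query {index} has no 'symbol' key: {query!r}")
    # One payload per slice of up to MAX_BATCH queries; shared params repeat on each.
    return [
        {"queries": queries[start:start + MAX_BATCH], **shared}
        for start in range(0, len(queries), MAX_BATCH)
    ]


# Illustrative input: 23 made-up symbol names, all asking for callers.
symbols = [f"handler_{n}" for n in range(23)]
payloads = chunk_symbol_queries(
    [{"symbol": name, "query_type": "callers"} for name in symbols],
    limit=10,
)
```

With 23 queries this produces three payloads (10, 10, and 3 queries), each carrying the shared `limit`.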
+
+**batch_graph_query** - Run N independent graph_query queries in one call (~75% token savings):
+```json
+{
+  "queries": [
+    {"symbol": "User", "query_type": "impact", "depth": 3},
+    {"symbol": "auth", "query_type": "cycles"},
+    {"symbol": "PaymentService", "query_type": "transitive_callers"}
+  ],
+  "limit": 15
+}
+```
+Returns `{ok, batch_results: [result_set_0, ...], count, elapsed_ms}`. Max 10 queries per batch. Shared params (`collection`, `repo`, `language`, `depth`, `limit`, `include_paths`) apply to all unless overridden per-query.
+
 **info_request** - Quick natural language lookup:
 ```json
 {"info_request": "how does user auth work", "include_explanation": true}
@@ -101,21 +155,34 @@ Extra query types: `transitive_callers`, `transitive_callees`, `impact`, `depend
 | `change_history_for_path` | File change summary | `{"path": "src/auth.py", "include_commits": true}` |
 | `pattern_search` | Similar code patterns (if enabled) | `{"query": "retry with backoff"}` |
 | `search_importers_for` | Find importers | `{"query": "utils/helpers"}` |
+| `graph_query` | Advanced graph traversals / impact analysis | `{"symbol": "User", "query_type": "impact", "depth": 3}` |
+| `batch_search` | N searches in one call | `{"searches": [{"query": "auth"}, {"query": "cache"}]}` |
+| `batch_symbol_graph` | N symbol_graph queries in one call | `{"queries": [{"symbol": "auth", "query_type": "callers"}, {"symbol": "Cache", "query_type": "definition"}]}` |
+| `batch_graph_query` | N graph_query queries in one call | `{"queries": [{"symbol": "User", "query_type": "impact"}, {"symbol": "auth", "query_type": "cycles"}]}` |
 
 ## Index Management
 
-- `qdrant_index_root` - Index workspace (run first!)
+> **SaaS mode:** Indexing is handled automatically by the VS Code extension upload service. `qdrant_index_root`, `qdrant_index`, and `qdrant_prune` are **not available** in SaaS. All search, symbol graph, memory, and session tools work normally.
+
+**Available in all modes:**
 - `qdrant_status` - Check index health
-- `qdrant_prune` - Remove deleted files
+- `qdrant_list` - List all collections
+- `set_session_defaults` - Set collection, output_format, compact, limit
+- `embedding_pipeline_stats` - Cache efficiency, bloom filter stats
+
+**Self-hosted only (not available in SaaS):**
+- `qdrant_index_root` - Index workspace
+- `qdrant_index` - Index subdirectory
+- `qdrant_prune` - Remove stale entries from deleted files
 
 ## Best Practices
 
 1. **ALWAYS start with `search`** - It is your PRIMARY tool. Auto-routes to the best specialized tool. Only fall back to specific tools when you need params `search` doesn't expose.
 2. **NEVER use grep/cat/find for code exploration** - Use MCP tools instead. Only acceptable use: confirming exact literal strings.
-3. **Start with `symbol_graph`** for all relationship queries - always available, no Neo4j needed
+3. **Start with `symbol_graph`** for relationship queries - always available. Use `graph_query` for advanced traversals: impact analysis, circular dependencies, transitive callers/callees (available to all SaaS users)
 4. **Use multi-query** for complex searches: pass 2-3 variations as a list
 5. **Two-phase search**: Discovery (`limit=3, compact=true`) → Deep dive (`limit=8, include_snippet=true`)
-6. **Fire parallel calls** - Multiple independent `search`, `repo_search`, `symbol_graph` in one message
+6. **Fire parallel calls** - Multiple independent `search`, `repo_search`, `symbol_graph` in one message. Or use batch tools (`batch_search`, `batch_symbol_graph`, `batch_graph_query`) to run N queries in a single invocation with ~75% token savings
 7. **Set session defaults early**: `set_session_defaults(output_format="toon", compact=true)`
 8. **Use TOON format** - `output_format: "toon"` for 60-80% token reduction on exploratory queries
 9. **Use `cross_repo_search`** for multi-repo scenarios instead of manual collection switching
@@ -125,8 +192,8 @@ Extra query types: `transitive_callers`, `transitive_callees`, `impact`, `depend
 
 - `context_answer` timeout → `search` + `info_request(include_explanation=true)`
 - `pattern_search` unavailable → `search` with structural query terms
-- `graph_query` unavailable → `symbol_graph` (always available)
-- grep/Read File → use `search`, `symbol_graph`, `info_request` instead
+- `graph_query` unavailable → `symbol_graph` (always available, Qdrant-backed)
+- grep/Read File → use `search`, `symbol_graph`, `info_request` instead
 
 ## Filters (for repo_search)
 
@@ -145,8 +212,8 @@ Don't discover at every session start. Trigger when: search returns no/irrelevan
 ```json
 // qdrant_list — discover available collections
 {}
-// collection_map - map repos to collections with sample files
-{"include_samples": true}
+// cross_repo_search - auto-discover and search across repos
+{"query": "your search", "discover": "always"}
 ```
 
 ### Context Switching (Session Defaults = `cd`)
@@ -196,7 +263,7 @@ Use `cross_repo_search` when you need breadth across repos. Use `repo_search` wi
 
 ### Anti-Patterns
 - DON'T search both repos with the same vague query
-- DON'T assume the default collection is correct — verify with `collection_map`
+- DON'T assume the default collection is correct — verify with `qdrant_list`
 - DO extract exact strings (routes, event names, types) as search anchors
 
 ## References

.codex/skills/context-engine/references/tool-reference.md

Lines changed: 82 additions & 6 deletions
@@ -48,7 +48,7 @@ AST-backed symbol relationship queries. Always available.
 | Parameter | Type | Description |
 |-----------|------|-------------|
 | `symbol` | string | Symbol to analyze |
-| `query_type` | string | "callers", "definition", "importers", "callees" |
+| `query_type` | string | "callers", "definition", "importers", "callees", "subclasses", "base_classes" |
 | `depth` | int | Traversal depth (1=direct, 2+=multi-hop) |
 | `limit` | int | Max results (default 20) |
 | `language` | string | Filter by language |
@@ -93,6 +93,75 @@ Structural code pattern matching. May not be enabled in all deployments.
 | `min_score` | float | Minimum similarity (default 0.3) |
 | `aroma_rerank` | bool | AROMA structural reranking |
 
+## graph_query
+
+Advanced Memgraph-backed graph traversals and impact analysis. Available to all SaaS users.
+
+| Parameter | Type | Description |
+|-----------|------|-------------|
+| `symbol` | string | Symbol to analyze |
+| `query_type` | string | "callers", "callees", "transitive_callers", "transitive_callees", "impact", "dependencies", "definition", "cycles" |
+| `depth` | int | Max traversal depth (default varies by query type) |
+| `limit` | int | Max results (default 20) |
+| `language` | string | Filter by language |
+| `under` | string | Path prefix filter |
+| `repo` | string | Repository filter |
+| `include_paths` | bool | Include full traversal paths in results |
+| `output_format` | string | "json" or "toon" |
+
+## batch_search
+
+Run N independent `repo_search` calls in one MCP invocation. ~75% token savings.
+
+| Parameter | Type | Description |
+|-----------|------|-------------|
+| `searches` | list[dict] | List of search specs (each with at least a `query` key) |
+| `collection` | string | Shared collection (overridable per-search) |
+| `limit` | int | Shared max results (overridable per-search) |
+| `language` | string | Shared language filter |
+| `under` | string | Shared path prefix filter |
+| `repo` | string/list | Shared repository filter |
+| `include_snippet` | bool | Shared snippet toggle |
+| `rerank_enabled` | bool | Shared reranking toggle |
+| `output_format` | string | "json" or "toon" |
+| `compact` | bool | Minimal response fields |
+
+**Returns:** `{ok, batch_results: [result_set_0, ...], count, elapsed_ms}`. Max 10 searches per batch.
+
+## batch_symbol_graph
+
+Run N independent `symbol_graph` queries in one MCP invocation. ~75% token savings.
+
+| Parameter | Type | Description |
+|-----------|------|-------------|
+| `queries` | list[dict] | List of query specs (each must have a `symbol` key) |
+| `collection` | string | Shared collection (overridable per-query) |
+| `language` | string | Shared language filter |
+| `under` | string | Shared path prefix filter |
+| `repo` | string | Shared repository filter |
+| `limit` | int | Shared max results |
+| `depth` | int | Shared traversal depth |
+| `output_format` | string | "json" or "toon" |
+
+**Returns:** `{ok, batch_results: [result_set_0, ...], count, elapsed_ms}`. Max 10 queries per batch.
+
+## batch_graph_query
+
+Run N independent `graph_query` calls in one MCP invocation. ~75% token savings.
+
+| Parameter | Type | Description |
+|-----------|------|-------------|
+| `queries` | list[dict] | List of query specs (each must have a `symbol` key) |
+| `collection` | string | Shared collection (overridable per-query) |
+| `repo` | string | Shared repository filter |
+| `language` | string | Shared language filter |
+| `depth` | int | Shared traversal depth |
+| `limit` | int | Shared max results |
+| `include_paths` | bool | Shared include traversal paths |
+| `output_format` | string | "json" or "toon" |
+
+**Returns:** `{ok, batch_results: [result_set_0, ...], count, elapsed_ms}`. Max 10 queries per batch.
+
 ## Memory Tools
 
 **memory_store**
@@ -118,9 +187,16 @@ Structural code pattern matching. May not be enabled in all deployments.
 
 ## Index Management
 
-**qdrant_index_root** - `{"recreate": true}` to drop existing data
-**qdrant_index** - `{"subdir": "src/"}` for partial index
-**qdrant_prune** - Remove stale entries
-**qdrant_status** - Check health
-**set_session_defaults** - Set collection, output_format, compact, limit
+> **SaaS mode:** Indexing is handled automatically by the VS Code extension upload service. `qdrant_index_root`, `qdrant_index`, and `qdrant_prune` are **not available** in SaaS. All search, symbol graph, memory, and session tools work normally.
+
+**Available in all modes:**
+- **qdrant_status** - Check health
+- **qdrant_list** - List all collections (alias for `qdrant_status(list_all=True)`)
+- **set_session_defaults** - Set collection, output_format, compact, limit
+- **embedding_pipeline_stats** - Cache efficiency, bloom filter stats, pipeline performance
+
+**Self-hosted only (not available in SaaS):**
+- **qdrant_index_root** - `{"recreate": true}` to drop existing data
+- **qdrant_index** - `{"subdir": "src/"}` for partial index
+- **qdrant_prune** - Remove stale entries
 