experiment proposal #1

# Semantic search & cross-referencing for coding agents

Models can write sophisticated code, but when they need to understand an existing codebase they’re reduced to scanning files and making educated guesses. Coding agents can generate code, yet they lack the semantic and structural intelligence developers rely on for safe, effective editing. In short, they have part of the capability but none of the diagnostic tooling.

Developers navigate large codebases using symbol resolution, call graphs, and type information. Can we give agents the same tooling?


### Experiment

> The proposal is to use progressive fallback for retrieval and analysis:
>
> [SCIP](https://github.com/sourcegraph/scip) → LSP → Tree-sitter → ctags → grep/rg.

* **SCIP**: semantic, cross-referenced, scales well for global analysis. Use as primary for symbol resolution and cross-references.
* **LSP**: authoritative for live editing and workspace-local semantics (types, hover info).
* **Tree-sitter**: reliable structural parse to build AST-based maps when LSP isn’t present.
* **ctags**: cheap symbol map for languages lacking LSP/tree-sitter coverage.
* **rg/grep**: last-resort for literal search, quick and scalable.

This can give coding agents the same semantic view developers use: resolved symbols, call graphs, and structural indexes. The idea is to combine semantic indexes (SCIP, embeddings) with live structural sources (LSP, tree-sitter, ctags) and a simple grep fallback, so that agents make relevant, precise, and safe edits while lookup latency stays at parity with keyword search.
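A minimal sketch of the progressive fallback described above. The backend functions here are hypothetical stand-ins (they do not call SCIP, an LSP server, tree-sitter, ctags, or rg); the only point is the ordering and the rule that an unavailable or empty backend hands over to the next one.

```python
from typing import Callable, List, Optional, Sequence, Tuple

# A backend takes a symbol name and returns "path:line" hits, or None when it
# cannot answer at all (no index, no language server, unsupported language).
Backend = Callable[[str], Optional[List[str]]]


def resolve(symbol: str, chain: Sequence[Tuple[str, Backend]]) -> Tuple[str, List[str]]:
    """Try each backend in order and return the first non-empty answer."""
    for name, backend in chain:
        try:
            hits = backend(symbol)
        except Exception:
            hits = None  # a failing backend is treated as unavailable
        if hits:
            return name, hits
    return "none", []


# Hypothetical stand-ins, for illustration only.
def scip_lookup(symbol: str) -> Optional[List[str]]:
    return None  # pretend the repo has no SCIP index yet


def lsp_lookup(symbol: str) -> Optional[List[str]]:
    raise RuntimeError("no language server running")


def tree_sitter_lookup(symbol: str) -> Optional[List[str]]:
    return ["src/app.py:12"] if symbol == "main" else []


def ctags_lookup(symbol: str) -> Optional[List[str]]:
    return []


def rg_lookup(symbol: str) -> Optional[List[str]]:
    return ["README.md:3"] if symbol else []


CHAIN = [
    ("scip", scip_lookup),
    ("lsp", lsp_lookup),
    ("tree-sitter", tree_sitter_lookup),
    ("ctags", ctags_lookup),
    ("rg", rg_lookup),
]

if __name__ == "__main__":
    print(resolve("main", CHAIN))
    print(resolve("helper", CHAIN))
```

One design choice worth testing in the experiment: this sketch falls through on an empty result as well as on failure, which favors recall. A stricter variant would stop at the first backend that is available, even if it returns nothing.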

### Check Metric
Overall agent performance, as measured by the Evaluation Criteria in this proposal, should be no worse than the current keyword search baseline.

### Tools Available

1. Repo Indexing

   * SCIP-style semantic index (symbols + references).
   * embeddings index for relevance and exploratory queries.
   * LSP snapshots for file-level semantics; tree-sitter/ctags for structural maps; rg for raw text matches.

2. Symbol & Call-Graph Store

   * Normalized symbol IDs, definition locations, signatures, inferred types, and cross-references.

3. Vector DB

   * Stores embeddings for doc/snippet/symbol contexts for relevance scoring.

4. Real-time state priority

   * For open files or live editing sessions, the current workspace state takes priority over indexed snapshots, either implicitly or through an explicit higher rank.
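As a rough illustration of items 2 and 4 together, here is one way a symbol and call-graph store could look, with records from live editor state shadowing indexed ones. All class, field, and symbol names are hypothetical, and the ID format (`module#name`) is an assumption for the example, not SCIP's symbol grammar.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass(frozen=True)
class Location:
    path: str
    line: int


@dataclass
class SymbolRecord:
    symbol_id: str                      # normalized ID (format assumed here)
    definition: Location
    signature: str = ""
    inferred_type: Optional[str] = None
    references: List[Location] = field(default_factory=list)
    callees: Set[str] = field(default_factory=set)  # outgoing call-graph edges


class SymbolStore:
    """Indexed records plus a live overlay that wins on lookup."""

    def __init__(self) -> None:
        self._indexed: Dict[str, SymbolRecord] = {}
        self._live: Dict[str, SymbolRecord] = {}

    def add_indexed(self, record: SymbolRecord) -> None:
        self._indexed[record.symbol_id] = record

    def set_live(self, record: SymbolRecord) -> None:
        self._live[record.symbol_id] = record

    def clear_live(self, symbol_id: str) -> None:
        self._live.pop(symbol_id, None)

    def lookup(self, symbol_id: str) -> Optional[SymbolRecord]:
        # Live (open-buffer) state takes priority over the indexed snapshot.
        if symbol_id in self._live:
            return self._live[symbol_id]
        return self._indexed.get(symbol_id)

    def find_references(self, symbol_id: str) -> List[Location]:
        record = self.lookup(symbol_id)
        return list(record.references) if record else []

    def callers_of(self, symbol_id: str) -> List[str]:
        # Reverse call-graph query over the merged (live over indexed) view.
        ids = set(self._indexed) | set(self._live)
        return sorted(i for i in ids if symbol_id in self.lookup(i).callees)


store = SymbolStore()
refs = [Location("app/main.py", 4)]
store.add_indexed(SymbolRecord("app.config#load", Location("app/config.py", 10),
                               "def load(path: str) -> dict", references=refs))
store.add_indexed(SymbolRecord("app.main#run", Location("app/main.py", 1),
                               "def run() -> None", callees={"app.config#load"}))
# An unsaved edit moved `load` down four lines; the overlay reflects that.
store.set_live(SymbolRecord("app.config#load", Location("app/config.py", 14),
                            "def load(path: str) -> dict", references=refs))
```

With this data, `store.lookup("app.config#load")` reports line 14 while the buffer is open and falls back to line 10 after `clear_live`.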



### Tool pros/cons matrix (from [Grok](https://grok.com/share/bGVnYWN5LWNvcHk%3D_49f3b595-a485-47b4-84ab-19eb7a2bb506))

| Mode                                 |       Speed |   Accuracy | Scalability | Best for                                            |
| ------------------------------------ | ----------: | ---------: | ----------: | --------------------------------------------------- |
| Fuzzy / Grep                         |        High | Low–Medium |        High | Quick literal searches in very large repos          |
| Semantic / Embeddings                |      Medium |       High |      Medium | Exploratory queries, relevance-focused retrieval    |
| Hybrid (SCIP+embeddings)             | Medium–High |       High |        High | Balanced agent workflows, both symbolic & semantic  |
| Agentic (with diagnostics)           |    Variable |       High |      Medium | Complex, adaptive tasks requiring inference & tests |
| LSP / Tree-Sitter                    |      Medium |       High |      Medium | Structural analysis & IDE-like integrations         |
| MCP-Enhanced / Enterprise connectors |      Medium |       High |        High | Modular enterprise data & private sources           |
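
For the embeddings row in the matrix and the Vector DB item under Tools Available, relevance scoring usually comes down to ranking stored vectors by similarity to a query vector. The sketch below assumes cosine similarity and uses tiny hand-written vectors; a real setup would get vectors from an embedding model and store them in a vector database, neither of which is chosen in this proposal.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: Sequence[float], index: Dict[str, Sequence[float]], k: int) -> List[Tuple[str, float]]:
    """Return the k entries most similar to the query, best first."""
    scored = [(key, cosine(query, vec)) for key, vec in index.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


# Toy 2-d "embeddings" keyed by symbol name (illustrative values only).
INDEX = {
    "parse_config": [1.0, 0.0],
    "render_page": [0.0, 1.0],
    "load_config": [0.9, 0.1],
}

if __name__ == "__main__":
    for name, score in top_k([1.0, 0.0], INDEX, k=2):
        print(name, round(score, 3))
```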

### What existing coding agents do

* **Cursor (in VS Code)**: hybrid, combining embedding-based context with workspace snippets.
* **Claude Code**: largely grep-based retrieval (fast literal hits).
* **GitHub Copilot (in VS Code)**: uses LSP + repo context for semantic suggestions in-editor.
* **gemini-cli**: grep-based with potential to upgrade.
* **codex-cli**: grep/hybrid via plugins.
* **Aider**: repo mapping with tree-sitter and ctags (structural approach).
* **Amp**: orchestrates a search subagent for retrieval.

### Evaluation Criteria

* **Functional parity** vs existing keyword search (baseline): match or exceed recall@10 and precision@3 for developer queries.
* **Latency**: p95 for lookups ≤ baseline. (Measure empirically.)
* **Safety**: % of automated edits that require human review (target decrease vs naive LLM edits).
* **Cost**: repo indexing compute cost.
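
The retrieval and latency criteria above can be computed with a few lines. This is a sketch under assumptions: each query has a known set of relevant results, and p95 uses the nearest-rank definition (other percentile definitions interpolate and give slightly different numbers).

```python
from typing import List, Sequence, Set


def recall_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the relevant items that appear in the top k results."""
    if not relevant:
        return 0.0
    return len(set(ranked[:k]) & relevant) / len(relevant)


def precision_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top k slots filled with relevant items."""
    if k <= 0:
        raise ValueError("k must be positive")
    return sum(1 for item in ranked[:k] if item in relevant) / k


def p95(samples: Sequence[float]) -> float:
    """Nearest-rank 95th percentile of a non-empty sample."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = (95 * len(ordered) + 99) // 100  # ceil(0.95 * n) in integer math
    return ordered[rank - 1]


if __name__ == "__main__":
    ranked = ["a", "b", "c", "d"]          # results returned for one query
    relevant = {"a", "c", "z"}             # ground truth for that query
    print(recall_at_k(ranked, relevant, 10))
    print(precision_at_k(ranked, relevant, 3))
    latencies_ms: List[float] = list(range(1, 21))
    print(p95(latencies_ms))
```

Averaging `recall_at_k(..., 10)` and `precision_at_k(..., 3)` over the same query set for both the baseline and the new retrieval stack gives the parity comparison.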


### Notes
* We can provide tools like `symbol_lookup`, `find_references`, and `semantic_search` via a CLI or MCP integration.
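
A possible shape for the CLI side, sketched with `argparse`. The program name `codeintel`, the argument names, and the JSON envelope are all assumptions for illustration; the handlers return empty results where a real implementation would query the index.

```python
import argparse
import json
from typing import List


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="codeintel")  # hypothetical name
    sub = parser.add_subparsers(dest="tool", required=True)

    lookup = sub.add_parser("symbol_lookup", help="find where a symbol is defined")
    lookup.add_argument("name")

    refs = sub.add_parser("find_references", help="list references to a symbol")
    refs.add_argument("symbol_id")

    search = sub.add_parser("semantic_search", help="embedding-based search")
    search.add_argument("query")
    search.add_argument("--top-k", type=int, default=10)
    return parser


def run(argv: List[str]) -> str:
    """Parse a tool call and return a JSON response string."""
    args = build_parser().parse_args(argv)
    payload = {key: value for key, value in vars(args).items() if key != "tool"}
    # Placeholder: a real handler would consult the index here.
    return json.dumps({"tool": args.tool, "args": payload, "results": []}, sort_keys=True)


if __name__ == "__main__":
    import sys
    print(run(sys.argv[1:]))
```

Keeping the output as JSON means the same handlers could later sit behind an MCP server without changing what the agent sees.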
