Commit 3820b11

Copilot and observerw authored
Redesign locate API: auto-marker detection and string syntax (#13)
* Initial plan
* Implement auto-marker detection and locate string syntax
* Update documentation for new locate API design
* Apply linting and formatting to locate API changes
* Add comprehensive integration tests for locate API
* Address code review feedback
* Support simplified scope syntax: line numbers without prefix and comma separator
* Remove L prefix syntax for line scopes
* Move implementation logic out of lsap_schema to utils module

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: observerw <20661574+observerw@users.noreply.github.com>
1 parent a1cf355 commit 3820b11

8 files changed: +650 -69 lines changed

README.md

Lines changed: 1 addition & 1 deletion
@@ -57,7 +57,7 @@ The Agent only needs to issue a high-level command without worrying about underl
5757
{
5858
"locate": {
5959
"file_path": "src/utils.py",
60-
"find": "def format_date<HERE>", // Semantic Anchor
60+
"find": "def format_date<|>", // Semantic Anchor
6161
},
6262
"mode": "references",
6363
"max_items": 10,

docs/locate_design.md

Lines changed: 125 additions & 40 deletions
@@ -54,38 +54,77 @@ Narrows the search area from the entire file to a specific region:
5454
Precise location within the Scope using a text pattern:
5555

5656
```python
57-
find = "self.<HERE>value = value"
58-
# ^^^^^^ <HERE> marks the exact position
57+
find = "self.<|>value = value"
58+
# ^^^ <|> marks the exact position
5959
```
6060

61-
**`<HERE>` Marker Rules:**
61+
**Automatic Marker Detection Rules:**
6262

63-
- With `<HERE>`: Locates at the marker position.
64-
- Without `<HERE>`: Locates at the start of the matched text.
65-
- `find` is `None`: Uses the "natural position" of the Scope.
66-
- Custom marker: Use `marker` field when source contains literal `<HERE>`.
63+
- Markers use nested bracket notation: `<|>`, `<<|>>`, `<<<|>>>`, etc.
64+
- The system automatically detects the marker with the deepest nesting level that appears exactly once
65+
- With marker: Locates at the marker position
66+
- Without marker: Locates at the start of the matched text
67+
- `find` is `None`: Uses the "natural position" of the Scope
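
The detection rule above can be illustrated with a minimal sketch. It is not the code from `lsap.utils.locate`: the function name `detect_marker_sketch`, the helper `make_marker`, and the `(marker, index)` return shape are assumptions made for illustration. The only things taken from the docs are the nested bracket notation, the limit of 10 levels, and the "deepest level that appears exactly once" rule.

```python
from __future__ import annotations

MAX_NESTING = 10  # the docs state that up to 10 nesting levels are supported


def make_marker(level: int) -> str:
    """Build the marker for a nesting level, e.g. level 2 -> '<<|>>'."""
    return "<" * level + "|" + ">" * level


def detect_marker_sketch(find: str) -> tuple[str, int] | None:
    """Return (marker, index) for the deepest marker occurring exactly once.

    Hypothetical helper: plain substring counting is used, so a shallow
    marker is also counted inside every deeper marker that contains it.
    """
    for level in range(MAX_NESTING, 0, -1):  # try the deepest level first
        marker = make_marker(level)
        if find.count(marker) == 1:
            return marker, find.index(marker)
    return None  # no unique marker: the caller falls back to match start
```

Scanning from the deepest level down means the first unique hit is the one the rule asks for, and a `find` string without any marker yields `None`, which corresponds to the "Without marker" case.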
6768

6869
```python
69-
# Default marker
70-
Locate(file_path="foo.py", find="self.<HERE>value")
70+
# Basic marker
71+
Locate(file_path="foo.py", find="self.<|>value")
7172

72-
# Custom marker when source contains "<HERE>"
73-
Locate(file_path="foo.py", find="x = <|>value", marker="<|>")
73+
# When <|> appears multiple times, use deeper nesting
74+
Locate(file_path="foo.py", find="x = <|> + y <<|>> z")
75+
# Will automatically use <<|>> as the position marker
76+
77+
# String syntax for concise location specification
78+
locate_str = "foo.py:MyClass.my_method@self.<|>"
79+
locate = parse_locate_string(locate_str)
80+
```
81+
82+
## String Syntax
83+
84+
A concise string syntax is provided for easy location specification:
85+
86+
**Format:** `<file_path>:<scope>@<find>`
87+
88+
**Scope formats:**
89+
- `<line>` - Single line number (e.g., `42`)
90+
- `<start>,<end>` - Line range with comma (e.g., `10,20`)
91+
- `<start>-<end>` - Line range with dash (e.g., `10-20`)
92+
- `<symbol_path>` - Symbol path with dots (e.g., `MyClass.my_method`)
93+
94+
**Examples:**
95+
```python
96+
# File with find pattern only
97+
"foo.py@self.<|>"
98+
99+
# Line scope with find
100+
"foo.py:42@return <|>result"
101+
102+
# Line range scope (comma or dash)
103+
"foo.py:10,20@if <|>condition"
104+
"foo.py:10-20@if <|>condition"
105+
106+
# Symbol scope with find
107+
"foo.py:MyClass.my_method@self.<|>"
108+
109+
# Symbol scope only (for declaration position)
110+
"foo.py:MyClass"
111+
```
112+
"foo.py:L10-20@if <|>condition"
74113
```
75114
76115
## Position Resolution Rules
77116
78-
| Scope | Find | Resolution Result |
79-
| ------------- | ---------------- | -------------------------------------------- |
80-
| `SymbolScope` | `None` | Position of the symbol's declared name |
81-
| `SymbolScope` | With `<HERE>` | Marked position within the symbol body |
82-
| `SymbolScope` | Without `<HERE>` | Start of matched text within the symbol body |
83-
| `LineScope` | `None` | First non-whitespace character of the line |
84-
| `LineScope` | With `<HERE>` | Marked position within the line |
85-
| `LineScope` | Without `<HERE>` | Start of matched text within the line |
86-
| `None` | With `<HERE>` | Global search, marked position |
87-
| `None` | Without `<HERE>` | Global search, start of matched text |
88-
| `None` | `None` | ❌ Invalid, validation should failure |
117+
| Scope | Find | Resolution Result |
118+
| ------------- | ------------- | -------------------------------------------- |
119+
| `SymbolScope` | `None` | Position of the symbol's declared name |
120+
| `SymbolScope` | With marker | Marked position within the symbol body |
121+
| `SymbolScope` | Without marker| Start of matched text within the symbol body |
122+
| `LineScope` | `None` | First non-whitespace character of the line |
123+
| `LineScope` | With marker | Marked position within the line |
124+
| `LineScope` | Without marker| Start of matched text within the line |
125+
| `None` | With marker | Global search, marked position |
126+
| `None` | Without marker| Global search, start of matched text |
127+
| `None` | `None` | ❌ Invalid, validation should fail |
89128
90129
## Whitespace Handling
91130
@@ -129,14 +168,14 @@ Token boundaries align with programming language semantics. It preserves identif
129168
130169
### Capabilities Requiring Position
131170
132-
| LSP Capability | Positioning Need | Locate Usage |
133-
| ---------------------------- | --------------------------- | ------------------------------------------ |
134-
| `textDocument/definition` | Identifier position | `SymbolScope` or `find="<HERE>identifier"` |
135-
| `textDocument/references` | Symbol declaration position | `SymbolScope(symbol_path=[...])` |
136-
| `textDocument/rename` | Symbol declaration position | `SymbolScope(symbol_path=[...])` |
137-
| `textDocument/hover` | Any identifier | `find="<HERE>target"` |
138-
| `textDocument/completion` | Trigger point | `find="obj.<HERE>"` |
139-
| `textDocument/signatureHelp` | Inside parentheses | `find="func(<HERE>"` |
171+
| LSP Capability | Positioning Need | Locate Usage |
172+
| ---------------------------- | --------------------------- | ----------------------------------- |
173+
| `textDocument/definition` | Identifier position | `find="<|>identifier"` |
174+
| `textDocument/references` | Symbol declaration position | `SymbolScope(symbol_path=[...])` |
175+
| `textDocument/rename` | Symbol declaration position | `SymbolScope(symbol_path=[...])` |
176+
| `textDocument/hover` | Any identifier | `find="<|>target"` |
177+
| `textDocument/completion` | Trigger point | `find="obj.<|>"` |
178+
| `textDocument/signatureHelp` | Inside parentheses | `find="func(<|>"` |
140179
141180
### Capabilities Requiring Range
142181
@@ -166,8 +205,11 @@ The resolver treats `SymbolScope` as the declaration position of the class name
166205
Locate(
167206
file_path="utils.py",
168207
scope=SymbolScope(symbol_path=["process"]),
169-
find="return <HERE>result"
208+
find="return <|>result"
170209
)
210+
211+
# Or using string syntax:
212+
parse_locate_string("utils.py:process@return <|>result")
171213
```
172214

173215
First narrows down to the `process` function body, then searches for `return result` within it, positioning at the start of `result`.
@@ -178,8 +220,11 @@ First narrows down to the `process` function body, then searches for `return res
178220
# Locate the position after 'self.'
179221
Locate(
180222
file_path="service.py",
181-
find="self.<HERE>"
223+
find="self.<|>"
182224
)
225+
226+
# Or using string syntax:
227+
parse_locate_string("service.py@self.<|>")
183228
```
184229

185230
Global search for `self.`, positioning right after the dot to trigger member completion.
@@ -193,6 +238,9 @@ LocateRange(
193238
scope=SymbolScope(symbol_path=["handle_request"])
194239
)
195240

241+
# Or using string syntax for SymbolScope:
242+
parse_locate_string("handlers.py:handle_request")
243+
196244
# Or select a specific code snippet
197245
LocateRange(
198246
file_path="handlers.py",
@@ -211,7 +259,7 @@ In some cases, the Agent knows the text pattern to search for but isn't sure whi
211259
Locate(file_path="main.py", find="TODO: <HERE>fix this")
212260
```
213261

214-
### Q2: Why is `<HERE>` optional?
262+
### Q2: Why is the marker optional?
215263

216264
Often, locating at the start of the matched text is sufficient:
217265

@@ -220,7 +268,7 @@ Often, locating at the start of the matched text is sufficient:
220268
Locate(file_path="old.py", find="deprecated_func(")
221269
```
222270

223-
Forcing `<HERE>` everywhere would increase the cognitive load on the Agent.
271+
Forcing a marker everywhere would increase the cognitive load on the Agent.
224272

225273
### Q3: Why does SymbolScope without Find locate the declaration?
226274

@@ -242,18 +290,55 @@ Locate(file_path="mod.py", scope=SymbolScope(symbol_path=["MyClass"]))
242290

243291
While a `Range` can be constructed from two `Position`s, they represent different semantic operations. Modeling them separately is clearer.
244292

245-
### Q6: What if source code contains literal `<HERE>`?
293+
### Q5: Why automatic marker detection with nested brackets?
294+
295+
The automatic marker detection using nested brackets (`<|>`, `<<|>>`, etc.) provides:
296+
297+
1. **No configuration needed**: Agents don't need to specify a custom marker field
298+
2. **Conflict resolution**: If the code contains `<|>`, agents can simply use `<<|>>`
299+
3. **Clear hierarchy**: The nesting levels make it obvious which marker is intended
300+
4. **Simple rule**: "Use the deepest unique marker" is easy to understand
301+
302+
```python
303+
# When <|> appears in the source code
304+
Locate(file_path="parser.py", find="token = <|> HERE <<|>> value")
305+
# Automatically uses <<|>> as the position marker
306+
```
307+
308+
### Q6: Why add string syntax?
309+
310+
The string syntax `<file_path>:<scope>@<find>` provides a concise format that:
311+
312+
1. **Reduces verbosity**: Agents can generate shorter location strings
313+
2. **Human-readable**: Easy to read and understand at a glance
314+
3. **Copy-paste friendly**: Can be easily shared in logs or documentation
315+
4. **Optional but convenient**: The full object API is still available
316+
317+
```python
318+
# Compact string format
319+
"foo.py:MyClass.method@return <|>result"
320+
321+
# Equivalent to:
322+
Locate(
323+
file_path="foo.py",
324+
scope=SymbolScope(symbol_path=["MyClass", "method"]),
325+
find="return <|>result"
326+
)
327+
```
328+
329+
### Q7: What if I need multiple markers in the same text?
246330

247-
The `marker` field allows customization:
331+
The automatic marker detection supports up to 10 nesting levels. If you need to specify a position in text that contains multiple markers, use a deeper nesting level:
248332

249333
```python
250-
# Source contains "<HERE>" as actual code
251-
Locate(file_path="parser.py", find="token = <|>HERE", marker="<|>")
334+
# If your code contains both <|> and <<|>>
335+
Locate(file_path="parser.py", find="a <|> b <<|>> c <<<|>>> d")
336+
# Use <<<|>>> as the position marker
252337
```
253338

254-
This keeps the marker-based positioning intuitive without requiring the Agent to calculate offsets.
339+
The marker used as the position marker is the one with the deepest nesting level that appears exactly once in the find text.
255340

256-
### Q7: Text matching rules in Find?
341+
### Q8: Text matching rules in Find?
257342

258343
- Matching is **literal** (not regex), but with intelligent whitespace handling (see [Whitespace Handling](#whitespace-handling)).
259344
- Returns the **first match** within the Scope.

docs/schemas/locate.md

Lines changed: 61 additions & 12 deletions
@@ -9,18 +9,46 @@ The `Locate` model uses a two-stage approach: **scope** (optional) → **find**
99
### Resolution Rules
1010

1111
1. **SymbolScope without find**: Returns the symbol declaration position (for references, rename)
12-
2. **With find containing marker**: Returns the marker position
12+
2. **With find containing marker**: Returns the marker position (auto-detected using nested brackets)
1313
3. **With find only**: Returns the start of matched text
1414
4. **No scope + find**: Searches the entire file
1515

16+
### Automatic Marker Detection
17+
18+
Markers are automatically detected using nested bracket notation:
19+
- `<|>` - Single level (default)
20+
- `<<|>>` - Double level (if `<|>` appears multiple times)
21+
- `<<<|>>>` - Triple level (if `<<|>>` appears multiple times)
22+
- ... up to 10 nesting levels
23+
24+
The system automatically selects the marker with the deepest nesting level that appears exactly once in the find text.
25+
26+
### String Syntax
27+
28+
A concise string syntax is available: `<file_path>:<scope>@<find>`
29+
30+
**Scope formats:**
31+
- `<line>` - Single line number (e.g., `42`)
32+
- `<start>,<end>` - Line range with comma (e.g., `10,20`)
33+
- `<start>-<end>` - Line range with dash (e.g., `10-20`)
34+
- `<symbol_path>` - Symbol path with dots (e.g., `MyClass.my_method`)
35+
36+
**Examples:**
37+
```
38+
foo.py@self.<|>
39+
foo.py:42@return <|>result
40+
foo.py:10,20@if <|>condition
41+
foo.py:MyClass.my_method@self.<|>
42+
foo.py:MyClass
43+
```
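
To make the string syntax concrete, here is a rough sketch of how such a string could be split into its parts. It is not the project's `parse_locate_string`: the `ParsedLocate` dataclass, the function name `parse_locate_sketch`, and the decision to split on the first `@` and the first `:` are assumptions for illustration, so file paths containing `:` or `@` are not handled.

```python
from __future__ import annotations

import re
from dataclasses import dataclass


@dataclass
class ParsedLocate:
    """Hypothetical container; the real API returns a Locate model."""

    file_path: str
    lines: tuple[int, int] | None = None  # (start, end) for a line scope
    symbol_path: list[str] | None = None  # e.g. ["MyClass", "my_method"]
    find: str | None = None


# A single line number, or a range written with a comma or a dash.
_LINE_SCOPE = re.compile(r"^(\d+)(?:[,-](\d+))?$")


def parse_locate_sketch(text: str) -> ParsedLocate:
    # Everything after the first "@" is the find pattern (may contain ":").
    head, sep, find = text.partition("@")
    # The part before "@" is "<file_path>" or "<file_path>:<scope>".
    file_path, _, scope = head.partition(":")
    parsed = ParsedLocate(file_path=file_path, find=find if sep else None)
    if scope:
        match = _LINE_SCOPE.match(scope)
        if match:
            start = int(match.group(1))
            end = int(match.group(2)) if match.group(2) else start
            parsed.lines = (start, end)
        else:
            parsed.symbol_path = scope.split(".")
    return parsed
```

Splitting on `@` before `:` keeps colons inside the find pattern intact, and a scope made only of digits (optionally with `,` or `-`) is treated as a line scope, everything else as a dotted symbol path.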
44+
1645
### Locate Fields
1746

18-
| Field | Type | Default | Description |
19-
| :---------- | :------------------------------------- | :--------- | :--------------------------------------------------------------------- |
20-
| `file_path` | `string` | Required | Path to search in. |
21-
| `scope` | `LineScope` \| `SymbolScope` \| `null` | `null` | Optional: narrow search to symbol body or line range. |
22-
| `find` | `string` \| `null` | `null` | Text pattern with marker for exact position. |
23-
| `marker` | `string` | `"<HERE>"` | Position marker in find pattern. Change if source contains `"<HERE>"`. |
47+
| Field | Type | Default | Description |
48+
| :---------- | :------------------------------------- | :------- | :---------------------------------------------------- |
49+
| `file_path` | `string` | Required | Path to search in. |
50+
| `scope` | `LineScope` \| `SymbolScope` \| `null` | `null` | Optional: narrow search to symbol body or line range. |
51+
| `find` | `string` \| `null` | `null` | Text pattern with auto-detected marker. |
2452

2553
### LineScope
2654

@@ -94,29 +122,45 @@ For selecting a range of text instead of a point.
94122
}
95123
```
96124

125+
Or using string syntax:
126+
```
127+
"foo.py:MyClass"
128+
```
129+
97130
### Scenario 2: Completion trigger point (with marker)
98131

99132
```json
100133
{
101134
"locate": {
102135
"file_path": "foo.py",
103-
"find": "self.<HERE>"
136+
"find": "self.<|>"
104137
}
105138
}
106139
```
107140

108-
### Scenario 3: Using custom marker (when source contains "<HERE>")
141+
Or using string syntax:
142+
```
143+
"foo.py@self.<|>"
144+
```
145+
146+
### Scenario 3: Nested marker when <|> exists in source
109147

110148
```json
111149
{
112150
"locate": {
113151
"file_path": "foo.py",
114-
"find": "x = <|>value",
115-
"marker": "<|>"
152+
"find": "x = <|> + y <<|>> z"
116153
}
117154
}
118155
```
119156

157+
The system automatically detects `<<|>>` as the unique position marker.
158+
159+
Or using string syntax:
160+
```
161+
"foo.py@x = <|> + y <<|>> z"
162+
```
163+
120164
### Scenario 4: Finding a location within a specific symbol
121165

122166
```json
@@ -126,11 +170,16 @@ For selecting a range of text instead of a point.
126170
"scope": {
127171
"symbol_path": ["process"]
128172
},
129-
"find": "return <HERE>result"
173+
"find": "return <|>result"
130174
}
131175
}
132176
```
133177

178+
Or using string syntax:
179+
```
180+
"foo.py:process@return <|>result"
181+
```
182+
134183
#### Markdown Rendered for LLM
135184

136185
```markdown

python/src/lsap/locate.py

Lines changed: 7 additions & 2 deletions
@@ -19,6 +19,7 @@
1919

2020
from lsap.exception import NotFoundError
2121
from lsap.utils.document import DocumentReader
22+
from lsap.utils.locate import detect_marker
2223
from lsap.utils.symbol import iter_symbols
2324

2425
from .abc import Capability
@@ -106,8 +107,12 @@ async def __call__(self, req: LocateRequest) -> LocateResponse | None:
106107
pos: LSPPosition | None = None
107108

108109
if locate.find:
109-
if locate.marker in locate.find:
110-
before, _, after = locate.find.partition(locate.marker)
110+
# Auto-detect marker in the find text
111+
marker_info = detect_marker(locate.find)
112+
113+
if marker_info:
114+
marker, _, _ = marker_info
115+
before, _, after = locate.find.partition(marker)
111116
re_before, re_after = _to_regex(before), _to_regex(after)
112117

113118
if not re_before and not re_after:
