Commit bd340e5

refactor: change project dir
1 parent 00b7769 commit bd340e5

32 files changed

+87
-45
lines changed

.gitignore

Lines changed: 11 additions & 0 deletions
@@ -1 +1,12 @@
1+
# Python-generated files
2+
__pycache__/
3+
*.py[oc]
4+
build/
5+
dist/
6+
wheels/
7+
*.egg-info
8+
9+
# Virtual environments
10+
.venv
11+
112
references/
File renamed without changes.
File renamed without changes.

docs/DESIGN.md

Whitespace-only changes.
File renamed without changes.
Lines changed: 75 additions & 32 deletions
@@ -2,22 +2,26 @@
22

33
## Background
44

5-
Most LSP (Language Server Protocol) capabilities require a precise code `Position` or `Range`. For LLM Agents, providing exact line and column numbers is difficultAgents typically understand code based on semantics rather than precise character offsets.
5+
Most LSP (Language Server Protocol) capabilities require a precise code `Position` or `Range`. For LLM Agents, providing exact line and column numbers is difficult: Agents typically understand code based on semantics rather than precise character offsets.
66

77
### Problems with Traditional Approaches
88

99
**Option A: Direct Line/Column Specification**
10+
1011
```json
11-
{"line": 42, "character": 15}
12+
{ "line": 42, "character": 15 }
1213
```
14+
1315
- Difficult for Agents to accurately calculate column numbers.
1416
- Position becomes invalid after minor code changes.
1517
- Lacks semantic expressiveness.
1618

1719
**Option B: Symbol Path Only**
20+
1821
```json
19-
{"symbol_path": ["MyClass", "my_method"]}
22+
{ "symbol_path": ["MyClass", "my_method"] }
2023
```
24+
2125
- Can only locate symbol declarations.
2226
- Cannot locate specific positions inside a symbol.
2327
- Cannot handle non-symbol locations (e.g., string literals, comments).
@@ -39,11 +43,11 @@ Position = Scope (Narrowing down) + Find (Precise locating)
3943

4044
Narrows the search area from the entire file to a specific region:
4145

42-
| Scope Type | Description | Typical Scenario |
43-
|-----------|------|---------|
44-
| `SymbolScope` | Code range of a symbol | Locating inside a specific function/class |
45-
| `LineScope` | Line number or line range | Locating based on diagnostic information |
46-
| `None` | Entire file | Global search for a text pattern |
46+
| Scope Type | Description | Typical Scenario |
47+
| ------------- | ------------------------- | ----------------------------------------- |
48+
| `SymbolScope` | Code range of a symbol | Locating inside a specific function/class |
49+
| `LineScope` | Line number or line range | Locating based on diagnostic information |
50+
| `None` | Entire file | Global search for a text pattern |
4751

4852
### Stage 2: Find
4953

@@ -55,6 +59,7 @@ find = "self.<HERE>value = value"
5559
```
5660

5761
**`<HERE>` Marker Rules:**
62+
5863
- With `<HERE>`: Locates at the marker position.
5964
- Without `<HERE>`: Locates at the start of the matched text.
6065
- `find` is `None`: Uses the "natural position" of the Scope.
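
As a rough illustration of the marker rules above, the sketch below resolves a `find` pattern to a character offset using plain literal search. The function name `resolve_offset` is hypothetical, and the sketch deliberately ignores scopes and any whitespace tolerance; it only shows how the marker shifts the result from the start of the match to a point inside it.

```python
from typing import Optional


def resolve_offset(text: str, find: str, marker: str = "<HERE>") -> Optional[int]:
    """Return the offset in `text` that `find` points at, or None if absent.

    Hypothetical helper: literal matching only, first match wins.
    """
    # Split the pattern around the first marker occurrence (if any).
    before, sep, after = find.partition(marker)
    start = text.find(before + after)
    if start == -1:
        return None
    if sep:
        # With a marker: the position is where the marker sat in the pattern.
        return start + len(before)
    # Without a marker: the position is the start of the matched text.
    return start


source = "class A:\n    def set(self, value):\n        self.value = value\n"
marked = resolve_offset(source, "self.<HERE>value = value")
unmarked = resolve_offset(source, "self.value = value")
```

Here `marked` lands on the `v` of `value`, five characters past `unmarked`, which lands on the `s` of `self`.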
@@ -70,37 +75,75 @@ Locate(file_path="foo.py", find="x = <|>value", marker="<|>")
7075

7176
## Position Resolution Rules
7277

73-
| Scope | Find | Resolution Result |
74-
|-------|------|---------|
75-
| `SymbolScope` | `None` | Position of the symbol's declared name |
76-
| `SymbolScope` | With `<HERE>` | Marked position within the symbol body |
78+
| Scope | Find | Resolution Result |
79+
| ------------- | ---------------- | -------------------------------------------- |
80+
| `SymbolScope` | `None` | Position of the symbol's declared name |
81+
| `SymbolScope` | With `<HERE>` | Marked position within the symbol body |
7782
| `SymbolScope` | Without `<HERE>` | Start of matched text within the symbol body |
78-
| `LineScope` | `None` | First non-whitespace character of the line |
79-
| `LineScope` | With `<HERE>` | Marked position within the line |
80-
| `LineScope` | Without `<HERE>` | Start of matched text within the line |
81-
| `None` | With `<HERE>` | Global search, marked position |
82-
| `None` | Without `<HERE>` | Global search, start of matched text |
83-
| `None` | `None` | ❌ Invalid, validation failure |
83+
| `LineScope` | `None` | First non-whitespace character of the line |
84+
| `LineScope` | With `<HERE>` | Marked position within the line |
85+
| `LineScope` | Without `<HERE>` | Start of matched text within the line |
86+
| `None` | With `<HERE>` | Global search, marked position |
87+
| `None` | Without `<HERE>` | Global search, start of matched text |
88+
| `None` | `None` | ❌ Invalid, validation should fail |
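
The last row of the table can be enforced at construction time. The following is a minimal sketch using standard-library dataclasses, not the project's actual schema: the `line` field on `LineScope` and the `resolution_rule` method are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass(frozen=True)
class SymbolScope:
    symbol_path: List[str]


@dataclass(frozen=True)
class LineScope:
    line: int  # hypothetical field name


@dataclass(frozen=True)
class Locate:
    file_path: str
    scope: Optional[Union[SymbolScope, LineScope]] = None
    find: Optional[str] = None
    marker: str = "<HERE>"

    def __post_init__(self) -> None:
        # Scope=None with Find=None is the one invalid combination.
        if self.scope is None and self.find is None:
            raise ValueError("Locate needs a scope, a find pattern, or both")

    def resolution_rule(self) -> str:
        """Name which family of table rows applies (hypothetical helper)."""
        if self.find is None:
            return "natural position of the scope"
        if self.marker in self.find:
            return "marked position"
        return "start of matched text"
```

For example, `Locate(file_path="mod.py", scope=SymbolScope(symbol_path=["MyClass"]))` passes validation, while `Locate(file_path="mod.py")` raises `ValueError`.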
89+
90+
## Whitespace Handling
91+
92+
To balance flexibility and precision, the matching engine uses a **token-aware** whitespace strategy rather than exact string matching or full fuzzy matching.
93+
94+
### Tokenization Strategy
95+
96+
The search pattern is first tokenized into identifiers, operators, and explicit whitespace. The matching then follows these rules:
97+
98+
1. **Identifiers remain atomic**: Spaces are never allowed within an identifier (e.g., `int` will not match `i n t`).
99+
2. **Flexible operator spacing**: Zero or more whitespace characters (`\s*`) are allowed between identifiers and operators, or between operators.
100+
3. **Mandatory explicit whitespace**: If the search pattern contains explicit whitespace, the source must contain at least one whitespace character (`\s+`) at that position.
101+
102+
### Behavior Examples
103+
104+
| Input | Matching Logic | Matches | Rejects |
105+
| ----------- | ------------------------------------------ | ------------------------- | --------- |
106+
| `int a` | Requires space between tokens | `int a`, `int  a` | `inta` |
107+
| `a+b` | Allows flexible spacing around operators | `a+b`, `a + b` | `ab` |
108+
| `foo.bar` | Allows flexible spacing around dot | `foo.bar`, `foo . bar` | `foobar` |
109+
| `foo(x, y)` | Allows flexible spacing; preserves comma | `foo(x, y)`, `foo( x,y )` | `foo(xy)` |
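
One way to realize these rules is to compile the search pattern into a regular expression. The sketch below is an assumption about how that could look, not the committed implementation; `pattern_to_regex` is a hypothetical name. Note one interpretive choice: the table lists `foo( x,y )` as a match for `foo(x, y)`, so the sketch treats explicit whitespace as mandatory (`\s+`) only when it separates two identifier tokens, and as flexible (`\s*`) when it sits next to an operator.

```python
import re

# Tokens: a run of whitespace, a run of word characters, or one operator char.
_TOKEN = re.compile(r"\s+|\w+|[^\w\s]")
_WORD = re.compile(r"\w+")


def pattern_to_regex(pattern: str) -> "re.Pattern[str]":
    """Compile a literal search pattern into a whitespace-tolerant regex."""
    parts = []
    prev = None   # previous non-whitespace token
    gap = False   # explicit whitespace seen since `prev`
    for tok in _TOKEN.findall(pattern):
        if tok.isspace():
            gap = True
            continue
        if prev is not None:
            both_words = bool(_WORD.fullmatch(prev) and _WORD.fullmatch(tok))
            # Assumption: whitespace is mandatory only between two identifiers.
            parts.append(r"\s+" if gap and both_words else r"\s*")
        # Each token is escaped whole, so identifiers stay atomic.
        parts.append(re.escape(tok))
        prev, gap = tok, False
    return re.compile("".join(parts))


rx = pattern_to_regex("foo(x, y)")
hit = rx.search("result = foo( x,y )")
```

Under these assumptions, `pattern_to_regex("int a")` accepts `int  a` but not `inta`, and `pattern_to_regex("a+b")` accepts `a + b` but not `ab`.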
110+
111+
### Empty Find Pattern
112+
113+
When the `find` pattern is empty or whitespace-only apart from the marker:
114+
- If the segments before and after the marker are both empty, the result is offset 0.
115+
- Otherwise, the pattern is treated as mandatory whitespace (at least one whitespace character must match).
116+
117+
### Design Rationale
118+
119+
#### Why Not Exact String Matching?
120+
Code formatting varies across teams and tools. Exact matching would break on variations in indentation (spaces vs tabs), spacing around operators, or line continuation differences.
121+
122+
#### Why Not Full Fuzzy Matching?
123+
Overly permissive matching creates ambiguity. For example, `int a` matching `inta` changes semantic meaning, and cross-line matches can accidentally hit unintended code structures.
124+
125+
#### Why Token-Based?
126+
Token boundaries align with programming language semantics. Token-based matching preserves identifier integrity while allowing natural variation in operator spacing, matching the developer's mental model of "what should match".
84127

85128
## LSP Capability Mapping
86129

87130
### Capabilities Requiring Position
88131

89-
| LSP Capability | Positioning Need | Locate Usage |
90-
|---------|---------|------------|
91-
| `textDocument/definition` | Identifier position | `SymbolScope` or `find="<HERE>identifier"` |
92-
| `textDocument/references` | Symbol declaration position | `SymbolScope(symbol_path=[...])` |
93-
| `textDocument/rename` | Symbol declaration position | `SymbolScope(symbol_path=[...])` |
94-
| `textDocument/hover` | Any identifier | `find="<HERE>target"` |
95-
| `textDocument/completion` | Trigger point | `find="obj.<HERE>"` |
96-
| `textDocument/signatureHelp` | Inside parentheses | `find="func(<HERE>"` |
132+
| LSP Capability | Positioning Need | Locate Usage |
133+
| ---------------------------- | --------------------------- | ------------------------------------------ |
134+
| `textDocument/definition` | Identifier position | `SymbolScope` or `find="<HERE>identifier"` |
135+
| `textDocument/references` | Symbol declaration position | `SymbolScope(symbol_path=[...])` |
136+
| `textDocument/rename` | Symbol declaration position | `SymbolScope(symbol_path=[...])` |
137+
| `textDocument/hover` | Any identifier | `find="<HERE>target"` |
138+
| `textDocument/completion` | Trigger point | `find="obj.<HERE>"` |
139+
| `textDocument/signatureHelp` | Inside parentheses | `find="func(<HERE>"` |
97140

98141
### Capabilities Requiring Range
99142

100-
| LSP Capability | Positioning Need | LocateRange Usage |
101-
|---------|---------|-----------------|
102-
| `textDocument/codeAction` | Selected code range | `SymbolScope` or `find="selected text"` |
103-
| `textDocument/formatting` | Formatting range | `SymbolScope` to select the entire symbol |
143+
| LSP Capability | Positioning Need | LocateRange Usage |
144+
| ------------------------- | ------------------- | ----------------------------------------- |
145+
| `textDocument/codeAction` | Selected code range | `SymbolScope` or `find="selected text"` |
146+
| `textDocument/formatting` | Formatting range | `SymbolScope` to select the entire symbol |
104147

105148
## Usage Examples
106149

@@ -193,6 +236,7 @@ Locate(file_path="mod.py", scope=SymbolScope(symbol_path=["MyClass"]))
193236
### Q4: Why a separate LocateRange?
194237

195238
`Position` and `Range` are distinct concepts:
239+
196240
- `Position`: A single point, used for hover, definition, completion.
197241
- `Range`: An interval, used for codeAction, formatting.
198242

@@ -211,8 +255,7 @@ This keeps the marker-based positioning intuitive without requiring the Agent to
211255

212256
### Q7: Text matching rules in Find?
213257

214-
- Matching is **literal**, not regular expression.
215-
- Matching **ignores whitespace differences** (to be confirmed in implementation).
258+
- Matching is **literal** (not regex), but with intelligent whitespace handling (see [Whitespace Handling](#whitespace-handling)).
216259
- Returns the **first match** within the Scope.
217260
- If multiple matches exist, the Agent can provide more context in `find` to disambiguate.
218261

File renamed without changes.

python/pyproject.toml

Lines changed: 1 addition & 1 deletion
@@ -20,7 +20,7 @@ requires = ["uv_build>=0.9.9,<0.10.0"]
2020
build-backend = "uv_build"
2121

2222
[tool.uv.sources]
23-
lsap-schema = { git = "https://github.com/lsp-client/LSAP.git", subdirectory="schema" }
23+
lsap-schema = { git = "https://github.com/lsp-client/LSAP.git" }
2424
lsp-client = { git = "https://github.com/lsp-client/python-sdk.git" }
2525

2626
[dependency-groups]

schema/.gitignore

Lines changed: 0 additions & 10 deletions
This file was deleted.

schema/README.md

Whitespace-only changes.
