Commit 8303ed1

Merge remote-tracking branch 'origin/main' into feat/telemetry
2 parents e5039c8 + b7411e2 commit 8303ed1

72 files changed

+10458
-1316
lines changed

Lines changed: 45 additions & 0 deletions
@@ -0,0 +1,45 @@
1+
# Unity NL Editing Suite — Natural Mode
2+
3+
You are running inside CI for the **unity-mcp** repository. Your task is to demonstrate end‑to‑end **natural‑language code editing** on a representative Unity C# script using whatever capabilities and servers are already available in this session. Work autonomously. Do not ask the user for input. Do NOT spawn subagents; they would not have access to the MCP server process attached to the top-level agent.
4+
5+
## Mission
6+
1) **Discover capabilities.** Quietly inspect the tools and any connected servers that are available to you at session start. If the server offers a primer or capabilities resource, read it before acting.
7+
2) **Choose a target file.** Prefer `TestProjects/UnityMCPTests/Assets/Scripts/LongUnityScriptClaudeTest.cs` if it exists; otherwise choose a simple, safe C# script under `TestProjects/UnityMCPTests/Assets/`.
8+
3) **Perform a small set of realistic edits** using minimal, precise changes (not full-file rewrites). Examples of small edits you may choose from (pick 3–6 total):
9+
- Insert a new, small helper method (e.g., a logger or counter) in a sensible location.
10+
- Add a short anchor comment near a key method (e.g., above `Update()`), then add or modify a few lines nearby.
11+
- Append an end‑of‑class utility method (e.g., formatting or clamping helper).
12+
- Make a safe, localized tweak to an existing method body (e.g., add a guard or a simple accumulator).
13+
- Optionally include one idempotency/no‑op check (re‑apply an edit and confirm nothing breaks).
14+
4) **Validate your edits.** Re‑read the modified regions and verify the changes exist, compile‑risk is low, and surrounding structure remains intact.
15+
5) **Report results.** Produce both:
16+
- A JUnit XML at `reports/junit-nl-suite.xml` containing a single suite named `UnityMCP.NL` with one test case per sub‑test you executed (mark pass/fail and include helpful failure text).
17+
- A summary markdown at `reports/junit-nl-suite.md` that explains what you attempted, what succeeded/failed, and any follow‑ups you would try.
18+
6) **Be gentle and reversible.** Prefer targeted, minimal edits; avoid wide refactors or non‑deterministic changes.
19+
20+
## Assumptions & Hints (non‑prescriptive)
21+
- A Unity‑oriented MCP server is expected to be connected. If a server‑provided **primer/capabilities** resource exists, read it first. If no primer is available, infer capabilities from your visible tools in the session.
22+
- In CI/headless mode, when calling `mcp__unity__list_resources` or `mcp__unity__read_resource`, include:
23+
- `ctx: {}`
24+
- `project_root: "TestProjects/UnityMCPTests"` (the server will also accept the absolute path passed via env)
25+
Example: `{ "ctx": {}, "under": "Assets/Scripts", "pattern": "*.cs", "project_root": "TestProjects/UnityMCPTests" }`
26+
- If the preferred file isn’t present, locate a fallback C# file with simple, local methods you can edit safely.
27+
- If a compile command is available in this environment, you may optionally trigger it; if not, rely on structural checks and localized validation.
28+
29+
## Output Requirements (match NL suite conventions)
30+
- JUnit XML at `$JUNIT_OUT` if set, otherwise `reports/junit-nl-suite.xml`.
31+
- Single suite named `UnityMCP.NL`, one `<testcase>` per sub‑test; include `<failure>` on errors.
32+
- Markdown at `$MD_OUT` if set, otherwise `reports/junit-nl-suite.md`.
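The output requirements above can be sketched in Python. This is a minimal illustration, not the suite's actual reporter: the helper names `build_junit` and `junit_path` are hypothetical, and the `tests`/`failures` attributes are common JUnit conventions rather than something this prompt specifies.

```python
import os
import xml.etree.ElementTree as ET


def build_junit(results):
    """Build one <testsuite> named UnityMCP.NL from (name, failure_text) pairs.

    failure_text is None for a passing sub-test.
    """
    failures = sum(1 for _, failure in results if failure is not None)
    suite = ET.Element(
        "testsuite",
        name="UnityMCP.NL",
        tests=str(len(results)),
        failures=str(failures),
    )
    for name, failure in results:
        case = ET.SubElement(suite, "testcase", name=name)
        if failure is not None:
            node = ET.SubElement(case, "failure", message=failure)
            node.text = failure
    return suite


def junit_path():
    # $JUNIT_OUT wins when set and non-empty; otherwise use the documented default.
    return os.environ.get("JUNIT_OUT") or "reports/junit-nl-suite.xml"


suite = build_junit([("NL-0", None), ("NL-1", "anchor not found")])
xml_text = ET.tostring(suite, encoding="unicode")
```

The Markdown report path could be resolved the same way from `$MD_OUT`.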
33+
34+
Constraints (for fast publishing):
35+
- Log allowed tools once as a single line: `AllowedTools: ...`.
36+
- For every edit: Read → Write (with precondition hash) → Re‑read; on `{status:"stale_file"}` retry once after re‑read.
37+
- Keep evidence windows to ±20–40 lines; cap unified diffs at 300 lines and note any truncation.
38+
- End `<system-out>` with `VERDICT: PASS` or `VERDICT: FAIL`.
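The Read → Write → Re‑read cycle with a single retry on `stale_file` might look like the sketch below. `FakeServer` is a hypothetical in-memory stand-in used only to make the example runnable; it is not the real MCP tool interface, and the `"ok"` status string is an assumption.

```python
import hashlib


def sha256_of(text):
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


class FakeServer:
    """Hypothetical in-memory stand-in for the edit tool (not the real MCP API)."""

    def __init__(self, text):
        self.text = text

    def read(self):
        return self.text

    def write(self, new_text, precondition_sha):
        # Reject the write when the caller's hash does not match the current content.
        if precondition_sha != sha256_of(self.text):
            return {"status": "stale_file"}
        self.text = new_text
        return {"status": "ok"}


def edit_with_retry(server, transform):
    """Read, write with a precondition hash, re-read; retry once on stale_file."""
    result = None
    for _ in range(2):  # first attempt plus one retry
        current = server.read()
        result = server.write(transform(current), sha256_of(current))
        if result["status"] != "stale_file":
            break
    return result, server.read()


server = FakeServer("a\n")
result, after = edit_with_retry(server, lambda text: text + "b\n")
```

Because the hash is recomputed from a fresh read on the retry, a second `stale_file` indicates the file is still changing underneath the run and should be reported as a failure rather than retried again.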
39+
40+
## Guardrails
41+
- No destructive operations. Keep changes minimal and well‑scoped.
42+
- Don’t leak secrets or environment details beyond what’s needed in the reports.
43+
- Work without user interaction; do not prompt for approval mid‑flow.
44+
45+
> If capabilities discovery fails, still produce the two reports that clearly explain why you could not proceed and what evidence you gathered.
Lines changed: 240 additions & 0 deletions
@@ -0,0 +1,240 @@
1+
# Unity NL/T Editing Suite — Additive Test Design
2+
3+
You are running inside CI for the `unity-mcp` repo. Use only the tools allowed by the workflow. Work autonomously; do not prompt the user. Do NOT spawn subagents.
4+
5+
**Print this once, verbatim, early in the run:**
6+
AllowedTools: Write,mcp__unity__manage_editor,mcp__unity__list_resources,mcp__unity__read_resource,mcp__unity__apply_text_edits,mcp__unity__script_apply_edits,mcp__unity__validate_script,mcp__unity__find_in_file,mcp__unity__read_console,mcp__unity__get_sha
7+
8+
---
9+
10+
## Mission
11+
1) Pick target file (prefer):
12+
- `unity://path/Assets/Scripts/LongUnityScriptClaudeTest.cs`
13+
2) Execute **all** NL/T tests in order using minimal, precise edits that **build on each other**.
14+
3) Validate each edit with `mcp__unity__validate_script(level:"standard")`.
15+
4) **Report**: write one `<testcase>` XML fragment per test to `reports/<TESTID>_results.xml`. Do **not** read or edit `$JUNIT_OUT`.
16+
5) **NO RESTORATION** - tests build additively on previous state.
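As a rough illustration of step 4, a per-test fragment could be produced as follows. The helper names `make_fragment` and `write_fragment` are hypothetical, and the attributes shown are common JUnit conventions rather than a schema this prompt defines.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def make_fragment(test_id, failure=None):
    """Return one <testcase> fragment as a string; failure=None means pass."""
    case = ET.Element("testcase", name=test_id)
    if failure is not None:
        node = ET.SubElement(case, "failure", message=failure)
        node.text = failure
    return ET.tostring(case, encoding="unicode")


def write_fragment(test_id, failure=None, reports_dir="reports"):
    """Write the fragment to reports/<TESTID>_results.xml and return the path."""
    out = Path(reports_dir) / f"{test_id}_results.xml"
    out.parent.mkdir(parents=True, exist_ok=True)
    out.write_text(make_fragment(test_id, failure), encoding="utf-8")
    return out


passed = make_fragment("NL-0")
failed = make_fragment("T-A", "helper not deleted")
```

Nothing in this sketch touches `$JUNIT_OUT`, in line with the instruction to leave that file alone.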
17+
18+
---
19+
20+
## Environment & Paths (CI)
21+
- Always pass: `project_root: "TestProjects/UnityMCPTests"` and `ctx: {}` on list/read/edit/validate.
22+
- **Canonical URIs only**:
23+
- Primary: `unity://path/Assets/...` (never embed `project_root` in the URI)
24+
- Relative (when supported): `Assets/...`
25+
26+
CI provides:
27+
- `$JUNIT_OUT=reports/junit-nl-suite.xml` (pre‑created; leave alone)
28+
- `$MD_OUT=reports/junit-nl-suite.md` (synthesized from JUnit)
29+
30+
---
31+
32+
## Tool Mapping
33+
- **Anchors/regex/structured**: `mcp__unity__script_apply_edits`
34+
- Allowed ops: `anchor_insert`, `replace_method`, `insert_method`, `delete_method`, `regex_replace`
35+
- **Precise ranges / atomic batch**: `mcp__unity__apply_text_edits` (non‑overlapping ranges)
36+
- **Hash-only**: `mcp__unity__get_sha` — returns `{sha256,lengthBytes,lastModifiedUtc}` without file body
37+
- **Validation**: `mcp__unity__validate_script(level:"standard")`
38+
- **Dynamic targeting**: Use `mcp__unity__find_in_file` to locate current positions of methods/markers
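For local cross-checking, a record with the same three field names as the hash-only tool can be computed as in this sketch. How the server actually derives these values, and the exact timestamp format it returns, are not stated here; the ISO 8601 format below is an assumption, and `describe` is a hypothetical helper.

```python
import hashlib
from datetime import datetime, timezone


def describe(data, mtime_epoch):
    """Return {sha256, lengthBytes, lastModifiedUtc} for raw file bytes."""
    modified = datetime.fromtimestamp(mtime_epoch, tz=timezone.utc)
    return {
        "sha256": hashlib.sha256(data).hexdigest(),
        "lengthBytes": len(data),
        # Assumed format: ISO 8601 in UTC.
        "lastModifiedUtc": modified.isoformat(),
    }


info = describe(b"", 0)
```

Comparing only the `sha256` field before and after an edit is enough to tell whether the file content changed.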
39+
40+
---
41+
42+
## Additive Test Design Principles
43+
44+
**Key changes from the reset-based design:**
45+
1. **Dynamic Targeting**: Use `find_in_file` to locate methods/content, never hardcode line numbers
46+
2. **State Awareness**: Each test expects the file state left by the previous test
47+
3. **Content-Based Operations**: Target methods by signature, classes by name, not coordinates
48+
4. **Cumulative Validation**: Ensure the file remains structurally sound throughout the sequence
49+
5. **Composability**: Tests demonstrate how operations work together in real workflows
50+
51+
**State Tracking:**
52+
- Track file SHA after each test to ensure operations succeeded
53+
- Use content signatures (method names, comment markers) to verify expected state
54+
- Validate structural integrity after each major change
55+
56+
---
57+
58+
## Execution Order & Additive Test Specs
59+
60+
### NL-0. Baseline State Capture
61+
**Goal**: Establish initial file state and verify accessibility
62+
**Actions**:
63+
- Read file head and tail to confirm structure
64+
- Locate key methods: `HasTarget()`, `GetCurrentTarget()`, `Update()`, `ApplyBlend()`
65+
- Record initial SHA for tracking
66+
- **Expected final state**: Unchanged baseline file
67+
68+
### NL-1. Core Method Operations (Additive State A)
69+
**Goal**: Demonstrate method replacement operations
70+
**Actions**:
71+
- Replace `HasTarget()` method body: `public bool HasTarget() { return currentTarget != null; }`
72+
- Insert `PrintSeries()` method after `GetCurrentTarget()`: `public void PrintSeries() { Debug.Log("1,2,3"); }`
73+
- Verify both methods exist and are properly formatted
74+
- Delete `PrintSeries()` method (cleanup for next test)
75+
- **Expected final state**: `HasTarget()` modified, file structure intact, no temporary methods
76+
77+
### NL-2. Anchor Comment Insertion (Additive State B)
78+
**Goal**: Demonstrate anchor-based insertions above methods
79+
**Actions**:
80+
- Use `find_in_file` to locate current position of `Update()` method
81+
- Insert `// Build marker OK` comment line above `Update()` method
82+
- Verify comment exists and `Update()` still functions
83+
- **Expected final state**: State A + build marker comment above `Update()`
84+
85+
### NL-3. End-of-Class Content (Additive State C)
86+
**Goal**: Demonstrate end-of-class insertions with smart brace matching
87+
**Actions**:
88+
- Use anchor pattern to find the class-ending brace (accounts for previous additions)
89+
- Insert three comment lines before final class brace:
90+
```
// Tail test A
// Tail test B
// Tail test C
```
95+
- **Expected final state**: State B + tail comments before class closing brace
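One way to picture the class-ending brace lookup is the brace-depth scan below. It is a simplified sketch, not the server's matching logic: it assumes braces never appear inside strings or comments and that the closing brace sits on its own line. The function names are hypothetical.

```python
import re


def class_end_offset(source, class_name):
    """Return the offset of the brace that closes class_name, or -1."""
    decl = re.search(r"\bclass\s+" + re.escape(class_name) + r"\b", source)
    if decl is None:
        return -1
    depth = 0
    for i in range(decl.end(), len(source)):
        if source[i] == "{":
            depth += 1
        elif source[i] == "}":
            depth -= 1
            if depth == 0:
                return i
    return -1


def insert_before_class_end(source, class_name, lines, indent="    "):
    """Insert comment lines on their own rows just above the class's closing brace."""
    end = class_end_offset(source, class_name)
    if end < 0:
        raise ValueError(f"class {class_name} not found or unbalanced")
    line_start = source.rfind("\n", 0, end) + 1
    block = "".join(f"{indent}{line}\n" for line in lines)
    return source[:line_start] + block + source[line_start:]


SAMPLE = "namespace N {\npublic class Demo {\n    void Update() { }\n}\n}\n"
edited = insert_before_class_end(
    SAMPLE, "Demo", ["// Tail test A", "// Tail test B", "// Tail test C"]
)
```

Because the scan starts at the class declaration, the namespace's closing brace is never mistaken for the class's, and earlier insertions simply shift the offset that gets found.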
96+
97+
### NL-4. Console State Verification (No State Change)
98+
**Goal**: Verify Unity console integration without file modification
99+
**Actions**:
100+
- Read Unity console messages (INFO level)
101+
- Validate no compilation errors from previous operations
102+
- **Expected final state**: State C (unchanged)
103+
104+
### T-A. Temporary Helper Lifecycle (Returns to State C)
105+
**Goal**: Test insert → verify → delete cycle for temporary code
106+
**Actions**:
107+
- Find current position of `GetCurrentTarget()` method (may have shifted from NL-2 comment)
108+
- Insert temporary helper: `private int __TempHelper(int a, int b) => a + b;`
109+
- Verify helper method exists and compiles
110+
- Delete helper method via structured delete operation
111+
- **Expected final state**: Return to State C (helper removed, other changes intact)
112+
113+
### T-B. Method Body Interior Edit (Additive State D)
114+
**Goal**: Edit method interior without affecting structure, on modified file
115+
**Actions**:
116+
- Use `find_in_file` to locate current `HasTarget()` method (modified in NL-1)
117+
- Edit method body interior: change return statement to `return true; /* test modification */`
118+
- Use `validate: "relaxed"` for interior-only edit
119+
- Verify edit succeeded and file remains balanced
120+
- **Expected final state**: State C + modified HasTarget() body
121+
122+
### T-C. Different Method Interior Edit (Additive State E)
123+
**Goal**: Edit a different method to show operations don't interfere
124+
**Actions**:
125+
- Locate `ApplyBlend()` method using content search
126+
- Edit interior line to add null check: `if (animator == null) return; // safety check`
127+
- Preserve method signature and structure
128+
- **Expected final state**: State D + modified ApplyBlend() method
129+
130+
### T-D. End-of-Class Helper (Additive State F)
131+
**Goal**: Add permanent helper method at class end
132+
**Actions**:
133+
- Use smart anchor matching to find current class-ending brace (after NL-3 tail comments)
134+
- Insert permanent helper before class brace: `private void TestHelper() { /* placeholder */ }`
135+
- **Expected final state**: State E + TestHelper() method before class end
136+
137+
### T-E. Method Evolution Lifecycle (Additive State G)
138+
**Goal**: Insert → modify → finalize a field and its companion method through multiple operations
138+
**Actions**:
139+
- Insert basic field: `private int Counter = 0;`
141+
- Update it: find and replace with `private int Counter = 42; // initialized`
142+
- Add companion method: `private void IncrementCounter() { Counter++; }`
143+
- **Expected final state**: State F + Counter field + IncrementCounter() method
144+
145+
### T-F. Atomic Multi-Edit (Additive State H)
146+
**Goal**: Multiple coordinated edits in single atomic operation
147+
**Actions**:
148+
- Read current file state to compute precise ranges
149+
- Atomic edit combining:
150+
1. Add comment in `HasTarget()`: `// validated access`
151+
2. Add comment in `ApplyBlend()`: `// safe animation`
152+
3. Add final class comment: `// end of test modifications`
153+
- All edits computed from same file snapshot, applied atomically
154+
- **Expected final state**: State G + three coordinated comments
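The idea of computing every range from one snapshot and applying them together can be sketched as follows. This uses plain character offsets for brevity, whereas the real tool takes line/column ranges; `apply_edits` is a hypothetical helper.

```python
def apply_edits(text, edits):
    """Apply (start, end, replacement) edits computed from a single snapshot.

    Raises ValueError before changing anything if a range is invalid or two
    ranges overlap, so either every edit applies or none does.
    """
    ordered = sorted(edits, key=lambda e: (e[0], e[1]))
    for start, end, _ in ordered:
        if not 0 <= start <= end <= len(text):
            raise ValueError(f"range out of bounds: {start}-{end}")
    for (_, prev_end, _), (next_start, _, _) in zip(ordered, ordered[1:]):
        if next_start < prev_end:
            raise ValueError("overlapping edits")
    # Apply back to front so earlier offsets stay valid.
    for start, end, replacement in reversed(ordered):
        text = text[:start] + replacement + text[end:]
    return text


batched = apply_edits("abcdef", [(0, 1, "X"), (3, 5, "YY")])
```

Applying from the end of the file backwards is what lets all three comments use positions from the same read without recomputing after each one.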
155+
156+
### T-G. Path Normalization Test (No State Change)
157+
**Goal**: Verify URI forms work equivalently on modified file
158+
**Actions**:
159+
- Make identical edit using `unity://path/Assets/Scripts/LongUnityScriptClaudeTest.cs`
160+
- Then using `Assets/Scripts/LongUnityScriptClaudeTest.cs`
161+
- Second should return `stale_file`, retry with updated SHA
162+
- Verify both URI forms target same file
163+
- **Expected final state**: State H (no content change, just path testing)
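A minimal sketch of why the two URI forms should resolve to the same file: both reduce to one project-relative path. The `normalize` helper is hypothetical, and the server's own normalization rules may be broader than this.

```python
PREFIX = "unity://path/"


def normalize(uri):
    """Map both accepted forms to a project-relative Assets/... path."""
    path = uri[len(PREFIX):] if uri.startswith(PREFIX) else uri
    if not path.startswith("Assets/"):
        raise ValueError(f"not under Assets/: {uri}")
    return path


canonical = normalize("unity://path/Assets/Scripts/LongUnityScriptClaudeTest.cs")
relative = normalize("Assets/Scripts/LongUnityScriptClaudeTest.cs")
```

If both forms target the same file, a SHA captured through one form becomes stale as soon as an edit lands through the other, which is why the second call is expected to return `stale_file`.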
164+
165+
### T-H. Validation on Modified File (No State Change)
166+
**Goal**: Ensure validation works correctly on heavily modified file
167+
**Actions**:
168+
- Run `validate_script(level:"standard")` on current state
169+
- Verify no structural errors despite extensive modifications
170+
- **Expected final state**: State H (validation only, no edits)
171+
172+
### T-I. Failure Surface Testing (No State Change)
173+
**Goal**: Test error handling on real modified file
174+
**Actions**:
175+
- Attempt overlapping edits (should fail cleanly)
176+
- Attempt edit with stale SHA (should fail cleanly)
177+
- Verify error responses are informative
178+
- **Expected final state**: State H (failed operations don't modify file)
179+
180+
### T-J. Idempotency on Modified File (Returns to State H)
181+
**Goal**: Verify operations behave predictably when repeated
182+
**Actions**:
183+
- Add unique marker comment: `// idempotency test marker`
184+
- Attempt to add same comment again (should detect no-op)
185+
- Remove marker, attempt removal again (should handle gracefully)
186+
- **Expected final state**: State H + verified idempotent behavior
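The add-twice and remove-twice checks can be modelled with a pair of functions that report whether they changed anything. This is a simplified sketch: it appends the marker at the end of the text rather than at a chosen anchor, and the function names are hypothetical.

```python
MARKER = "// idempotency test marker"


def add_marker(text, marker=MARKER):
    """Add the marker line once; return (new_text, changed)."""
    if any(line.strip() == marker for line in text.splitlines()):
        return text, False  # already present: no-op
    separator = "" if not text or text.endswith("\n") else "\n"
    return text + separator + marker + "\n", True


def remove_marker(text, marker=MARKER):
    """Remove every marker line; return (new_text, changed)."""
    lines = text.splitlines(keepends=True)
    kept = [line for line in lines if line.strip() != marker]
    return "".join(kept), len(kept) != len(lines)


original = "class A {}\n"
added, changed_1 = add_marker(original)
added_again, changed_2 = add_marker(added)
removed, changed_3 = remove_marker(added_again)
removed_again, changed_4 = remove_marker(removed)
```

A round trip that ends with `removed_again == original` corresponds to the file returning to State H.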
187+
188+
---
189+
190+
## Dynamic Targeting Examples
191+
192+
**Instead of hardcoded coordinates:**
193+
```json
{"startLine": 31, "startCol": 26, "endLine": 31, "endCol": 58}
```
196+
197+
**Use content-aware targeting:**
198+
```
# Find current method location
find_in_file(pattern: "public bool HasTarget\\(\\)")
# Then compute edit ranges from found position
```
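Turning a content match into an edit range could be done along these lines. The sketch assumes 1-based lines and columns with an exclusive end column; the real tool's conventions may differ, and `find_range` is a hypothetical helper rather than the behaviour of `find_in_file`.

```python
import re


def find_range(source, pattern):
    """Return the range of the first regex match as a dict, or None if absent."""
    match = re.search(pattern, source)
    if match is None:
        return None

    def line_col(offset):
        line = source.count("\n", 0, offset) + 1
        col = offset - (source.rfind("\n", 0, offset) + 1) + 1
        return line, col

    start_line, start_col = line_col(match.start())
    end_line, end_col = line_col(match.end())
    return {
        "startLine": start_line,
        "startCol": start_col,
        "endLine": end_line,
        "endCol": end_col,
    }


SOURCE = "class A\n{\n    public bool HasTarget() { return false; }\n}\n"
located = find_range(SOURCE, r"public bool HasTarget\(\)")
```

Recomputing the range from the current file text before each edit keeps the coordinates valid even after earlier tests have shifted the method.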
203+
204+
**Method targeting by signature:**
205+
```json
{"op": "replace_method", "className": "LongUnityScriptClaudeTest", "methodName": "HasTarget"}
```
208+
209+
**Anchor-based insertions:**
210+
```json
{"op": "anchor_insert", "anchor": "private void Update\\(\\)", "position": "before", "text": "// comment"}
```
213+
214+
---
215+
216+
## State Verification Patterns
217+
218+
**After each test:**
219+
1. Verify expected content exists: `find_in_file` for key markers
220+
2. Check structural integrity: `validate_script(level:"standard")`
221+
3. Update SHA tracking for next test's preconditions
222+
4. Log cumulative changes in test evidence
223+
224+
**Error Recovery:**
225+
- If test fails, log current state but continue (don't restore)
226+
- Next test adapts to actual current state, not expected state
227+
- Demonstrates resilience of operations on varied file conditions
228+
229+
---
230+
231+
## Benefits of Additive Design
232+
233+
1. **Realistic Workflows**: Tests mirror actual development patterns
234+
2. **Robust Operations**: Proves edits work on evolving files, not just pristine baselines
235+
3. **Composability Validation**: Shows operations coordinate well together
236+
4. **Simplified Infrastructure**: No restore scripts or snapshots needed
237+
5. **Better Failure Analysis**: Failures don't cascade - each test adapts to current reality
238+
6. **State Evolution Testing**: Validates SDK handles cumulative file modifications correctly
239+
240+
This additive approach produces a more realistic and maintainable test suite that better represents actual SDK usage patterns.
