
Commit 605d2ae

Add navigation sidebar, UAT framework, and reliability features
Major additions:
- Navigation sidebar widget with collapsible state and keyboard shortcuts
- AI-driven UAT testing framework with scenario support
- Retry utilities with exponential backoff
- AI timeout handling for reliable responses

OpenSpec changes:
- Archive completed changes (ai-driven-uat, commitment-guardrails, reliability-compliance)
- Add navigation-sidebar change proposal
- Add ai-uat spec

Test improvements:
- Update TUI tests to use create_test_app_for_screen pattern
- Add nav sidebar integration tests
- Add UAT scenario tests
- Update snapshots for new UI layout

Code quality:
- Fix type annotations across auth, db, and widget modules
- Update widget exports in __init__.py
- Improve test fixtures and conftest organization
1 parent 7417847 commit 605d2ae
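The retry and timeout utilities named in the commit message do not appear in the diffs excerpted below. As a rough illustration of the pattern only, here is a minimal exponential-backoff sketch; the function name, signature, and retried exception types are assumptions, not the committed code.

```python
import asyncio
import random

# Hypothetical helper illustrating "retry with exponential backoff + timeout";
# the utility actually added by this commit is not shown in this excerpt.
async def retry_with_backoff(fn, *, attempts=4, base_delay=0.5, max_delay=8.0, timeout=30.0):
    """Await fn(), retrying on timeout/connection errors with capped, jittered backoff."""
    for attempt in range(attempts):
        try:
            return await asyncio.wait_for(fn(), timeout=timeout)
        except (asyncio.TimeoutError, ConnectionError):
            if attempt == attempts - 1:
                raise
            # Delays grow 0.5s, 1s, 2s, 4s ... capped at max_delay, plus up to 50% jitter.
            delay = min(base_delay * 2**attempt, max_delay)
            await asyncio.sleep(delay + random.uniform(0, delay / 2))
```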

112 files changed: +9529 / -865 lines changed


AGENTS.md

Lines changed: 79 additions & 1 deletion
@@ -170,5 +170,83 @@ src/jdo/
 tests/
 ├── unit/ # Fast isolated tests
 ├── integration/ # Database tests
-└── tui/ # Textual Pilot tests
+├── tui/ # Textual Pilot tests
+└── uat/ # AI-driven UAT tests
+```
+
+## AI-Driven UAT Testing
+
+The `tests/uat/` directory contains AI-driven User Acceptance Testing infrastructure.
+
+### Running UAT Tests
+
+```bash
+# Run all UAT tests with mock AI (fast, free)
+uv run pytest tests/uat/ -v
+
+# Run only live AI tests (requires credentials)
+uv run pytest tests/uat/ -v -m live_ai
+
+# Skip live AI tests
+uv run pytest tests/uat/ -v -m "not live_ai"
+```
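Aside, not part of the diff: for `-m live_ai` to select tests cleanly, the `live_ai` marker has to be registered somewhere. A minimal sketch assuming a `conftest.py` hook; the project may declare it under `[tool.pytest.ini_options]` in `pyproject.toml` instead.

```python
# Hypothetical conftest.py registration for the custom marker used above;
# the committed project may register it in pyproject.toml instead.
def pytest_configure(config):
    config.addinivalue_line(
        "markers",
        "live_ai: UAT tests that call a real AI model and need credentials",
    )
```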
+
+### UAT Components
+
+| Component | Location | Purpose |
+|-----------|----------|---------|
+| `models.py` | tests/uat/ | Pydantic models for actions, scenarios, results |
+| `observer.py` | tests/uat/ | Captures UI state for AI consumption |
+| `driver.py` | tests/uat/ | Orchestrates AI-driven test execution |
+| `loader.py` | tests/uat/ | Loads scenarios from YAML files |
+| `mocks.py` | tests/uat/ | Mock AI responses for deterministic tests |
+| `scenarios/` | tests/uat/ | YAML scenario definitions |
+
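Aside, not part of the diff: `models.py` itself is not shown in this commit excerpt. A minimal sketch of the action model, inferred from how the mock example further down constructs `UATAction`; anything beyond the `PRESS`/`DONE` members and the `action_type`, `target`, and `reason` fields is an assumption.

```python
# Hypothetical shapes for tests/uat/models.py, inferred from the mock example
# below; the committed definitions almost certainly carry more action types.
from enum import Enum

from pydantic import BaseModel

class ActionType(str, Enum):
    PRESS = "press"  # press a key in the TUI
    DONE = "done"    # AI signals the goal has been reached

class UATAction(BaseModel):
    action_type: ActionType
    target: str | None = None  # e.g. the key to press; unused for DONE
    reason: str                # the AI's stated rationale for this step
```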
+### Writing New Scenarios
+
+Create a YAML file in `tests/uat/scenarios/`:
+
+```yaml
+name: my_scenario
+description: What this scenario tests
+goal: |
+  Natural language description of what the AI should accomplish.
+  Be specific about the expected end state.
+
+preconditions:
+  - press:n # Navigate to chat first
+
+success_criteria:
+  - screen:HomeScreen # Must end on home screen
+  - no_errors # No step failures
+  - completed # AI signaled "done"
+
+max_steps: 30
+timeout_seconds: 90
+
+tags:
+  - smoke
+  - my_feature
+```
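Aside, not part of the diff: `loader.py` is not shown either. A minimal sketch of how a scenario file like the one above might be parsed into a Pydantic model; the `UATScenario` name and the `load_scenario` signature are assumptions, not the committed API.

```python
# Hypothetical loader; field names mirror the YAML keys above, but the
# committed UATScenario model and loader API may differ.
from pathlib import Path

import yaml
from pydantic import BaseModel

class UATScenario(BaseModel):
    name: str
    description: str
    goal: str
    preconditions: list[str] = []
    success_criteria: list[str]
    max_steps: int = 30
    timeout_seconds: int = 90
    tags: list[str] = []

def load_scenario(path: Path) -> UATScenario:
    # Parse the YAML file and validate it against the scenario schema.
    return UATScenario.model_validate(yaml.safe_load(path.read_text()))
```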
+
+### Adding Mock Responses
+
+For deterministic CI tests, add a mock in `tests/uat/mocks.py`:
+
+```python
+def create_my_scenario_mock() -> FunctionModel:
+    step = 0
+    def model_fn(messages, info):
+        nonlocal step
+        step += 1
+        actions = [
+            UATAction(action_type=ActionType.PRESS, target="n", reason="..."),
+            UATAction(action_type=ActionType.DONE, reason="..."),
+        ]
+        action = actions[min(step - 1, len(actions) - 1)]
+        return ModelResponse(parts=[TextPart(content=action.model_dump_json())])
+    return FunctionModel(model_fn)
+
+# Add to SCENARIO_MOCKS dict
+SCENARIO_MOCKS["my_scenario"] = create_my_scenario_mock
 ```
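Aside, not part of the diff: the mock snippet above relies on imports the excerpt omits. The `pydantic_ai` names below are that library's public API; the `tests.uat.models` path is an assumption about where the project keeps its own types.

```python
# Assumed import block for the mock example; adjust module paths to match the repo.
from pydantic_ai.messages import ModelResponse, TextPart
from pydantic_ai.models.function import FunctionModel

from tests.uat.models import ActionType, UATAction  # assumed location
```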

ROADMAP.yaml

Lines changed: 15 additions & 16 deletions
@@ -38,8 +38,9 @@ metadata:
 # - wire-ai-to-chat: PydanticAI agent connected to ChatScreen (2025-12-19)
 # - persist-handler-results: Command handlers persist to database (2025-12-19)
 # - fix-navigation-and-review-textual: All navigation shortcuts working (2025-12-19)
+# - add-commitment-guardrails: Overcommitment warnings in /commit flow (2025-12-19)
 #
-# TEST STATUS: 1,268 passed, 11 snapshots
+# TEST STATUS: 1,279 passed, 11 snapshots (+13 tests from commitment-guardrails)
 # LINT STATUS: All checks passed (ruff, pyrefly)

 features:
@@ -295,33 +296,31 @@ features:
     title: "Make Fewer Promises Guardrails"
     priority: high
     complexity: low
-    status: proposed
+    status: completed
     category: integrity
-    phase: next
+    phase: now
+    change_spec: "add-commitment-guardrails"

     description: |
-      MPI principle: "make fewer, keep them all." Currently nothing prevents
-      commitment overload. Add guardrails that nudge toward quality over quantity.
+      MPI principle: "make fewer, keep them all." Tracks commitment velocity
+      (created vs completed per week) to warn users when they're overcommitting.
+      No ceiling on active commitments - focuses on sustainable velocity instead.

     proposed_changes:
-      overcommitment_warning:
-        description: "Warn when user has too many active commitments"
-        threshold: "Configurable (default: 7 active commitments)"
-        message: "You have 8 active commitments. Are you sure you want to add another?"
-
       commitment_velocity:
         description: "Track commitments made vs completed per week"
-        alert: "You're making commitments faster than completing them"
+        alert: "You're creating commitments faster than completing them"
+        window: "7-day rolling window"

     ai_coaching:
       prompts:
-        - "Before adding: 'What would you need to drop to take this on?'"
-        - "When overloaded: 'Which of these could be renegotiated?'"
+        - "When velocity is high: 'You've created X but only completed Y. Are you overcommitting?'"
+        - "Coaching tone, not blocking - user can always proceed"

     acceptance_criteria:
-      - "Warning appears when adding commitment above threshold"
-      - "User can override warning but it's logged"
-      - "Velocity metrics visible in integrity dashboard"
+      - "Velocity warning appears when created > completed in past 7 days"
+      - "User can proceed despite warning (autonomy preserved)"
+      - "Graceful degradation if database queries fail"

   complete_recurring_tui:
     title: "Complete Recurring Commitments TUI"