Skip to content

Commit 5538fb5

Browse files
Ubuntuclaude
andcommitted
docs(discoveries): User testing validates mandatory testing requirement
Issue #1783 user testing found critical bug (SubIssue not hashable) that 110 passing unit tests missed. Validates USER_PREFERENCES.md requirement. Bug would've shipped to production without user testing. Fixed in PR #1784. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
1 parent a14275a commit 5538fb5

File tree

1 file changed

+125
-0
lines changed

1 file changed

+125
-0
lines changed

.claude/context/DISCOVERIES.md

Lines changed: 125 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -8,6 +8,7 @@ This file documents non-obvious problems, solutions, and patterns discovered dur
88

99
### Recent (December 2025)
1010

11+
- [Mandatory User Testing Validates Its Own Value](#mandatory-user-testing-validates-value-2025-12-02)
1112
- [System Metadata vs User Content in Git Conflict Detection](#system-metadata-vs-user-content-git-conflict-2025-12-01)
1213

1314
### November 2025
@@ -1787,3 +1788,127 @@ Opus V2:
17871788
**Pattern Identified**: Validation checkpoints can backfire - use flow language instead of interruption language
17881789

17891790
**Lesson**: Always validate AI guidance changes empirically with ALL target models before deploying
1791+
1792+
---
1793+
1794+
## Mandatory User Testing Validates Its Own Value {#mandatory-user-testing-validates-value-2025-12-02}
1795+
1796+
**Date**: 2025-12-02
1797+
**Context**: Implementing Parallel Task Orchestrator (Issue #1783, PR #1784)
1798+
**Impact**: HIGH - Validates mandatory testing requirement, found production-blocking bug
1799+
1800+
### Problem
1801+
1802+
Unit tests can achieve high coverage (86%) and 100% pass rate while missing critical real-world bugs.
1803+
1804+
### Discovery
1805+
1806+
Mandatory user testing (USER_PREFERENCES.md requirement) caught a **production-blocking bug** that 110 passing unit tests missed:
1807+
1808+
**Bug**: `SubIssue` dataclass not hashable, but `OrchestrationConfig` uses `set()` for deduplication
1809+
1810+
```python
1811+
# This passed all unit tests but fails in real usage:
1812+
config = OrchestrationConfig(sub_issues=[...])
1813+
# TypeError: unhashable type: 'SubIssue'
1814+
```
1815+
1816+
### How It Was Missed
1817+
1818+
**Unit Tests** (110/110 passing):
1819+
- Mocked all `SubIssue` creation
1820+
- Never tested real deduplication path
1821+
- Assumed API worked without instantiation
1822+
1823+
**User Testing** (mandatory requirement):
1824+
- Tried actual config creation
1825+
- **Bug discovered in <2 minutes**
1826+
- Immediate TypeError on first real use
1827+
1828+
### Fix
1829+
1830+
```python
1831+
# Before
1832+
@dataclass
1833+
class SubIssue:
1834+
labels: List[str] = field(default_factory=list)
1835+
1836+
# After
1837+
@dataclass(frozen=True)
1838+
class SubIssue:
1839+
labels: tuple = field(default_factory=tuple)
1840+
```
1841+
1842+
### Validation
1843+
1844+
**Test Results After Fix**:
1845+
```
1846+
✅ Config creation works
1847+
✅ Deduplication works (3 items → 2 unique)
1848+
✅ Orchestrator instantiation works
1849+
✅ Status API functional
1850+
```
1851+
1852+
### Key Insights
1853+
1854+
1. **High test coverage ≠ Real-world readiness**
1855+
- 86% coverage, 110/110 tests, still had production blocker
1856+
- Mocks hide integration issues
1857+
1858+
2. **User testing finds different bugs**
1859+
- Unit tests validate component logic
1860+
- User tests validate actual workflows
1861+
- Both are necessary
1862+
1863+
3. **Mandatory requirement justified**
1864+
- Without user testing, would've shipped broken code
1865+
- CI wouldn't catch this (unit tests pass)
1866+
- First user would've hit TypeError
1867+
1868+
4. **Time investment worthwhile**
1869+
- <5 minutes of user testing
1870+
- Found bug that could've cost hours of debugging
1871+
- Prevented embarrassing production failure
1872+
1873+
### Implementation
1874+
1875+
**Mandatory User Testing Pattern**:
1876+
```bash
1877+
# Test like a user would
1878+
python -c "from module import Class; obj = Class(...)" # Real instantiation
1879+
config = RealConfig(real_data) # No mocks
1880+
result = api.actual_method() # Real workflow
1881+
```
1882+
1883+
**NOT sufficient**:
1884+
```python
1885+
# Unit test approach (can miss real issues)
1886+
@patch("module.Class")
1887+
def test_with_mock(mock_class): # Never tests real instantiation
1888+
...
1889+
```
1890+
1891+
### Lessons Learned
1892+
1893+
1. **Always test like a user** - No mocks, real instantiation, actual workflows
1894+
2. **High coverage isn't enough** - Need real usage validation
1895+
3. **Mocks hide bugs** - Integration issues invisible to mocked tests
1896+
4. **User requirements are wise** - This explicit requirement saved us from shipping broken code
1897+
1898+
### Related
1899+
1900+
- Issue #1783: Parallel Task Orchestrator
1901+
- PR #1784: Implementation
1902+
- USER_PREFERENCES.md: Mandatory E2E testing requirement
1903+
- Commit dc90b350: Hashability fix
1904+
1905+
### Recommendation
1906+
1907+
**ENFORCE mandatory user testing** for ALL features:
1908+
- Test with `uvx --from git+...` (no local state)
1909+
- Try actual user workflows (no mocks)
1910+
- Verify error messages and UX
1911+
- Document test results in PR
1912+
1913+
This discovery **validates the user's explicit requirement** - mandatory user testing prevents production failures that unit tests miss.
1914+

0 commit comments

Comments
 (0)