@@ -8,6 +8,7 @@ This file documents non-obvious problems, solutions, and patterns discovered dur
88
99### Recent (December 2025)
1010
11+ - [ Mandatory User Testing Validates Its Own Value] ( #mandatory-user-testing-validates-value-2025-12-02 )
1112- [ System Metadata vs User Content in Git Conflict Detection] ( #system-metadata-vs-user-content-git-conflict-2025-12-01 )
1213
1314### November 2025
@@ -1787,3 +1788,127 @@ Opus V2:
17871788** Pattern Identified** : Validation checkpoints can backfire - use flow language instead of interruption language
17881789
17891790** Lesson** : Always validate AI guidance changes empirically with ALL target models before deploying
1791+
1792+ ---
1793+
1794+ ## Mandatory User Testing Validates Its Own Value {#mandatory-user-testing-validates-value-2025-12-02}
1795+
1796+ ** Date** : 2025-12-02
1797+ ** Context** : Implementing Parallel Task Orchestrator (Issue #1783 , PR #1784 )
1798+ ** Impact** : HIGH - Validates mandatory testing requirement, found production-blocking bug
1799+
1800+ ### Problem
1801+
1802+ Unit tests can achieve high coverage (86%) and 100% pass rate while missing critical real-world bugs.
1803+
1804+ ### Discovery
1805+
1806+ Mandatory user testing (USER_PREFERENCES.md requirement) caught a ** production-blocking bug** that 110 passing unit tests missed:
1807+
1808+ ** Bug** : ` SubIssue ` dataclass not hashable, but ` OrchestrationConfig ` uses ` set() ` for deduplication
1809+
1810+ ``` python
1811+ # This passed all unit tests but fails in real usage:
1812+ config = OrchestrationConfig(sub_issues = [... ])
1813+ # TypeError: unhashable type: 'SubIssue'
1814+ ```
1815+
1816+ ### How It Was Missed
1817+
1818+ ** Unit Tests** (110/110 passing):
1819+ - Mocked all ` SubIssue ` creation
1820+ - Never tested real deduplication path
1821+ - Assumed API worked without instantiation
1822+
1823+ ** User Testing** (mandatory requirement):
1824+ - Tried actual config creation
1825+ - ** Bug discovered in <2 minutes**
1826+ - Immediate TypeError on first real use
1827+
1828+ ### Fix
1829+
1830+ ``` python
1831+ # Before
1832+ @dataclass
1833+ class SubIssue :
1834+ labels: List[str ] = field(default_factory = list )
1835+
1836+ # After
1837+ @dataclass (frozen = True )
1838+ class SubIssue :
1839+ labels: tuple = field(default_factory = tuple )
1840+ ```
1841+
1842+ ### Validation
1843+
1844+ ** Test Results After Fix** :
1845+ ```
1846+ ✅ Config creation works
1847+ ✅ Deduplication works (3 items → 2 unique)
1848+ ✅ Orchestrator instantiation works
1849+ ✅ Status API functional
1850+ ```
1851+
1852+ ### Key Insights
1853+
1854+ 1 . ** High test coverage ≠ Real-world readiness**
1855+ - 86% coverage, 110/110 tests, still had production blocker
1856+ - Mocks hide integration issues
1857+
1858+ 2 . ** User testing finds different bugs**
1859+ - Unit tests validate component logic
1860+ - User tests validate actual workflows
1861+ - Both are necessary
1862+
1863+ 3 . ** Mandatory requirement justified**
1864+ - Without user testing, would've shipped broken code
1865+ - CI wouldn't catch this (unit tests pass)
1866+ - First user would've hit TypeError
1867+
1868+ 4 . ** Time investment worthwhile**
1869+ - <5 minutes of user testing
1870+ - Found bug that could've cost hours of debugging
1871+ - Prevented embarrassing production failure
1872+
1873+ ### Implementation
1874+
1875+ ** Mandatory User Testing Pattern** :
1876+ ``` bash
1877+ # Test like a user would
1878+ python -c " from module import Class; obj = Class(...)" # Real instantiation
1879+ config = RealConfig(real_data) # No mocks
1880+ result = api.actual_method () # Real workflow
1881+ ```
1882+
1883+ ** NOT sufficient** :
1884+ ``` python
1885+ # Unit test approach (can miss real issues)
1886+ @patch (" module.Class" )
1887+ def test_with_mock (mock_class ): # Never tests real instantiation
1888+ ...
1889+ ```
1890+
1891+ ### Lessons Learned
1892+
1893+ 1 . ** Always test like a user** - No mocks, real instantiation, actual workflows
1894+ 2 . ** High coverage isn't enough** - Need real usage validation
1895+ 3 . ** Mocks hide bugs** - Integration issues invisible to mocked tests
1896+ 4 . ** User requirements are wise** - This explicit requirement saved us from shipping broken code
1897+
1898+ ### Related
1899+
1900+ - Issue #1783 : Parallel Task Orchestrator
1901+ - PR #1784 : Implementation
1902+ - USER_PREFERENCES.md: Mandatory E2E testing requirement
1903+ - Commit dc90b350: Hashability fix
1904+
1905+ ### Recommendation
1906+
1907+ ** ENFORCE mandatory user testing** for ALL features:
1908+ - Test with ` uvx --from git+... ` (no local state)
1909+ - Try actual user workflows (no mocks)
1910+ - Verify error messages and UX
1911+ - Document test results in PR
1912+
1913+ This discovery ** validates the user's explicit requirement** - mandatory user testing prevents production failures that unit tests miss.
1914+
0 commit comments