| 1 | +# AI Mistake Detection System - Specification |
| 2 | + |
| 3 | +## Overview |
| 4 | + |
| 5 | +This specification defines a system for detecting and preventing common AI mistakes in code generation and system interaction. It grew out of real-world failures, such as hardcoding user-specific paths and creating test files in production source directories, and provides a generalized, extensible framework for catching AI stupidity. |
| 6 | + |
| 7 | +## Document Structure |
| 8 | + |
| 9 | +### Core Documents |
| 10 | + |
| 11 | +1. **[Claude Watcher](./claude_watcher.md)** - The original rant and specific detection strategies |
| 12 | + - Real examples of AI mistakes |
| 13 | + - Specific detection patterns |
| 14 | + - Multi-model consensus approach |
| 15 | + - Emergency stop mechanisms |
| 16 | + |
| 17 | +2. **[Generalized Mistake Detection](./generalized_mistake_detection.md)** - Extensible framework design |
| 18 | + - Mistake categorization system |
| 19 | + - Plugin-based detector architecture |
| 20 | + - Learning and evolution capabilities |
| 21 | + - Multi-model validation framework |
| 22 | + |
| 23 | +3. **[Integration Design](./integration_design.md)** - Pipeline safety system integration |
| 24 | + - Step Reviewer enhancements |
| 25 | + - Pattern Detector additions |
| 26 | + - Intervention system updates |
| 27 | + - Unified configuration and monitoring |
| 28 | + |
| 29 | +## The Problem We're Solving |
| 30 | + |
| 31 | +AI assistants make predictable mistakes: |
| 32 | +- **Hardcoding paths**: `/home/user/.pyenv/shims/python3` |
| 33 | +- **Test file pollution**: Creating `test_adapter.ex` in `lib/` |
| 34 | +- **Environmental assumptions**: Assuming commands exist in PATH |
| 35 | +- **Context ignorance**: Ignoring project structure and conventions |
| 36 | +- **Security violations**: Unsafe permissions, exposed credentials |
| 37 | + |
| 38 | +## System Architecture |
| 39 | + |
| 40 | +```mermaid |
| 41 | +graph LR |
| 42 | + A[AI Action] --> B[Mistake Detectors] |
| 43 | + B --> C[Pattern Matching] |
| 44 | + B --> D[Multi-Model Review] |
| 45 | + B --> E[Context Validation] |
| 46 | + |
| 47 | + C --> F[Decision Engine] |
| 48 | + D --> F |
| 49 | + E --> F |
| 50 | + |
| 51 | + F --> G{Safe?} |
| 52 | + G -->|Yes| H[Execute] |
| 53 | + G -->|No| I[Block/Educate] |
| 54 | + |
| 55 | + I --> J[Learning System] |
| 56 | + J --> B |
| 57 | +``` |
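To make the Decision Engine node in the diagram concrete, here is a minimal sketch of how findings from several detectors might be reduced to an execute-or-block decision. The module name `DecisionEngineSketch`, the severity ranking, and the default `:high` threshold are all assumptions made for illustration; the specification does not define them.

```elixir
defmodule DecisionEngineSketch do
  @moduledoc """
  Hypothetical sketch of the Decision Engine node from the diagram.
  Module, function, and severity names are illustrative assumptions.
  """

  # Assumed severity ranking: a higher number means a worse finding.
  @rank %{low: 1, medium: 2, high: 3, critical: 4}

  @doc """
  Takes the findings gathered from all detectors and returns `:execute`
  when none reach the threshold, or `{:block, findings}` with the
  findings that do.
  """
  def decide(findings, threshold \\ :high) do
    blocking =
      Enum.filter(findings, fn %{severity: severity} ->
        Map.fetch!(@rank, severity) >= Map.fetch!(@rank, threshold)
      end)

    case blocking do
      [] -> :execute
      _ -> {:block, blocking}
    end
  end
end
```

In this sketch, findings below the threshold do not stop execution. Whether they should instead trigger an Educate intervention is left open here.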
| 58 | + |
| 59 | +## Key Features |
| 60 | + |
| 61 | +### 1. Mistake Categories |
| 62 | + |
| 63 | +- **Environmental Assumptions**: Hardcoded paths, executable assumptions |
| 64 | +- **Code Organization**: Wrong file locations, naming violations |
| 65 | +- **Security Violations**: Unsafe operations, credential exposure |
| 66 | +- **Context Ignorance**: Out-of-scope changes, constraint violations |
| 67 | + |
| 68 | +### 2. Detection Strategies |
| 69 | + |
| 70 | +- **Pattern-Based**: Regex and rule matching |
| 71 | +- **Behavioral**: Detecting patterns like tunnel vision |
| 72 | +- **Multi-Model Consensus**: Multiple AIs reviewing each other |
| 73 | +- **Learning-Based**: Evolving patterns from detected mistakes |
| 74 | + |
| 75 | +### 3. Intervention Types |
| 76 | + |
| 77 | +- **Block**: Stop dangerous actions |
| 78 | +- **Educate**: Teach correct patterns |
| 79 | +- **Auto-Fix**: Correct simple mistakes |
| 80 | +- **Human Review**: Escalate complex issues |
| 81 | + |
| 82 | +## Quick Start |
| 83 | + |
| 84 | +### Basic Configuration |
| 85 | + |
| 86 | +```yaml |
| 87 | +mistake_detection: |
| 88 | + enabled: true |
| 89 | + |
| 90 | + detectors: |
| 91 | + environmental: |
| 92 | + patterns: [hardcoded_paths, executable_assumptions] |
| 93 | + |
| 94 | + organization: |
| 95 | + source_dirs: ["lib", "src"] |
| 96 | + test_dirs: ["test", "spec"] |
| 97 | + |
| 98 | + security: |
| 99 | + critical_patterns: true |
| 100 | +``` |
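As an illustration of how the `organization` settings above might be used, the following sketch flags test-like files created under a configured source directory. The module name `OrganizationCheckSketch`, the map-shaped config, and the file-name heuristic (`test_` prefix or `_test` suffix) are assumptions, not part of the specification.

```elixir
defmodule OrganizationCheckSketch do
  # Hypothetical check: flags files that look like tests but are being
  # created under one of the configured source directories.

  @doc """
  `config` mirrors the `organization` keys from the YAML above, e.g.
  %{source_dirs: ["lib", "src"], test_dirs: ["test", "spec"]}.
  Returns a list of findings, which is empty when the path looks fine.
  """
  def check(path, %{source_dirs: source_dirs}) do
    # First path segment, or nil for an empty path.
    top_dir = path |> Path.split() |> List.first()
    file_name = Path.basename(path)

    if top_dir in source_dirs and test_like?(file_name) do
      [
        %{
          type: :test_pollution,
          severity: :high,
          message: "Test files belong in a test directory: #{path}"
        }
      ]
    else
      []
    end
  end

  # Assumed heuristic: the file stem starts with "test_" or ends with "_test".
  defp test_like?(file_name) do
    stem = Path.rootname(file_name)
    String.starts_with?(stem, "test_") or String.ends_with?(stem, "_test")
  end
end
```

The check only looks at the first path segment, so it assumes paths are given relative to the project root.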
| 101 | + |
| 102 | +### Example Detection |
| 103 | + |
| 104 | +```elixir |
| 105 | +# This would be caught |
| 106 | +def executable_path, do: "/home/user/.pyenv/shims/python3" |
| 107 | +# Error: Hardcoded user-specific path detected |
| 108 | + |
| 109 | +# This would be blocked |
| 110 | +# Creating: lib/my_test.ex |
| 111 | +# Error: Test files belong in test/ directory |
| 112 | + |
| 113 | +# This would trigger security alert |
| 114 | +System.cmd("chmod", ["777", file]) |
| 115 | +# Error: Overly permissive file permissions |
| 116 | +``` |
| 117 | + |
| 118 | +## Integration with Pipeline Safety |
| 119 | + |
| 120 | +The mistake detection system enhances the existing safety reviewer: |
| 121 | + |
| 122 | +```yaml |
| 123 | +# In pipeline configuration |
| 124 | +safety: |
| 125 | + reviewer: |
| 126 | + enabled: true |
| 127 | + |
| 128 | + mistake_detection: |
| 129 | + enabled: true |
| 130 | + learning: true |
| 131 | + |
| 132 | + patterns: |
| 133 | + - hardcoded_paths |
| 134 | + - test_pollution |
| 135 | + - assumption_cascade |
| 136 | +``` |
| 137 | + |
| 138 | +## Extensibility |
| 139 | + |
| 140 | +### Custom Detector |
| 141 | + |
| 142 | +```elixir |
| 143 | +defmodule MyDetector do |
| 144 | + @behaviour MistakeDetector |
| 145 | + |
| 146 | + def detect(action, _context) do |
| 147 | + if violates_my_rule?(action) do |
| 148 | + [%{type: :custom, severity: :high, message: "..."}] |
| 149 | + else |
| 150 | + [] |
| 151 | + end |
| 152 | + end |
| 153 | +end |
| 154 | +``` |
| 155 | + |
| 156 | +### Pattern DSL |
| 157 | + |
| 158 | +```elixir |
| 159 | +defpattern :my_pattern do |
| 160 | + match ~r{bad_pattern} |
| 161 | + severity :high |
| 162 | + message "Don't do this" |
| 163 | + auto_fix &fix_function/1 |
| 164 | +end |
| 165 | +``` |
| 166 | + |
| 167 | +## Monitoring |
| 168 | + |
| 169 | +Track effectiveness with built-in metrics: |
| 170 | + |
| 171 | +``` |
| 172 | +Mistakes Prevented: |
| 173 | +├── Hardcoded Paths: 47 |
| 174 | +├── Test Pollution: 12 |
| 175 | +├── Security Issues: 3 |
| 176 | +└── Context Violations: 28 |
| 177 | + |
| 178 | +Learning Progress: |
| 179 | +├── Patterns Created: 15 |
| 180 | +├── Detection Accuracy: 94% |
| 181 | +└── False Positive Rate: 3% |
| 182 | +``` |
| 183 | + |
| 184 | +## Implementation Status |
| 185 | + |
| 186 | +This is a specification for implementation. The system will be built in phases: |
| 187 | + |
| 188 | +1. **Phase 1**: Basic pattern detection (Week 1) |
| 189 | +2. **Phase 2**: Enhanced interventions (Week 2) |
| 190 | +3. **Phase 3**: Learning system (Weeks 3-4) |
| 191 | +4. **Phase 4**: Multi-model consensus (Weeks 5-6) |
| 192 | + |
| 193 | +## Why This Matters |
| 194 | + |
| 195 | +Without this system: |
| 196 | +- AI creates test files in production code |
| 197 | +- Hardcodes environment-specific paths |
| 198 | +- Makes unsafe assumptions |
| 199 | +- Violates project conventions |
| 200 | +- Repeats the same mistakes |
| 201 | + |
| 202 | +With this system: |
| 203 | +- Mistakes are caught before execution |
| 204 | +- AI learns from corrections |
| 205 | +- Patterns evolve and improve |
| 206 | +- Multiple models validate decisions |
| 207 | +- Developers maintain control |
| 208 | + |
| 209 | +## Next Steps |
| 210 | + |
| 211 | +1. Review and approve specifications |
| 212 | +2. Implement Phase 1 pattern detection |
| 213 | +3. Integrate with Step Reviewer |
| 214 | +4. Deploy to development environment |
| 215 | +5. Collect metrics and refine |
| 216 | + |
| 217 | +--- |
| 218 | + |
| 219 | +*"The only way to prevent AI stupidity is to have systems watching for our stupidity patterns."* |