|
| 1 | +# Semantic Search Audit Report - claude-parser v2.0.0 |
| 2 | +Generated: 2025-09-16T02:30:00Z |
| 3 | +Tool: semantic-search MCP service |
| 4 | + |
| 5 | +## Executive Summary |
| 6 | +- **Total Violations Found**: 30 |
| 7 | +- **Critical Security Issues**: 1 (test credentials in production) |
| 8 | +- **Architecture Violations**: 25 |
| 9 | +- **Dead Code Modules**: ~10 empty directories |
| 10 | +- **Coverage Impact**: Dead code is artificially lowering measured coverage from an estimated ~60% to 43.77% |
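
The coverage figures above imply a rough size for the dead code. The following is a back-of-the-envelope sketch, not a measurement: it assumes the dead modules contribute zero covered statements and that ~60% is the true coverage of the code that is actually used. The function name is hypothetical.

```python
def implied_dead_fraction(total_cov: float, used_cov: float) -> float:
    """Fraction of measured statements that must be dead (0% covered)
    for overall coverage to fall from used_cov to total_cov.

    Derivation: total_cov = covered / (used + dead) and
    used_cov = covered / used, so used / (used + dead) = total_cov / used_cov.
    """
    if not 0 < total_cov <= used_cov <= 1:
        raise ValueError("expected 0 < total_cov <= used_cov <= 1")
    return 1 - total_cov / used_cov


if __name__ == "__main__":
    # 43.77% is the measured figure from the report; 60% is the report's estimate.
    share = implied_dead_fraction(0.4377, 0.60)
    print(f"Implied share of dead statements: {share:.2%}")
```

Under those assumptions, a bit more than a quarter of the measured statements would be dead code, which is worth comparing against the actual statement counts once the directories are validated.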
| 11 | + |
| 12 | +## Critical Issues (Fix Immediately) |
| 13 | + |
| 14 | +### 1. 🚨 **Test Credentials in Production** |
| 15 | +```python |
| 16 | +# test_folder/config/settings.py - SHIPPED TO USERS! |
| 17 | +DEBUG = True |
| 18 | +API_KEY = "test-key-12345" |
| 19 | +DATABASE_URL = "sqlite:///test.db" |
| 20 | +``` |
| 21 | +**Risk**: HIGH - Exposed test credentials in PyPI package |
| 22 | +**Fix**: Delete the entire test_folder/ directory from the published package |
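
One way to confirm the fix is to inspect the built wheel before publishing. The sketch below is a minimal example, assuming the release artifact is a wheel (a zip archive); the function names are hypothetical, and the suspect paths are taken from this report.

```python
import zipfile
from pathlib import PurePosixPath

# Directory and file names flagged in this audit.
SUSPECT_DIRS = {"test_folder", "test_archive"}
SUSPECT_FILES = {"test_single.py", "verify_spec.py"}


def suspicious_members(names):
    """Return archive member names that sit under a flagged directory
    or match a flagged file name."""
    hits = []
    for name in names:
        path = PurePosixPath(name)
        in_suspect_dir = bool(SUSPECT_DIRS & set(path.parts[:-1]))
        if in_suspect_dir or path.name in SUSPECT_FILES:
            hits.append(name)
    return hits


def audit_wheel(wheel_path):
    """List flagged members of a built wheel (wheels are zip archives)."""
    with zipfile.ZipFile(wheel_path) as wheel:
        return suspicious_members(wheel.namelist())
```

An empty result from `audit_wheel` on the freshly built artifact suggests the test files are no longer being packaged; it says nothing about releases that were already uploaded.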
| 23 | + |
| 24 | +### 2. 🚨 **Test Files in Root Directory** |
| 25 | +``` |
| 26 | +test_single.py # Test file at root |
| 27 | +verify_spec.py # Verification script |
| 28 | +test_folder/ # Entire test directory |
| 29 | +test_archive/ # Archive directory |
| 30 | +``` |
| 31 | +**Risk**: MEDIUM - Unnecessary files increasing package size |
| 32 | +**Fix**: Move to tests/ or delete entirely |
| 33 | + |
| 34 | +## Architecture Violations by Pattern |
| 35 | + |
| 36 | +### @CUSTOM_CODE_ANTIPATTERN (5 files) |
| 37 | +Files implementing custom logic instead of using framework APIs: |
| 38 | + |
| 39 | +| File | Issue | Suggested Fix | |
| 40 | +|------|-------|--------------| |
| 41 | +| hooks/handlers.py | Lambda handlers with custom logic | Use Typer commands directly | |
| 42 | +| hooks/executor.py | Manual execution flow | Use framework's built-in executor | |
| 43 | +| test_single.py | Should not exist in production | Delete | |
| 44 | +| verify_spec.py | Should not exist in production | Delete | |
| 45 | + |
| 46 | +### @FRAMEWORK_BYPASS (5 files) |
| 47 | +Direct library imports instead of using facades: |
| 48 | + |
| 49 | +| File | Direct Import | Should Use | |
| 50 | +|------|---------------|------------| |
| 51 | +| hooks/request.py | `import json` directly | Use pydantic for JSON | |
| 52 | +| analytics/core.py | Direct pandas usage | Use polars (already in deps) | |
| 53 | +| queries/session_queries.py | Direct DuckDB SQL | Use query builder | |
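
Findings like these can be re-checked mechanically instead of by reading each file. The sketch below uses the standard `ast` module to list direct imports of modules a project has decided to route through facades. The `BANNED` set is an assumption based on the table above, not a confirmed project policy, and the function name is hypothetical.

```python
import ast

# Assumed policy, taken from the table above: these should go through facades.
BANNED = {"json", "pandas", "duckdb"}


def direct_imports(source, banned=BANNED):
    """Return sorted (line number, module) pairs for absolute imports
    whose top-level package is in `banned`."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            modules = [node.module]
        else:
            continue  # not an import, or a relative import
        for module in modules:
            if module.split(".")[0] in banned:
                found.append((node.lineno, module))
    return sorted(found)
```

Running it over each file's text (for example `direct_imports(Path("hooks/request.py").read_text())`) gives line numbers to review; whether a given import is really a violation remains a judgment call.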
| 54 | + |
| 55 | +### @LOC_ENFORCEMENT (5 potential violations) |
| 56 | +Files potentially exceeding 80 lines: |
| 57 | + |
| 58 | +| File | Status | Action | |
| 59 | +|------|--------|--------| |
| 60 | +| models/__init__.py | Need to verify | Check actual LOC | |
| 61 | +| hooks/handlers.py | Need to verify | Split if >80 | |
| 62 | +| hooks/app.py | Need to verify | Split if >80 | |
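
The "Need to verify" entries can be settled with a short script. This is a minimal sketch that counts physical lines (roughly what `wc -l` reports) and assumes the 80-line limit stated above; whether blank lines and comments should count is a policy question the report does not answer. Function names are hypothetical.

```python
from pathlib import Path


def count_lines(text: str) -> int:
    """Count physical lines, including blank lines and comments."""
    return len(text.splitlines())


def loc_violations(root, limit=80):
    """Map each .py file under `root` (relative path) to its line count,
    keeping only files longer than `limit`."""
    root = Path(root)
    violations = {}
    for path in sorted(root.rglob("*.py")):
        lines = count_lines(path.read_text(encoding="utf-8"))
        if lines > limit:
            violations[str(path.relative_to(root))] = lines
    return violations


if __name__ == "__main__":
    for name, lines in loc_violations("claude_parser").items():
        print(f"{lines:5d}  {name}")
```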
| 63 | + |
| 64 | +### @UTIL_FIRST_VIOLATION (5 files) |
| 65 | +Creating custom utilities without checking for existing ones: |
| 66 | + |
| 67 | +| File | Custom Utility | Existing Alternative | |
| 68 | +|------|----------------|---------------------| |
| 69 | +| hooks/extraction.py | Custom dict extraction | Use glom | |
| 70 | +| operations/core.py | File operations | Use pathlib | |
| 71 | +| hooks/handlers.py | Response handling | Use Typer | |
| 72 | + |
| 73 | +### @DRY_VIOLATION_ANTIPATTERN (5 files) |
| 74 | +Duplicate logic across files: |
| 75 | + |
| 76 | +| Duplicate Pattern | Files | Extract To | |
| 77 | +|-------------------|-------|------------| |
| 78 | +| Hook event handling | handlers.py, app.py | Single handler | |
| 79 | +| File operations | diff_ops.py, file_ops.py | Unified ops | |
| 80 | +| Session loading | Multiple files | Single loader | |
| 81 | + |
| 82 | +## Dead Code Analysis |
| 83 | + |
| 84 | +### Empty/Unused Directories |
| 85 | +``` |
| 86 | +claude_parser/ |
| 87 | +├── domain/ # EMPTY - only __pycache__ |
| 88 | +│ ├── delegates/ # EMPTY |
| 89 | +│ ├── entities/ # EMPTY |
| 90 | +│ ├── filters/ # EMPTY |
| 91 | +│ ├── interfaces/ # EMPTY |
| 92 | +│ ├── services/ # EMPTY |
| 93 | +│ └── value_objects/# EMPTY |
| 94 | +├── application/ # LIKELY UNUSED |
| 95 | +├── infrastructure/ # LIKELY UNUSED |
| 96 | +└── utils/ # LIKELY UNUSED |
| 97 | +``` |
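
The "EMPTY" labels above can be verified before anything is deleted. The sketch below treats a directory as effectively empty when it holds no files and all of its subdirectories (ignoring `__pycache__`) are effectively empty as well. It takes the output of a bottom-up `os.walk`; the function name is hypothetical.

```python
import os


def effectively_empty_dirs(walk_entries):
    """Return directories with no files anywhere below them, ignoring
    __pycache__. `walk_entries` must be bottom-up, e.g.
    os.walk(top, topdown=False), so children are seen before parents."""
    empty = set()
    for dirpath, dirnames, filenames in walk_entries:
        if os.path.basename(dirpath) == "__pycache__":
            continue
        children = [
            os.path.join(dirpath, name)
            for name in dirnames
            if name != "__pycache__"
        ]
        if not filenames and all(child in empty for child in children):
            empty.add(dirpath)
    return sorted(empty)


if __name__ == "__main__":
    for path in effectively_empty_dirs(os.walk("claude_parser", topdown=False)):
        print(path)
```

A directory that contains only an `__init__.py` is not reported, so the "LIKELY UNUSED" packages still need the import checks from the validation section.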
| 98 | + |
| 99 | +### Unused Modules (0% coverage) |
| 100 | +- analytics/billing.py |
| 101 | +- analytics/litellm_adapter.py |
| 102 | +- Several others with <20% coverage |
| 103 | + |
| 104 | +## Remediation Plan |
| 105 | + |
| 106 | +### Phase 1: Critical Security (v2.0.1) - TODAY |
| 107 | +```bash |
| 108 | +# 1. Remove test files from production |
| 109 | +rm -rf test_folder/ test_archive/ test_single.py verify_spec.py |
| 110 | + |
| 111 | +# 2. Clean git history on all branches and tags (this does not remove files from releases already on PyPI) |
| 112 | +git filter-branch --index-filter 'git rm -r --cached --ignore-unmatch test_folder' -- --all |
| 113 | + |
| 114 | +# 3. Release v2.0.1 immediately |
| 115 | +./publish.sh # Bump to 2.0.1 |
| 116 | +``` |
| 117 | + |
| 118 | +### Phase 2: Clean Architecture (v2.1.0) - This Week |
| 119 | +```bash |
| 120 | +# 1. Remove empty domain folders and the likely-unused packages (only after the import checks confirm nothing references them) |
| 121 | +rm -rf claude_parser/domain/ |
| 122 | +rm -rf claude_parser/application/ |
| 123 | +rm -rf claude_parser/infrastructure/ |
| 124 | +rm -rf claude_parser/utils/ |
| 125 | + |
| 126 | +# 2. Fix framework bypasses |
| 127 | +# - Replace direct imports with facades |
| 128 | +# - Use existing libraries (glom, polars, etc.) |
| 129 | + |
| 130 | +# 3. Extract duplicate logic |
| 131 | +# - Create single hook handler |
| 132 | +# - Unify file operations |
| 133 | +``` |
| 134 | + |
| 135 | +### Phase 3: Improve Coverage (v2.2.0) - Next Week |
| 136 | +```bash |
| 137 | +# 1. Remove dead code from coverage calculation |
| 138 | +# 2. Add black box tests for actual API |
| 139 | +# 3. Target: 80% coverage of USED code |
| 140 | +``` |
| 141 | + |
| 142 | +## Validation Checklist |
| 143 | + |
| 144 | +Before proceeding, validate: |
| 145 | +- [ ] Confirm test_folder/ contains no production code |
| 146 | +- [ ] Verify domain/ folders are truly empty |
| 147 | +- [ ] Check if any imports reference deleted modules |
| 148 | +- [ ] Ensure CI/CD still passes after cleanup |
| 149 | +- [ ] Confirm package still installs correctly |
| 150 | + |
| 151 | +## Commands to Validate |
| 152 | + |
| 153 | +```bash |
| 154 | +# Check what's actually imported |
| 155 | +grep -rnE "(from|import) +claude_parser\.(domain|application|infrastructure|utils)" claude_parser/ |
| 156 | +grep -rnE "from +\.+(domain|application|infrastructure|utils)" claude_parser/ |
| 158 | + |
| 159 | +# Check file line counts (20 longest files) |
| 160 | +find claude_parser -name "*.py" -exec wc -l {} \; | sort -rn | head -20 |
| 161 | + |
| 162 | +# Test package without dead code |
| 163 | +pip install -e . |
| 164 | +python -c "from claude_parser import load_session; print('✅')" |
| 165 | +``` |
| 166 | + |
| 167 | +## Next Steps |
| 168 | + |
| 169 | +1. **Review this report** with semantic-search findings |
| 170 | +2. **Validate** each finding manually |
| 171 | +3. **Execute Phase 1** immediately (security fix) |
| 172 | +4. **Plan Phase 2** based on validation results |
| 173 | +5. **Track progress** in this document |
| 174 | + |
| 175 | +## Metrics to Track |
| 176 | + |
| 177 | +| Metric | Current | Target | After Cleanup | |
| 178 | +|--------|---------|--------|---------------| |
| 179 | +| Package Size | ? MB | <1 MB | TBD | |
| 180 | +| File Count | 64 | <40 | TBD | |
| 181 | +| Coverage | 43.77% | 80% | TBD | |
| 182 | +| Violations | 30 | 0 | TBD | |
| 183 | +| Dead Folders | ~10 | 0 | TBD | |
| 184 | + |
| 185 | +--- |
| 186 | +*This audit was generated using semantic-search MCP service - first production use* |