Commit 79d29c4
fix: enrichment application and contract extraction fixes (#94)
* fix: enrichment application and contract extraction improvements
- Fix enrichment not being applied when no source files changed
  - Check for enrichment before early exit in _check_incremental_changes
  - Mark bundle for regeneration when enrichment is provided
  - Ensure bundle is saved after enrichment is applied
- Fix --force flag performance regression
  - Skip hash checking when --force is used for contract extraction
  - Process all features directly without expensive hash computation
  - Significantly improves performance for large bundles with --force
- Fix type checking errors
  - Fix possibly unbound variables (is_test_mode, compute_file_hash, features_with_files)
  - Properly scope variables within conditional blocks
  - Ensure all variables are initialized before use
- Add comprehensive test coverage
  - Add integration tests for enrichment and contract extraction bugs
  - Add unit tests for contract extraction logic
  - All tests now passing (11/11 integration tests)
- Update version to 0.23.1
  - Sync version across pyproject.toml, setup.py, src/__init__.py, src/specfact_cli/__init__.py
  - Update CHANGELOG.md with bug fixes and test coverage additions
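
The ordering fix for enrichment and `--force` can be sketched as follows. This is a minimal illustration, not the project's code: `IncrementalPlan` and `check_incremental_changes` are hypothetical names, and the real `_check_incremental_changes` almost certainly takes different arguments.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class IncrementalPlan:
    """Hypothetical result type, used only for this sketch."""
    regenerate_bundle: bool
    reason: str


def check_incremental_changes(
    changed_files: List[str], enrichment: Optional[str], force: bool
) -> IncrementalPlan:
    # --force: process everything directly, so no hash comparison is needed.
    if force:
        return IncrementalPlan(True, "forced")
    # Enrichment is checked BEFORE the "nothing changed" early exit,
    # so it is applied even when no source file was modified.
    if enrichment is not None:
        return IncrementalPlan(True, "enrichment provided")
    if not changed_files:
        return IncrementalPlan(False, "no changes")
    return IncrementalPlan(True, "source files changed")
```

With the enrichment check placed after the early exit instead, the second branch would never be reached for an unchanged source tree, which is the bug described above.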
* fix: keep progress bars visible during enhanced analysis
- Remove progress.remove_task() calls for relationship and graph analysis
- Keep final progress bars visible with completion state instead of removing them
- Prevents blank lines from appearing when progress bars disappear
- Progress bars now show final completion message and remain visible
* feat: show current feature/contract in contract extraction progress
- Add detailed progress updates showing which feature is currently being processed
- For sequential mode: show feature name before and after processing
- For parallel mode: show completed feature name and pending count
- Progress now displays: 'Extracting contract from FEATURE-NAME... (X/Total, Y pending)'
- Improves visibility during long-running contract extraction operations
- Helps identify which features are taking longer to process
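
A rough sketch of the parallel-mode reporting, assuming a thread pool and a plain callback. `extract_all`, `extract`, and `report` are hypothetical names; only the message format is taken from the description above.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def extract_all(features, extract, report, max_workers=4):
    """Run `extract` for each feature and report progress as futures finish."""
    total = len(features)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each future back to its feature name so it can be reported.
        futures = {pool.submit(extract, name): name for name in features}
        for done, future in enumerate(as_completed(futures), start=1):
            name = futures[future]
            future.result()  # re-raise any extraction error
            pending = total - done
            report(f"Extracting contract from {name}... ({done}/{total}, {pending} pending)")
```

Features are reported in completion order, not submission order, so a feature that never appears in the output while the pending count stalls is a likely slow one.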
* docs: add contract extraction performance analysis
- Document current performance bottlenecks
- Identify AST parsing as primary bottleneck
- Propose file-level caching optimization (3-5x speedup)
- Suggest batch processing and early exit optimizations
- Estimate 5-10x overall speedup potential
* perf: implement AST caching and early exit optimizations for contract extraction
- Add file-level AST caching to prevent redundant parsing (3-5x speedup)
  - Cache AST trees and file hashes for reuse across features
  - Invalidate cache when file content changes
  - Thread-safe cache operations
- Add early exit optimization for non-API files (1.5-2x speedup)
  - Quick regex check before expensive AST parsing
  - Skip files without API endpoints (models, utilities)
  - Pre-compiled regex patterns for performance
- Add comprehensive tests for optimizations:
  - Test AST caching prevents redundant parsing
  - Test early exit skips non-API files
  - Test cache invalidation on file changes

Expected overall improvement: 5-10x speedup for contract extraction
For SQLAlchemy (320 features): ~8 minutes -> ~45-90 seconds
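
One way a file-level AST cache with hash-based invalidation might look. The class name `AstCache`, the SHA-256 choice, and the `parse_count` counter are assumptions made for this sketch; the actual extractor may store and key its cache differently.

```python
import ast
import hashlib
import threading
from pathlib import Path
from typing import Dict, Tuple


class AstCache:
    """Hypothetical cache mapping a file path to (content hash, parsed tree)."""

    def __init__(self) -> None:
        self._entries: Dict[Path, Tuple[str, ast.AST]] = {}
        self._lock = threading.Lock()
        self.parse_count = 0  # exposed only so the behaviour can be observed

    def get_tree(self, path: Path) -> ast.AST:
        # Read once; the same text is used for hashing and for parsing.
        source = path.read_text(encoding="utf-8")
        digest = hashlib.sha256(source.encode("utf-8")).hexdigest()
        with self._lock:
            cached = self._entries.get(path)
            if cached is not None and cached[0] == digest:
                return cached[1]  # content unchanged: reuse the parsed tree
            # New file or changed content: parse and replace the stale entry.
            tree = ast.parse(source, filename=str(path))
            self.parse_count += 1
            self._entries[path] = (digest, tree)
            return tree
```

Several features that reference the same source file then share one parse, which is where the speedup claimed above would come from. This simple form still parses while holding the lock.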
* fix: disable aggressive early exit that skipped all SQLAlchemy files
- Early exit optimization was too aggressive for ORM/class-based codebases
- SQLAlchemy doesn't use FastAPI/Flask decorators, so all files were skipped
- Contract extractor also processes class-based APIs and interfaces
- Disabled early exit to restore functionality (AST caching still provides 3-5x speedup)
- Updated test to reflect that early exit detection works but is disabled in extraction
Fixes: 0 contracts generated when using --force flag
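
To illustrate why a decorator-based pre-check skips class-based code, here is a small sketch. The pattern `ROUTE_DECORATOR` is an assumed stand-in for the real pre-compiled patterns, which are not shown in this commit message.

```python
import re

# Assumed pattern: matches FastAPI/Flask-style route decorators only.
ROUTE_DECORATOR = re.compile(
    r"^\s*@\w+\.(get|post|put|patch|delete|route)\(", re.MULTILINE
)


def looks_like_api_file(source: str) -> bool:
    """Cheap textual check intended to run before AST parsing."""
    return ROUTE_DECORATOR.search(source) is not None


# A decorator-based endpoint is detected...
flask_like = '@app.route("/users")\ndef users():\n    return []\n'
# ...but a class-based API, as found in an ORM, has no decorator to match.
orm_like = "class Session:\n    def get(self, entity, ident):\n        ...\n"
```

Under this assumption every file in an ORM-style codebase resembles `orm_like`, so a hard early exit on the check would drop all of them.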
* perf: optimize class-based extraction to skip non-API classes and limit method processing
- Skip non-API class types: Protocol, TypedDict, Enum, ABC, Mixin, Base, Meta, Descriptor, Property
- Skip classes that inherit from non-API base types
- Filter methods more selectively: skip utility methods (processor, adapter, factory, etc.)
- Limit methods processed per class to 15 (skip classes with more methods)
- Only process methods that strongly suggest API endpoints (CRUD patterns or short names)
Performance improvements:
- FEATURE-TYPERESOLVE: Skips Protocol/TypedDict classes and TypeEngine utility methods
- FEATURE-COLLECTIONADAPTER: Skips non-API classes, processes only relevant methods
- Reduces processing time for large ORM/library codebases
Expected improvement: 2-3x faster for features with many utility classes
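
The class filter could be approximated as below. The marker list and the limit of 15 come from the description above; matching markers by name suffix and the helper names are assumptions of this sketch.

```python
import ast
from typing import List

NON_API_MARKERS = (
    "Protocol", "TypedDict", "Enum", "ABC", "Mixin",
    "Base", "Meta", "Descriptor", "Property",
)
MAX_METHODS_PER_CLASS = 15


def base_names(node: ast.ClassDef) -> List[str]:
    """Return the simple names of a class's bases (Name or dotted Attribute)."""
    names = []
    for base in node.bases:
        if isinstance(base, ast.Name):
            names.append(base.id)
        elif isinstance(base, ast.Attribute):
            names.append(base.attr)
    return names


def should_process_class(node: ast.ClassDef) -> bool:
    # Assumption: a class is "non-API" if its own name or any base name
    # ends with one of the markers (e.g. Color(Enum), DeclarativeBase).
    candidates = [node.name, *base_names(node)]
    if any(name.endswith(marker) for name in candidates for marker in NON_API_MARKERS):
        return False
    methods = [
        item for item in node.body
        if isinstance(item, (ast.FunctionDef, ast.AsyncFunctionDef))
    ]
    # Classes with more methods than the limit are skipped entirely.
    return len(methods) <= MAX_METHODS_PER_CLASS
```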
* Apply format
* fix: resolve CI test failures and deprecation warnings
- Fix interface extraction: Check for interfaces (ABC/Protocol with abstract methods) BEFORE skipping base classes
  - Interfaces should be processed for contract extraction
  - Non-interface ABC/Protocol classes are still skipped for performance
- Fix progress callback tests: Update tests to expect two calls (total, then completed+description)
  - Progress callback now sets total on first call, then updates with completed count
- Fix deprecation warnings:
  - Suppress ast.NameConstant deprecation warning (Python 3.8+ compatibility)
  - Replace datetime.utcnow() with datetime.now(UTC) for Python 3.11+ compatibility
  - Use timezone.utc fallback for older Python versions

Fixes:
- test_extract_interface_abstract_methods (was skipping ABC interfaces)
- test_create_callback_with_prefix (expected single call, got two)
- test_create_callback_without_prefix (expected single call, got two)
- DeprecationWarning: ast.NameConstant (Python 3.14)
- DeprecationWarning: datetime.utcnow() (future removal)
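
A sketch of the "interface check first" ordering, assuming that an interface is recognised by an `@abstractmethod` decorator on at least one method. The function names are hypothetical.

```python
import ast


def has_abstract_methods(node: ast.ClassDef) -> bool:
    """True if any method carries an @abstractmethod (or abc.abstractmethod) decorator."""
    for item in node.body:
        if isinstance(item, (ast.FunctionDef, ast.AsyncFunctionDef)):
            for deco in item.decorator_list:
                name = deco.id if isinstance(deco, ast.Name) else getattr(deco, "attr", None)
                if name == "abstractmethod":
                    return True
    return False


def inherits_from(node: ast.ClassDef, names) -> bool:
    for base in node.bases:
        base_name = base.id if isinstance(base, ast.Name) else getattr(base, "attr", None)
        if base_name in names:
            return True
    return False


def should_extract(node: ast.ClassDef) -> bool:
    abc_like = inherits_from(node, {"ABC", "Protocol"})
    # The interface check runs BEFORE the base-class skip.
    if abc_like and has_abstract_methods(node):
        return True
    # Non-interface ABC/Protocol classes are still skipped.
    return not abc_like
```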
* fix: correct UTC import for type checking
- Import timezone before try/except block to ensure UTC is defined
- Add type: ignore comment to suppress false positive type checker warning
- Fixes type checker error: 'UTC' is unbound in except block
- Maintains backward compatibility with Python < 3.11
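
The import pattern described here typically looks like the following. `utc_now` is a hypothetical helper added for illustration; `datetime.UTC` exists only on Python 3.11 and later.

```python
from datetime import datetime, timezone  # timezone imported first so the fallback is always defined

try:
    from datetime import UTC  # available on Python 3.11+
except ImportError:
    UTC = timezone.utc  # type: ignore  # fallback for older interpreters


def utc_now() -> datetime:
    """Timezone-aware replacement for the deprecated datetime.utcnow()."""
    return datetime.now(UTC)
```

Unlike `datetime.utcnow()`, the returned value carries `tzinfo`, so comparing it with naive datetimes raises a `TypeError` instead of giving a silently wrong result.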
* perf: reduce lock contention in contract extraction for parallel processing
Critical performance fix for large feature sets (320+ features):
1. **Moved file I/O outside lock**: File reading and hash calculation now happen outside the lock, eliminating I/O-bound blocking
   - Lock only held for cache lookups and updates (minimal scope)
   - Double-check pattern prevents race conditions
2. **Removed unnecessary locks from openapi_spec writes**:
   - Each feature has its own openapi_spec dict (no sharing)
   - In CPython, a single dict item assignment is atomic under the GIL
   - Removed locks from: path initialization, schema addition, security schemes, operation addition
   - Only cache operations (shared across features) use lock now
3. **Separated cache lock**: Renamed _lock to _cache_lock for clarity
   - Cache is shared resource (needs protection)
   - openapi_spec dicts are per-feature (no shared lock needed)

Performance impact:
- Before: Lock held during file I/O (10-100ms per file) blocks all other threads
- After: Lock only held for cache access (<1ms), file I/O happens in parallel
- Expected: 5-10x faster for 320 features with parallel processing

This fixes the 3-hour extraction time for 320 contracts by eliminating the lock contention bottleneck.
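
The double-check pattern with a narrow lock scope can be sketched generically. Only the name `_cache_lock` is taken from this commit; `SharedCache`, `get_or_compute`, and the `compute` callable are hypothetical.

```python
import threading


class SharedCache:
    """Sketch: the lock guards dictionary access only, never the slow work."""

    def __init__(self):
        self._cache_lock = threading.Lock()
        self._values = {}

    def get_or_compute(self, key, compute):
        # First check: a brief lookup under the lock.
        with self._cache_lock:
            if key in self._values:
                return self._values[key]
        # Slow work (file read, hashing, parsing) runs WITHOUT the lock,
        # so other threads are not blocked while it is in progress.
        value = compute(key)
        # Second check: another thread may have stored the key meanwhile;
        # setdefault keeps the first stored value so all callers agree.
        with self._cache_lock:
            return self._values.setdefault(key, value)
```

The trade-off is that two threads may occasionally compute the same key concurrently. That duplicated work is usually much cheaper than serialising every file read behind one lock.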
* perf: optimize AST cache to avoid redundant file reads
- Reuse file content when checking hash vs parsing
- Read file only once per cache check/parse cycle
- Reduces I/O operations by 50% for cache hits
- Maintains thread safety with minimal lock scope
This addresses the performance issue where contract extraction
was extremely slow (5+ hours) by eliminating redundant file I/O.
* docs: update CHANGELOG for 0.23.1 with contract extraction performance fixes
* Fix slowness bug in contract extraction
* fix: correct SourceTracking import in profiling script
- Changed import from specfact_cli.models.project to specfact_cli.models.source_tracking
- Fixes CrossHair import error in contract validation workflow
- Resolves ImportError: cannot import name 'SourceTracking'
* Fix parallel processing of contract analysis
* Revert venv config
---------
Co-authored-by: Dominikus Nold <[email protected]>
1 parent 1f88cb7 commit 79d29c4
File tree
18 files changed, +2226 -189 lines changed
- src
  - specfact_cli
    - analyzers
    - commands
    - generators
    - sync
    - utils
- tests
  - integration/commands
  - unit
    - commands
    - generators
    - utils
- tools