refactor(docs): code analysis engine
- docs: update TODO.md
- docs: update context.md
- docs: update context_final.md
- docs: update context_fixed.md
- docs: update fast_analysis_report.md
- config: update goal.yaml
- update debug/.code2flow_cache/__init___067a3ea9a806bdcd.pkl
- update debug/.code2flow_cache/__init___06ee3b304cbac344.pkl
- update debug/.code2flow_cache/__init___07004ae5fc0b63a4.pkl
- update debug/.code2flow_cache/__init___092c164e1ea3ed2a.pkl
- update debug/.code2flow_cache/__init___1306939d2650ad0a.pkl
- update debug/.code2flow_cache/__init___1435b739d4a93c01.pkl
- update debug/.code2flow_cache/__init___1a3f34073e505d94.pkl
- update debug/.code2flow_cache/__init___20b71d7ad5e01760.pkl
- update debug/.code2flow_cache/__init___385814d063e205eb.pkl
- update debug/.code2flow_cache/__init___563e1960e3f8fe02.pkl
- ... and 222 more
All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
-
Optimized ProjectAnalyzer with caching and parallel processing
- FileCache with TTL support for AST parsing results
- Parallel file analysis using ProcessPoolExecutor
- Configurable performance settings (FAST_CONFIG, DETAILED_CONFIG)
- Memory-efficient depth limiting for CFG generation
-
Enhanced Filtering
- FastFileFilter with glob pattern matching
- Exclude test files, private methods, properties
- Min function lines threshold
- Configurable include/exclude patterns
-
Comprehensive Code Model
- FlowNode, FlowEdge for CFG representation
- FunctionInfo, ClassInfo, ModuleInfo with metadata
- Pattern detection (recursion, state machines)
- Compact JSON/YAML output with optional full details
-
Query Normalization (1a-1e)
- Unicode NFKC normalization
- Lowercase conversion
- Punctuation removal
- Whitespace normalization
- Stopword removal (multilingual)
-
Intent Matching (2a-2e)
- Fuzzy matching with configurable algorithms
- Keyword matching with weighted scoring
- Context window scoring for disambiguation
- Multi-intent resolution strategies (best_match, combine, sequential)
-
Entity Resolution (3a-3e)
- Type-based entity extraction (function, class, module, variable, file)
- Name matching with similarity threshold
- Context-aware disambiguation
- Hierarchical resolution (Class.method -> method)
- Alias resolution (short -> qualified names)
-
Pipeline Integration (4a-4e)
- Orchestration with stage tracking
- Result aggregation and confidence scoring
- Fallback handling for low-confidence queries
- Formatted output with action recommendations
-
Multilingual Support
- English and Polish query support
- Cross-language fuzzy matching
- Language-specific stopwords
- YAML-driven configuration
- JSON Exporter - Machine-readable analysis output
- YAML Exporter - Human-readable with compact/full modes
- Mermaid Exporter - Flowchart and call graph visualization
- LLMPromptExporter - LLM-ready analysis summaries
- GraphVisualizer - NetworkX/matplotlib PNG generation
- Improved argument parsing with subcommands
- Automatic PNG generation from Mermaid files
- LLM flow generation command
- Verbose output with progress reporting
- Multiple output format support
-
Comprehensive Test Suite
- Unit tests for all core components
- Edge case tests (empty projects, syntax errors, unicode)
- Performance benchmarks
- Integration tests (NLP + Analysis workflow)
- NLP pipeline tests (steps 1a-4e validation)
-
Benchmarking
- Performance tests for large projects
- Cache effectiveness measurement
- Parallel vs sequential comparison
- Memory usage validation
- Complete API documentation
- Usage examples and tutorials
- Performance optimization guide
- Multilingual query examples
- Configuration reference
- Refactored monolithic flow.py into modular package structure
- Improved error handling throughout codebase
- Enhanced type hints for better IDE support
- Updated setup.py for PyPI publication readiness
- Import errors in CLI module
- Attribute mismatches between models and exporters
- Parallel processing pickle compatibility issues
- FlowEdge attribute access (condition -> conditions)
- Initial project structure
- Basic AST-based code analysis
- Control flow graph generation
- Call graph extraction
- Pattern detection (recursion, loops)
- Mermaid diagram export
- Command-line interface
- Initial test suite
- Semantic code search using embeddings
- Advanced pattern detection (factory, singleton, observer)
- Interactive web UI (Streamlit/Gradio)
- VS Code extension
- Support for additional languages (JavaScript, TypeScript)
- Real-time code analysis via file watching
- Integration with Git for diff analysis
- Custom pattern definition via YAML
- Plugin system for third-party extensions
- Docker container for easy deployment
- Complete API stability
- Comprehensive security audit
- Enterprise features (SSO, audit logs)
- Performance optimizations for 100k+ LOC projects
- Full documentation with video tutorials
Please report issues and suggest features via GitHub Issues.
Developed by the STTS Project team.