All notable changes to this repository are documented here.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
- Business Logic Persistence — `ReverseEngineeringProcess` and `ChunkedReverseEngineeringProcess` now persist extracted `BusinessLogic` records to a new `business_logic` SQLite table via `IMigrationRepository.SaveBusinessLogicAsync`. Added `GetBusinessLogicAsync` and `DeleteBusinessLogicAsync` to `IMigrationRepository`, `SqliteMigrationRepository`, and `HybridMigrationRepository`.
- Business Logic Injection into Conversion Prompts — All four converter agents (`JavaConverterAgent`, `CSharpConverterAgent`, `ChunkAwareJavaConverter`, `ChunkAwareCSharpConverter`) now receive extracted `BusinessLogic` records via `SetBusinessLogicContext()` (new method on `ICodeConverterAgent`). In full-pipeline runs, `SmartMigrationOrchestrator` wires RE output directly into conversion; `--reuse-re` loads the same context from a previous persisted RE run. A shared `FormatBusinessLogicContext()` helper in `AgentBase` formats the context for all four converters.
- `--reuse-re` CLI flag — When combined with `--skip-reverse-engineering`, loads business logic from the latest persisted RE run and injects it into conversion prompts. `doctor.sh convert-only` now prompts interactively for this choice.
- REST API: `GET`/`DELETE /api/runs/{runId}/business-logic` — Returns per-file business logic summary (story/feature/rule counts); DELETE removes persisted results to allow re-running RE for that run.
- Portal: per-run 🔬 RE Results button — Shows the business logic summary table for a run and allows deletion of persisted results directly from the UI.
- RE Results in Portal Chat — Chat endpoint injects business purpose, user stories, features, and business rules from the `business_logic` table into the AI prompt context. Updated AI system prompt accordingly.
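As a sketch, the business-logic endpoint could be exercised from a shell like this. `RUN_ID` is a placeholder, and 5028 is the portal port noted elsewhere in this changelog; the `curl` calls are shown commented out since they need a running portal:

```shell
# Placeholder run ID; substitute a real one (e.g. from /api/runs/all).
RUN_ID="run-12345"
BASE="http://localhost:5028/api/runs/$RUN_ID/business-logic"
echo "$BASE"
# Per-file story/feature/rule counts:
#   curl -s "$BASE"
# Remove persisted RE results so RE can be re-run for this run:
#   curl -s -X DELETE "$BASE"
```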
- Empty Technical Analysis in RE output — `ReverseEngineeringProcess` and `ChunkedReverseEngineeringProcess` now fall back to rendering `RawAnalysisData` when structured `CobolAnalysis` fields are unpopulated.
- Total Features always 0 — `BusinessLogicExtractorAgent.ExtractFeatures()` now matches `### Use Case N:` and `### Operation` headings in addition to `### Feature:`, reflecting the actual AI prompt output.
- Dependency mapping runs once per full run — RE processes (`ReverseEngineeringProcess`, `ChunkedReverseEngineeringProcess`) now include a dedicated dependency mapping step (step 4/5) and store the result on `ReverseEngineeringResult.DependencyMap`. `MigrationProcess` and `ChunkedMigrationProcess` accept a `SetDependencyMap()` call and skip `AnalyzeDependenciesAsync` when a map is already provided. `SmartMigrationOrchestrator.RunAsync` threads `existingDependencyMap` through to both migration paths. Dependency output files (`dependency-map.json`, `dependency-diagram.md`) are now generated in the RE output folder as well as the migration output folder.
- `doctor.sh` — Updated `convert-only` to prompt for `--reuse-re`; corrected portal navigation references to match current UI ('📄 Reverse Engineering Results').
- Automated Documentation Checker — New GitHub Actions workflow (`documentation-updater`) that reviews code changes on every push and PR to `main`, identifies missing or outdated documentation, and notifies the responsible author via PR comments or issues.
- Speed Profile Selection - New interactive prompt in `doctor.sh` lets you choose between four speed profiles before running migrations, reverse engineering, or conversion-only:
  - TURBO — Low reasoning on ALL files with no exceptions. 65K token ceiling, parallel file conversion (4 workers), 200ms stagger delay. Designed for testing and smoke runs where speed matters more than quality.
  - FAST — Low reasoning on most files, medium only on the most complex ones. 32K token cap, parallel conversion (3 workers), 500ms stagger. Good for quick iterations and proof-of-concept runs.
  - BALANCED (default) — Uses the three-tier content-aware reasoning system. Simple files get low effort, complex files get high effort. Parallel conversion (2 workers), 1s stagger.
  - THOROUGH — Maximum reasoning on all files regardless of complexity. Parallel conversion (2 workers), 1.5s stagger. Best for critical codebases where accuracy matters more than speed.
- Shared `select_speed_profile()` function — Called from `run_migration()`, `run_reverse_engineering()`, and `run_conversion_only()`. Sets `CODEX_*` environment variables that are picked up by `OverrideSettingsFromEnvironment()` in `Program.cs` at startup — no C# changes needed.
- Adaptive Re-Chunking on Output Exhaustion — When reasoning exhaustion retries fail (all escalation attempts exhausted), `AgentBase` now automatically splits the COBOL source at the best semantic boundary (DIVISION > SECTION > paragraph > midpoint) and processes each half independently with a 50-line context window (the second half begins 50 lines before the split point for continuity). Results are merged with duplicate package/import/class removal and validated for truncation signals. This solves the TURBO/FAST paradox, where small output token caps caused repeated exhaustion failures rather than triggering the existing input-size-based chunking.
- Parallel File Conversion — All four converter agents (`ChunkAwareJavaConverter`, `ChunkAwareCSharpConverter`, `JavaConverterAgent`, `CSharpConverterAgent`) now support parallel file conversion via `SemaphoreSlim`-based concurrency control. Controlled by the `MaxParallelConversion` setting (default: 2). TURBO uses 4 workers, FAST uses 3, BALANCED/THOROUGH use 2.
- Environment Variable Overrides for Timing — New env vars `CODEX_STAGGER_DELAY_MS`, `CODEX_MAX_PARALLEL_CONVERSION`, and `CODEX_RATE_LIMIT_SAFETY_FACTOR` allow fine-tuning of parallelism and rate limiting without code changes.
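The profile-to-env-var wiring could be sketched as below. This is a hypothetical, non-interactive stand-in for `select_speed_profile()` (the real `doctor.sh` prompts interactively); the variable names are the `CODEX_*` vars listed in this changelog, and the per-profile values are the worker counts, stagger delays, and TURBO safety factor quoted in the entries above:

```shell
# Hypothetical helper, not the actual doctor.sh implementation.
set_speed_profile() {
  case "$1" in
    # Per-profile values taken from the changelog entries above.
    TURBO)    export CODEX_MAX_PARALLEL_CONVERSION=4 CODEX_STAGGER_DELAY_MS=200 \
                     CODEX_RATE_LIMIT_SAFETY_FACTOR=0.85 ;;
    FAST)     export CODEX_MAX_PARALLEL_CONVERSION=3 CODEX_STAGGER_DELAY_MS=500 ;;
    BALANCED) export CODEX_MAX_PARALLEL_CONVERSION=2 CODEX_STAGGER_DELAY_MS=1000 ;;
    THOROUGH) export CODEX_MAX_PARALLEL_CONVERSION=2 CODEX_STAGGER_DELAY_MS=1500 ;;
  esac
}

set_speed_profile TURBO
echo "$CODEX_MAX_PARALLEL_CONVERSION workers, ${CODEX_STAGGER_DELAY_MS}ms stagger"
# prints: 4 workers, 200ms stagger
```

Because the overrides are plain environment variables, they can also be exported by hand before a run without touching any profile logic.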
- Settings Injection Bug — All agent constructors in `MigrationProcess.cs`, `ChunkedMigrationProcess.cs`, and `Program.cs` were missing the `settings` parameter, causing `AppSettings` to always be `null` inside agents. As a result, runtime configuration (including environment variable overrides such as `CODEX_MAX_PARALLEL_CONVERSION`) could not be applied, and agents fell back to the default `MaxParallelConversion` value of 1 (sequential). All 10 constructor call sites now pass `settings` correctly so both static config and env var overrides take effect as intended.
- Hardcoded Rate Limit Safety Margin — `RateLimitTracker.SafetyMargin` was hardcoded at 0.90, ignoring the configurable `RateLimitSafetyFactor` from `ChunkingSettings`. Now accepts a `safetyMargin` parameter wired from settings (TURBO=0.85, default=0.70).
- README.md — Added Speed Profile documentation with profile comparison table
- doctor.sh — Added `select_speed_profile()` function and integrated into all three run commands. TURBO/FAST profiles now export parallel conversion and stagger delay env vars.
- TokenHelper.cs — `CalculateRequestDelay` delay floor lowered from hardcoded 15s to configurable (default 2s, minimum 500ms)
- ChunkingSettings.cs — Added `MaxParallelConversion` property (default 1)
- Line-based chunking fallback for data-only copybooks (no DIVISION/SECTION/PARAGRAPH)
- `SemaphoreSlim` disposal (`using var`) and over-release prevention (`lockHeld` flag)
- Config script injection: `eval` → `envsubst` in `load-config.sh`
- Port cleanup: `lsof -sTCP:LISTEN` to avoid killing client connections
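A minimal sketch of listener-only port cleanup, assuming `lsof` is available. The key detail is that `-sTCP:LISTEN` restricts matches to the process holding the listening socket, so established client connections on the same port are left alone:

```shell
PORT=5028  # portal port; placeholder for whatever needs freeing
# -t prints bare PIDs; -sTCP:LISTEN matches only the listening socket,
# not established client connections on the same port.
PID="$(lsof -ti tcp:"$PORT" -sTCP:LISTEN 2>/dev/null || true)"
if [ -n "$PID" ]; then
  kill "$PID"
fi
```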
- Chunking stress test for line-based fallback on large copybooks
- Removed "Spec-Driven Migration" workflow; focused on "Deep Code Analysis" pipeline
- Updated architecture diagrams for Deep SQL Analysis flow (Regex → SQLite → Portal)
- Cleaned up deprecated `doctor.sh` functions
- `BusinessLogicExtractorAgent` auth: switched to `ResponsesApiClient` (HTTP 401 fix)
- Strict regex for class extraction, preventing AI comment artifacts (e.g., `Completes.java`)
- Smart Chunking - Semantic chunking for large files (>3K lines), parallel processing (6 workers), cross-chunk `SignatureRegistry`
- Portal chunks tab with real-time progress; `doctor.sh chunking-health` command
- DB tables: `chunk_metadata`, `forward_references`, `signatures`, `type_mappings`
- 88% code loss on files >50K LOC (now routed through chunked process)
- Stale run status, duplicate DB paths, portal port conflicts
- `MaxLinesPerChunk`: 1500, `OverlapLines`: 300, `MaxParallelAnalysis`: 6, `TokenBudgetPerMinute`: 300K
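Those chunking defaults would correspond to a configuration fragment along these lines. The file name and the `ChunkingSettings` nesting are assumptions inferred from the `ChunkingSettings.cs` entry in this changelog; the values are the ones listed above:

```json
{
  "ChunkingSettings": {
    "MaxLinesPerChunk": 1500,
    "OverlapLines": 300,
    "MaxParallelAnalysis": 6,
    "TokenBudgetPerMinute": 300000
  }
}
```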
- C# .NET Support - Dual-language output (Java Quarkus or C# .NET) via `CSharpConverterAgent`
- Migration Reports - Portal, CLI, or API (`/api/runs/{runId}/report`)
- Mermaid Diagrams - Interactive flowcharts, sequence, class, and ER diagrams
- Enhanced dependency tracking (CALL, COPY, PERFORM, EXEC SQL, READ/WRITE)
- Unified `output/` directory; renamed `cobol-source/` → `source/`
- GPT-5 Mini (32K tokens) configuration
- Reverse Engineering - `reverse-engineer` command, `BusinessLogicExtractorAgent`, glossary support
- Hybrid Database - SQLite + Neo4j via `HybridMigrationRepository`
- Portal UI - Three-panel dashboard with run selector, graphs, AI chat (port 5028)
- REST API - `/api/runinfo`, `/api/runs/all`, `/api/graph`, `/api/chat`
- DevContainer auto-start, 9 MCP resources per run
- Port standardization: 5028 / 7474 / 7687
- `doctor.sh` auto-fixes, .NET 9 detection, Windows compatibility
- Initial release: COBOL → Java Quarkus migration with AI agents (CobolAnalyzer, JavaConverter, DependencyMapper)
- SQLite persistence, MCP server, `doctor.sh` CLI, Azure OpenAI (GPT-4), Dev container
- Neo4j integration → hybrid database (SQLite + Neo4j), dependency graph visualization
- McpChatWeb portal (three-panel dashboard, 9 MCP resources, run selector, dynamic graphs)
- .NET 9 standardization, multi-run query support