mdite is a comprehensive documentation toolkit built as a modular system with clear separation of concerns. The architecture follows a layered approach, separating CLI concerns from core business logic and shared utilities.
Core Philosophy: mdite treats documentation as a connected system (graph), not isolated files. This graph foundation enables all current and future features: validation, dependency analysis, search, output, and more.
Purpose: Parse command-line arguments and options, coordinate user interaction
Location: src/cli.ts, src/commands/
Key files:
- cli.ts - Main CLI setup with Commander.js, signal handlers, global options
- commands/lint.ts - Validation command (structural integrity)
- commands/deps.ts - Dependency analysis command
- commands/config.ts - Configuration management commands
- commands/init.ts - Initialize configuration
- Future: commands/query.ts, commands/cat.ts, commands/toc.ts
Responsibilities:
- Parse CLI arguments and options (including Unix-friendly flags)
- Load and merge configuration
- Initialize logger with appropriate verbosity and output modes
- Handle Unix signals (SIGINT, SIGTERM, SIGPIPE) gracefully
- Execute commands and handle errors with proper exit codes
- Format output for user consumption (respecting stdout/stderr separation)
Purpose: Business logic and orchestration of documentation system operations
Location: src/core/
Key files:
- doc-linter.ts - Main orchestrator that coordinates all operations
- graph-analyzer.ts - Graph foundation: Dependency graph building and traversal (enables all features)
- link-validator.ts - Link and anchor validation
- markdown-cache.ts - Performance optimization: Centralized cache for markdown parsing and derived data
- config-manager.ts - Multi-layer configuration management
- remark-engine.ts - Content linting with remark plugins
- reporter.ts - Result formatting and output
- Future: Query engine, content output processor, TOC generator
Responsibilities:
- Build documentation dependency graph (foundation for all features)
- Validate links (files and anchors)
- Detect orphaned files
- Analyze dependencies and relationships
- Run content linting with remark
- Cache markdown parsing and derived data (eliminates redundant operations)
- Aggregate and return results
- Future: Search/query operations, content output, TOC generation
Purpose: Define data structures and schemas
Location: src/types/
Key files:
- config.ts - Configuration schemas and types (Zod-based)
- graph.ts - Dependency graph data structure
- results.ts - Lint results and error types
- errors.ts - Lint error message format
- exit-codes.ts - Standard Unix exit codes enum
Responsibilities:
- Define type-safe configuration schemas
- Provide runtime validation with Zod
- Structure lint results and errors
- Define standard exit codes for Unix compatibility
- Ensure type safety across the codebase
Purpose: Shared utilities and helpers
Location: src/utils/
Key files:
- logger.ts - Unix-friendly logging with TTY detection, stdout/stderr separation, quiet/verbose modes
- errors.ts - Custom error classes with exit codes and context
- error-handler.ts - Error handling middleware and utilities
- fs.ts - File system utilities (find markdown files, check existence)
- paths.ts - Path resolution for config files (user/project)
- slug.ts - GitHub-style heading slugification
- reporter.ts - Format lint results for text/JSON output with stream separation
Responsibilities:
- Provide Unix-friendly logging (TTY detection, color control, quiet/verbose modes)
- Separate data (stdout) from messages (stderr) for pipe compatibility
- Handle errors with proper context and exit codes
- Manage file system operations
- Format output for different consumers
1. User runs CLI command
↓
2. CLI parses arguments (Commander.js)
↓
3. ConfigManager loads and merges config
(Defaults → User Config → Project Config → CLI Options)
↓
4. GraphAnalyzer builds dependency graph from entrypoint
(Foundation step - used by ALL commands)
↓
5. Command-specific operations:
lint:
├─ GraphAnalyzer detects orphaned files
├─ LinkValidator validates all links (files + anchors)
└─ RemarkEngine runs content linting
deps:
├─ Extract dependencies for target file
└─ Format as tree, list, or JSON
↓
6. Results aggregation
↓
7. Reporter formats results (text, JSON, tree, list)
├─ Data to stdout (pipeable)
└─ Messages to stderr (suppressible with --quiet)
↓
8. CLI sets appropriate exit code:
├─ 0 = Success
├─ 1 = Validation errors
├─ 2 = Usage errors
└─ 130 = Interrupted
Future commands (query, cat, toc) will all leverage the same graph foundation built by GraphAnalyzer, with command-specific processing layers.
The configuration system uses a layered approach with clear priority:
Priority (highest to lowest):
- CLI Options - Flags passed on command line (--entrypoint, --format, etc.)
- Project Config - .mditerc, mdite.config.js, or package.json#mdite
- User Config - ~/.config/mdite/config.json (personal defaults)
- Defaults - Built-in defaults from src/types/config.ts
Each layer is merged into the next, with higher priority layers overriding lower ones.
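The layering can be illustrated with a minimal sketch. The `Config` shape, the default values, and the `mergeConfig` helper below are assumptions made for illustration; the real schemas live in src/types/config.ts and may differ.

```typescript
// Hypothetical, simplified config shape (not the real schema).
type Config = {
  entrypoint: string;
  format: string;
  depth: number;
};

// Assumed defaults, for illustration only.
const DEFAULTS: Config = { entrypoint: "README.md", format: "text", depth: Infinity };

// Drop keys whose value is undefined so an unset option never masks a lower layer.
function defined(layer: Partial<Config>): Partial<Config> {
  const out: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(layer)) {
    if (value !== undefined) out[key] = value;
  }
  return out as Partial<Config>;
}

// Spread from lowest to highest priority; later layers override earlier ones.
function mergeConfig(
  user: Partial<Config>,
  project: Partial<Config>,
  cli: Partial<Config>,
): Config {
  return { ...DEFAULTS, ...defined(user), ...defined(project), ...defined(cli) };
}

// The user prefers JSON, the project sets an entrypoint, the CLI forces text output.
const merged = mergeConfig(
  { format: "json" },
  { entrypoint: "docs/index.md" },
  { format: "text" },
);
console.log(merged.format, merged.entrypoint);
```

Filtering out undefined values before spreading matters when CLI options are collected into an object in which unset flags appear as undefined keys.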
The dependency graph is built using depth-first traversal:
1. Start with entrypoint file (e.g., README.md) at depth 0
2. Parse markdown to extract links
3. For each relative .md link:
- Resolve absolute path
- Skip if already visited (cycle detection)
- Add edge to graph with current depth
- Recursively visit target file at depth + 1 (if within maxDepth)
4. Return complete graph of reachable files
Depth Tracking: Each node in the graph tracks its depth from the entrypoint:
- Entrypoint is at depth 0
- Direct links from entrypoint are at depth 1
- Links from those files are at depth 2, etc.
- Files beyond maxDepth are not included in the graph
Depth Limiting: The --depth parameter (or depth config option) controls how far traversal goes:
- unlimited (default) - Traverse all reachable files (maxDepth = Infinity)
- 0 - Only the entrypoint file (no links followed)
- 1 - Entrypoint + direct links only
- 2 - Entrypoint + direct links + links from those files
- etc.
Use Cases for Depth Limiting:
- Progressive validation: Start with core docs (depth 1-2), expand gradually
- Performance: Limit scope for faster validation on large doc sets
- Focused validation: Validate only immediate dependencies of key files
Orphan Detection: After graph is built, find all markdown files in directory that are NOT in the graph. Files beyond maxDepth are considered orphans (not reachable within the specified depth limit).
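The traversal and orphan detection described above can be sketched on an in-memory link map. This is an illustration under stated assumptions, not the GraphAnalyzer implementation: `LinkMap`, `DocGraph`, `buildGraph`, and `findOrphans` are hypothetical names, and real code would parse markdown files rather than read a prebuilt map.

```typescript
// Hypothetical stand-in for parsed files: path -> relative .md links it contains.
type LinkMap = Record<string, string[]>;

type DocGraph = {
  depths: Map<string, number>; // node -> depth at which it was first reached
  edges: Array<[string, string]>; // [source, target]
};

function buildGraph(links: LinkMap, entrypoint: string, maxDepth: number = Infinity): DocGraph {
  const graph: DocGraph = { depths: new Map(), edges: [] };

  const visit = (file: string, depth: number): void => {
    graph.depths.set(file, depth);
    if (depth >= maxDepth) return; // at the limit: skip link extraction entirely
    for (const target of links[file] ?? []) {
      if (graph.depths.has(target)) continue; // already visited (also breaks cycles)
      graph.edges.push([file, target]);
      visit(target, depth + 1);
    }
  };

  visit(entrypoint, 0);
  return graph;
}

// Orphans are files on disk that never made it into the graph.
function findOrphans(allFiles: string[], graph: DocGraph): string[] {
  return allFiles.filter((file) => !graph.depths.has(file));
}

const links: LinkMap = {
  "README.md": ["a.md", "b.md"],
  "a.md": ["c.md", "README.md"], // the link back to README.md forms a cycle
  "b.md": [],
  "c.md": [],
};
const allFiles = ["README.md", "a.md", "b.md", "c.md", "lonely.md"];

console.log(findOrphans(allFiles, buildGraph(links, "README.md"))); // [ 'lonely.md' ]
console.log(findOrphans(allFiles, buildGraph(links, "README.md", 1))); // [ 'c.md', 'lonely.md' ]
```

In this sketch, a file's recorded depth is the depth at which the depth-first walk first reaches it, which is not necessarily the shortest path from the entrypoint.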
Link validation handles three types of links:
Anchor-only links (e.g., #heading):
- Extract all headings from current file
- Convert to GitHub-style slugs
- Check if anchor matches any heading
Relative file links (e.g., ./other.md):
- Resolve relative path
- Check if file exists
- Report error if not found
File + anchor links (e.g., ./other.md#heading):
- First validate file exists
- Then extract headings from target file
- Check if anchor matches any heading
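The checks above can be sketched against an in-memory set of documents. Everything here is an assumption for illustration: `Docs`, `checkLink`, and the result labels are hypothetical, and `slugify` is a simplified ASCII-only approximation of GitHub-style slugs (it ignores duplicate-heading suffixes and non-ASCII letters), not the logic in slug.ts.

```typescript
import * as path from "node:path";

// Hypothetical stand-in for the file system: path -> headings found in that file.
type Docs = Record<string, string[]>;
type LinkCheck = "ok" | "dead-link" | "dead-anchor";

// Simplified slug: lowercase, drop punctuation, turn whitespace into hyphens.
function slugify(heading: string): string {
  return heading
    .trim()
    .toLowerCase()
    .replace(/[^a-z0-9\s_-]/g, "")
    .replace(/\s/g, "-");
}

function checkLink(docs: Docs, currentFile: string, href: string): LinkCheck {
  const hashIndex = href.indexOf("#");
  const filePart = hashIndex === -1 ? href : href.slice(0, hashIndex);
  const anchor = hashIndex === -1 ? "" : href.slice(hashIndex + 1);

  // Anchor-only links target the current file; others resolve relative to it.
  const target =
    filePart === ""
      ? currentFile
      : path.posix.join(path.posix.dirname(currentFile), filePart);

  const headings = docs[target];
  if (headings === undefined) return "dead-link"; // file must exist first
  if (anchor === "") return "ok";
  return headings.some((h) => slugify(h) === anchor) ? "ok" : "dead-anchor";
}

const docs: Docs = {
  "README.md": ["Getting Started"],
  "docs/api.md": ["API Reference"],
};
console.log(checkLink(docs, "README.md", "#getting-started")); // ok
console.log(checkLink(docs, "README.md", "docs/missing.md")); // dead-link
console.log(checkLink(docs, "README.md", "docs/api.md#nope")); // dead-anchor
```

Checking the file before the anchor mirrors the order given for file + anchor links: a missing file is reported as a dead link and the anchor is never examined.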
mdite follows Unix philosophy and conventions for CLI tool design.
Pattern: Separate data from messages for pipe-friendly operation
stdout (file descriptor 1):
- Validation results (errors, warnings)
- JSON output
- Data intended for further processing
stderr (file descriptor 2):
- Informational messages (progress, summaries)
- Headers and separators
- Success/failure notifications
Implementation:
- logger.log() → stdout (always shown)
- logger.info(), logger.success(), logger.header() → stderr (suppressed in --quiet)
- logger.error() → stderr (always shown)
Benefits:
# Pipe data without progress messages
mdite lint --format json | jq '.'
# Discard stderr (progress and messages), keep validation results on stdout
mdite lint 2>/dev/null
# Grep errors without interference
mdite lint | grep "Dead link"
Pattern: Auto-detect terminal capabilities and adjust output
function shouldUseColors(): boolean {
  if ('NO_COLOR' in process.env) return false;
  if ('FORCE_COLOR' in process.env) return true;
  if (process.env.CI === 'true') return false;
  return process.stdout.isTTY ?? false;
}
Environment Variables:
- NO_COLOR - Disable colors (respects no-color.org)
- FORCE_COLOR - Force colors even when not a TTY
- CI=true - Auto-disable colors in CI environments
CLI Flags:
- --colors - Override detection, force colors
- --no-colors - Override detection, disable colors
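How the CLI flags combine with environment detection could look like the sketch below. It assumes the flag is checked before any environment variable, which follows from "override detection"; `ColorInputs` and `resolveColors` are hypothetical names, and inputs are passed in explicitly so the function is easy to test.

```typescript
type ColorInputs = {
  flag?: boolean; // true for --colors, false for --no-colors, undefined if neither was given
  env: Record<string, string | undefined>; // e.g. process.env
  isTTY: boolean; // e.g. Boolean(process.stdout.isTTY)
};

function resolveColors({ flag, env, isTTY }: ColorInputs): boolean {
  if (flag !== undefined) return flag; // an explicit flag overrides all detection
  if ("NO_COLOR" in env) return false;
  if ("FORCE_COLOR" in env) return true;
  if (env.CI === "true") return false;
  return isTTY;
}

// --no-colors wins even though FORCE_COLOR is set and stdout is a terminal.
console.log(resolveColors({ flag: false, env: { FORCE_COLOR: "1" }, isTTY: true })); // false
// No flag and no relevant variables: fall back to TTY detection.
console.log(resolveColors({ env: {}, isTTY: true })); // true
```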
Pattern: Use standard Unix exit codes for different scenarios
enum ExitCode {
  SUCCESS = 0, // No errors
  ERROR = 1, // Validation/operational errors
  USAGE_ERROR = 2, // Invalid arguments/options
  INTERRUPTED = 130, // SIGINT (128 + 2); also used for SIGTERM
}
Usage:
# Success check
mdite lint && echo "Success"
# Failure check
mdite lint || echo "Failed"
# Capture exit code
mdite lint
echo $? # 0, 1, 2, or 130
Pattern: Handle Unix signals gracefully
process.on('SIGINT', () => {
  console.error('\nInterrupted');
  process.exit(ExitCode.INTERRUPTED);
});
process.on('SIGTERM', () => {
  console.error('\nTerminated');
  process.exit(ExitCode.INTERRUPTED);
});
process.on('SIGPIPE', () => {
  process.exit(ExitCode.SUCCESS);
});
Benefits:
- Clean Ctrl+C handling
- Proper exit codes for signal termination
- SIGPIPE handling for broken pipes (e.g., mdite lint | head)
Pattern: Suppress informational output for scripting
class Logger {
  private quiet: boolean;
  info(message: string): void {
    if (this.quiet) return; // Suppressed
    console.error(`ℹ ${message}`);
  }
  error(message: string): void {
    // Always shown, never suppressed
    console.error(`✗ ${message}`);
  }
}
Usage:
# Scripting - only errors
mdite lint --quiet
# CI/CD - clean output
mdite lint --quiet --format json
All errors extend DocLintError base class with:
- code - Machine-readable error code
- exitCode - CLI exit code (0 = success, 1+ = failure)
- context - Additional metadata for debugging
- cause - Original error (for error wrapping)
Error hierarchy:
DocLintError (base)
├── ConfigNotFoundError
├── InvalidConfigError
├── FileNotFoundError
├── DirectoryNotFoundError
├── FileReadError
├── FileWriteError
├── ValidationError
├── SchemaValidationError
├── GraphBuildError
├── DeadLinkError
├── DeadAnchorError
├── MarkdownParseError
├── FrontmatterParseError
├── InvalidArgumentError
├── MissingArgumentError
├── OperationCancelledError
└── TimeoutError
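A base class carrying the four fields listed above might be shaped like this. The constructor signature, the "DEAD_LINK" code string, and the message format are assumptions for illustration; the actual classes in src/utils/errors.ts may differ.

```typescript
// Hypothetical options bag; only the four field names come from the description above.
type DocLintErrorOptions = {
  code: string;
  exitCode?: number;
  context?: Record<string, unknown>;
  cause?: unknown;
};

class DocLintError extends Error {
  readonly code: string;
  readonly exitCode: number;
  readonly context: Record<string, unknown>;
  readonly cause?: unknown;

  constructor(message: string, options: DocLintErrorOptions) {
    super(message);
    this.name = this.constructor.name; // subclasses report their own name
    this.code = options.code;
    this.exitCode = options.exitCode ?? 1; // assumed default: general failure
    this.context = options.context ?? {};
    this.cause = options.cause;
  }
}

// One subclass from the hierarchy, with an assumed code and message.
class DeadLinkError extends DocLintError {
  constructor(file: string, target: string, cause?: unknown) {
    super(`Dead link in ${file}: ${target}`, {
      code: "DEAD_LINK",
      exitCode: 1,
      context: { file, target },
      cause,
    });
  }
}

const err = new DeadLinkError("README.md", "docs/missing.md");
console.log(err.name, err.code, err.exitCode, err.context);
```

Because every subclass shares the base shape, a top-level handler can catch DocLintError once and pass its exitCode straight to process.exit.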
Adding a new rule:
1. Define rule logic in appropriate module (e.g., link-validator.ts)
2. Add rule name to RuntimeConfig.rules type
3. Update DEFAULT_CONFIG with default severity
4. Implement rule checking logic
5. Add tests for the new rule
6. Update documentation
Adding a new command:
1. Create command file in src/commands/ (e.g., commands/check.ts)
2. Register command in src/cli.ts:
   import { checkCommand } from './commands/check.js';
   program.addCommand(checkCommand());
3. Add integration tests in tests/integration/
4. Update README with command documentation
Adding a new output format:
1. Update RuntimeConfig.format type in src/types/config.ts
2. Implement formatter in src/utils/reporter.ts
3. Add tests for new format
4. Update CLI help text
Adding a configuration option:
1. Update appropriate schema in src/types/config.ts:
   - UserConfigSchema for user config
   - ProjectConfigSchema for project config
   - RuntimeConfigSchema for final runtime config
2. Zod will automatically validate at runtime
3. Add tests for invalid configurations
Unit tests:
- Test individual modules in isolation
- Mock dependencies
- Fast execution
- High coverage
Integration tests:
- Test full workflows (CLI, commands)
- Use real file system (temp directories)
- Test error scenarios
- Slower but more comprehensive
Test support files:
- setup.ts - Helper functions for test setup
- utils.ts - Test utilities (fixtures, assertions)
- mocks/ - Mock objects (logger, etc.)
- fixtures/ - Sample markdown files for testing
Location: examples/
Purpose: Runnable examples and smoke tests
examples/
├── 01-04: Core Examples (Phase 1)
├── 05-06: Real-World + Config Variations (Phase 2)
└── 07: Edge Cases (Phase 3)
Examples serve three purposes:
- User Documentation - Show how mdite works
- Manual Testing - Quick smoke tests during development
- Regression Testing - Verify behavior across releases
| Aspect | tests/fixtures/ | examples/ |
|---|---|---|
| Purpose | Automated unit tests | Manual demos + smoke tests |
| Audience | Developers (internal) | Users + Developers |
| Execution | Via Vitest | Via CLI |
| Documentation | Minimal | Comprehensive |
| Scope | Focused test cases | Realistic scenarios |
# Individual example
cd examples/01-valid-docs && mdite lint
# Full smoke test suite
cd examples && ./run-all-examples.sh
See examples/README.md for details.
The MarkdownCache class eliminates redundant parsing operations:
- Content caching: File content read once, reused across operations
- AST caching: Markdown parsed once per file (not 2-3 times)
- Derived data caching: Headings and links extracted once and cached
- Shared processor: Single unified processor instance for all parsing
- Automatic cleanup: Cache cleared between operations
- Memory efficient: ~6MB for 100 files, ~60MB for 1000 files
Impact: 2-3x overall speedup by reducing parse operations by 60-70%
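The read-once, parse-once idea can be sketched with a small generic cache. This is not the MarkdownCache implementation: `ParseCache` is a hypothetical name, and the `read` and `parse` functions are injected so the example runs without a markdown parser.

```typescript
// Hypothetical sketch: caches file content and the AST derived from it.
class ParseCache<Ast> {
  private readonly read: (file: string) => string;
  private readonly parse: (source: string) => Ast;
  private readonly content = new Map<string, string>();
  private readonly asts = new Map<string, Ast>();

  constructor(read: (file: string) => string, parse: (source: string) => Ast) {
    this.read = read;
    this.parse = parse;
  }

  // File content is read once, then served from memory.
  getContent(file: string): string {
    let text = this.content.get(file);
    if (text === undefined) {
      text = this.read(file);
      this.content.set(file, text);
    }
    return text;
  }

  // The AST is built once per file, reusing the cached content.
  getAst(file: string): Ast {
    if (!this.asts.has(file)) {
      this.asts.set(file, this.parse(this.getContent(file)));
    }
    return this.asts.get(file) as Ast;
  }

  // Called between operations so stale data is never reused.
  clear(): void {
    this.content.clear();
    this.asts.clear();
  }
}

// Toy parser that "parses" a document into its lines and counts invocations.
let parseCalls = 0;
const cache = new ParseCache<string[]>(
  () => "# Title\nBody",
  (source) => {
    parseCalls += 1;
    return source.split("\n");
  },
);
cache.getAst("README.md");
cache.getAst("README.md"); // second call is served from the cache
console.log(parseCalls); // 1
```

Derived data such as headings or links could be cached the same way, with one more map keyed by file path.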
- Uses cycle detection to prevent infinite loops
- Visits each file only once
- Shared cache eliminates redundant parsing during traversal
- Depth limiting optimization skips link extraction at maximum depth
- Parallel file validation with controlled concurrency (default: 10 concurrent operations)
- Shared cache eliminates redundant parsing during validation
- Promise pool prevents resource exhaustion on large documentation sets
- Skips external links (http/https)
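Controlled concurrency of this kind is commonly built as a small promise pool. The sketch below shows one way to do it; `mapWithConcurrency` is a hypothetical helper, not a function from the codebase, and the limit of 10 simply echoes the default mentioned above.

```typescript
// Run `worker` over `items` with at most `limit` calls in flight; results keep input order.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  worker: (item: T) => Promise<R>,
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;

  // Each runner repeatedly claims the next unprocessed index until none remain.
  const runner = async (): Promise<void> => {
    while (next < items.length) {
      const index = next++; // claimed synchronously, so no two runners share an index
      results[index] = await worker(items[index]);
    }
  };

  const runnerCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: runnerCount }, () => runner()));
  return results;
}

// Usage: "validate" file names with a simulated async check, 10 at a time.
const files = Array.from({ length: 25 }, (_, i) => `doc-${i}.md`);
mapWithConcurrency(files, 10, async (file) => {
  await new Promise((resolve) => setTimeout(resolve, 1));
  return file.endsWith(".md");
}).then((valid) => console.log(valid.every(Boolean)));
```

A fixed set of runners keeps the number of open file handles bounded regardless of how many files are queued, which is the resource-exhaustion concern noted above.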
- Uses fs/promises for async I/O
- Skips hidden directories and node_modules
- Minimal file reads via content caching
- unified - Markdown parsing and processing
- remark-parse - Markdown AST parser
- remark-lint - Markdown linting rules
- commander - CLI argument parsing
- cosmiconfig - Configuration file loading
- zod - Runtime schema validation
- chalk - Terminal colors
- globby - File pattern matching
- TypeScript - Type safety
- Vitest - Testing framework
- ESLint - Code linting
- Prettier - Code formatting
- Separation of Concerns: Clear boundaries between CLI, core logic, and utilities
- Type Safety: Comprehensive TypeScript types with runtime validation
- Testability: All components are independently testable
- Extensibility: Easy to add new rules, commands, and formats
- Error Handling: Rich error context with user-friendly messages
- Configuration: Flexible multi-layer configuration system
- Performance: Async operations with minimal file I/O
- Unix Philosophy: Pipe-friendly, proper exit codes, stdout/stderr separation
- Data to stdout, messages to stderr
- TTY detection for automatic color control
- Standard exit codes (0/1/2/130)
- Graceful signal handling (SIGINT, SIGTERM, SIGPIPE)
- Quiet mode for scripting
- Respects NO_COLOR and FORCE_COLOR environment variables
src/
├── cli.ts # CLI entry point with signal handlers
├── index.ts # Main executable
├── commands/ # CLI commands
│ ├── lint.ts # Validation command
│ ├── deps.ts # Dependency analysis
│ ├── config.ts # Config management
│ └── init.ts # Config initialization
├── core/ # Business logic
│ ├── doc-linter.ts # Main orchestrator
│ ├── graph-analyzer.ts # Graph foundation
│ ├── link-validator.ts # Link validation
│ ├── markdown-cache.ts # Centralized parsing cache
│ ├── config-manager.ts # Config loading
│ ├── remark-engine.ts # Content linting
│ └── reporter.ts # Result formatting
├── types/ # Type definitions
│ ├── config.ts # Config schemas (Zod)
│ ├── graph.ts # Graph structure
│ ├── results.ts # Lint results
│ ├── errors.ts # Error types
│ └── exit-codes.ts # Unix exit codes enum
└── utils/ # Shared utilities
├── logger.ts # Unix-friendly logging
├── errors.ts # Custom error classes
├── error-handler.ts # Error handling
├── fs.ts # File system ops
├── paths.ts # Path resolution
├── slug.ts # Heading slugification
├── reporter.ts # Output formatting
└── dependency-reporter.ts # Dependency formatting
Potential areas for expansion:
- mdite query: Search across documentation system
  - Full-text search across connected docs
  - Pattern matching on file names
  - Metadata/frontmatter queries
- mdite cat: Output documentation content
  - Pipe to shell tools
  - Order by dependency graph
  - Filter and concatenate
- mdite toc: Generate table of contents from graph
- mdite stats: Documentation metrics and analysis
- External link validation: Check HTTP/HTTPS URLs (with caching)
- Watch mode: Monitor files and re-lint on changes
- Plugin System: Allow external plugins for custom rules
- Fix Mode: Automatically fix certain issues
- LSP Server: Language server protocol for editor integration
- Custom Reporters: Allow custom output formatters
- Configuration Presets: Shareable configuration packages