[C4.3] Add code complexity metrics to ARCHITECTURE.md #241

@yusufkaraaslan

Problem

ARCHITECTURE.md currently provides structural information (patterns, examples, directory layout) but lacks quantitative metrics about code quality and complexity.

Missing insights:

  • How complex is this codebase? (beginner vs expert)
  • What's the average function length?
  • How deeply nested is the code?
  • What's the estimated test coverage?
  • Which files are most complex (need refactoring)?

Proposed Solution

Add Section 9: Code Quality Metrics to ARCHITECTURE.md with 5 key metrics:

1. Cyclomatic Complexity

What: Counts the independent decision paths through the code (if/elif/for/while/except)
Why: Indicates testing difficulty and potential bugs

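import ast
from pathlib import Path

def calculate_cyclomatic_complexity(file_path):
    # Parse instead of counting substrings: code.count('if ') would also
    # match 'elif ' and any 'if ' inside strings or comments
    tree = ast.parse(Path(file_path).read_text())
    # Decision points: if/elif, for, while, except, conditional expressions
    decision_nodes = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)
    complexity = 1  # Base complexity
    complexity += sum(isinstance(n, decision_nodes) for n in ast.walk(tree))
    # ... etc (boolean operators, comprehensions)
    return complexity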

Output:

**Cyclomatic Complexity:**
- Average: 4.2 (Low - maintainable)
- Median: 3
- Max: 18 (in src/parser.py:152)
- Distribution: 85% simple (1-5), 12% moderate (6-10), 3% complex (11+)
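
The report above lists per-function figures (a maximum with a line number, plus a distribution), so the metric likely needs to be computed per function rather than per file. A minimal sketch of that step, assuming the same set of decision nodes; `function_complexities` and `summarize` are hypothetical names, and the bucket boundaries are taken from the sample output:

```python
import ast
import statistics

# Node types counted as decision points in this sketch
DECISION_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)

def function_complexities(source):
    """Return (name, line, complexity) for every function in the source text."""
    results = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # Nested functions also contribute to their enclosing function
            branches = sum(isinstance(n, DECISION_NODES) for n in ast.walk(node))
            results.append((node.name, node.lineno, 1 + branches))
    return results

def summarize(scores):
    """Aggregate scores into average, median, max, and the three report buckets."""
    total = len(scores)
    simple = sum(1 for s in scores if s <= 5)
    moderate = sum(1 for s in scores if 6 <= s <= 10)
    return {
        "average": round(statistics.mean(scores), 1),
        "median": statistics.median(scores),
        "max": max(scores),
        "simple_pct": round(100 * simple / total),
        "moderate_pct": round(100 * moderate / total),
        "complex_pct": round(100 * (total - simple - moderate) / total),
    }
```

`summarize` expects a non-empty list; a caller would need to handle codebases with no functions separately.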

2. Lines of Code (LoC) Distribution

What: Average function/class size
Why: Indicates code modularity

**Code Size Metrics:**
- Average function length: 12 lines
- Average class length: 85 lines
- Largest function: 156 lines (needs refactoring)
- Functions > 50 lines: 8 (5% of total)
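
For Python, function sizes could be read straight from the AST. A rough sketch, assuming Python 3.8+ (where nodes carry `end_lineno`); `function_lengths` is a hypothetical helper name, and decorator lines are not counted:

```python
import ast

def function_lengths(source):
    """Return (name, length_in_lines) for every function in the source text."""
    lengths = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # Count from the def line through the last line of the body
            lengths.append((node.name, node.end_lineno - node.lineno + 1))
    return lengths

def long_functions(lengths, threshold=50):
    """Filter to the functions longer than the threshold."""
    return [(name, size) for name, size in lengths if size > threshold]
```

The 50-line default mirrors the threshold in the sample output and would presumably be configurable.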

3. Nesting Depth

What: Maximum indentation levels
Why: Deep nesting = harder to understand

**Nesting Depth:**
- Average: 2.1 levels
- Max: 6 levels (in src/validator.py - consider refactoring)
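
One possible shape for the nesting-depth calculation on a Python AST, offered as a sketch rather than the final helper. It counts only control-flow blocks (function and class bodies add no depth), and it treats `elif` as the same level as its `if`; an `else:` containing a single `if` is indistinguishable in the AST and is treated the same way:

```python
import ast

# Block types that add one nesting level in this sketch
NESTING_NODES = (ast.If, ast.For, ast.While, ast.With, ast.Try)

def max_nesting_depth(node, depth=0):
    """Return the deepest level of nested control-flow blocks under node."""
    if isinstance(node, NESTING_NODES):
        depth += 1
    deepest = depth
    for child in ast.iter_child_nodes(node):
        child_depth = depth
        # An elif is stored as a lone If inside the parent's orelse list;
        # keep it at the parent's level instead of one deeper
        if isinstance(node, ast.If) and isinstance(child, ast.If) and node.orelse == [child]:
            child_depth = depth - 1
        deepest = max(deepest, max_nesting_depth(child, child_depth))
    return deepest
```

Usage: `max_nesting_depth(ast.parse(source))` for a whole file, or pass a single function node to get a per-function figure.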

4. Test Coverage Estimation

What: Ratio of test files to source files
Why: Indicates testing rigor

**Test Coverage Estimation:**
- Test files: 178
- Source files: 361
- Test-to-source ratio: 0.49 (Good - about 1 test file per 2 source files)
- Estimated coverage: ~70-80% (based on file ratio)
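
The file counts above could be gathered with a simple directory walk. A sketch under stated assumptions: test files are recognized by the common `test_*.py` / `*_test.py` naming or by living under a `tests` directory, and `estimate_test_ratio` is a hypothetical name. The ratio is only a rough proxy and does not measure which lines the tests actually execute:

```python
from pathlib import Path

def is_test_file(relative_path):
    """Heuristic: pytest-style file names, or any file under a tests/ directory."""
    name = relative_path.name
    return (
        name.startswith("test_")
        or name.endswith("_test.py")
        or "tests" in relative_path.parts
    )

def estimate_test_ratio(directory):
    """Count Python test and source files under directory and return their ratio."""
    root = Path(directory)
    files = list(root.rglob("*.py"))
    # Classify on the path relative to root so parent directory names are ignored
    test_count = sum(1 for p in files if is_test_file(p.relative_to(root)))
    source_count = len(files) - test_count
    ratio = round(test_count / source_count, 2) if source_count else None
    return {"test_files": test_count, "source_files": source_count, "ratio": ratio}
```

With the sample numbers, 178 test files against 361 source files gives 178 / 361 ≈ 0.49.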

5. Code Duplication Detection

What: Similar code blocks across files
Why: Indicates need for abstraction

**Code Duplication:**
- Duplicate blocks found: 12
- Duplication ratio: 3.2%
- Largest duplicate: 45 lines across 3 files
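
The issue does not say how duplicates would be detected. One common lightweight approach is to hash sliding windows of whitespace-normalized lines and report hashes that occur more than once. The sketch below assumes that approach; `find_duplicate_blocks` and the `window` size are hypothetical, and overlapping windows inside one long duplicate are reported as separate hits that a real implementation would merge:

```python
import hashlib
from collections import defaultdict

def find_duplicate_blocks(files, window=6):
    """Find repeated runs of `window` non-blank lines.

    files maps a file name to its source text. Returns a list of groups,
    each group being the (file name, starting line) of one repeated block.
    """
    seen = defaultdict(list)
    for name, text in files.items():
        # Keep original line numbers, drop blank lines, ignore indentation
        lines = [(i + 1, line.strip()) for i, line in enumerate(text.splitlines())]
        lines = [(num, line) for num, line in lines if line]
        for start in range(len(lines) - window + 1):
            chunk = "\n".join(line for _, line in lines[start:start + window])
            digest = hashlib.sha1(chunk.encode("utf-8")).hexdigest()
            seen[digest].append((name, lines[start][0]))
    return [locations for locations in seen.values() if len(locations) > 1]
```

A duplication ratio could then be derived as duplicated lines divided by total non-blank lines.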

Implementation

Core Function

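# In codebase_scraper.py
import ast
import statistics
from pathlib import Path
from typing import Any, Dict

def analyze_code_metrics(directory: Path) -> Dict[str, Any]:
    metrics = {
        'complexity': [],
        'function_lengths': [],
        'nesting_depths': [],
        'duplicates': []
    }

    # Phase 1 covers Python files only
    source_files = directory.rglob('*.py')
    for file in source_files:
        # Parse with AST (Python) or regex (others)
        try:
            ast_tree = ast.parse(file.read_text())
        except (SyntaxError, UnicodeDecodeError):
            continue  # Skip files that cannot be parsed

        # Calculate metrics (helper functions still to be implemented)
        metrics['complexity'].append(calculate_complexity(ast_tree))
        # Assumes get_function_sizes returns one length per function,
        # so extend keeps the list flat for statistics.mean
        metrics['function_lengths'].extend(get_function_sizes(ast_tree))
        metrics['nesting_depths'].append(max_nesting_depth(ast_tree))

    # Note: statistics.mean raises StatisticsError on an empty list
    return {
        'avg_complexity': statistics.mean(metrics['complexity']),
        'avg_function_length': statistics.mean(metrics['function_lengths']),
        # ... etc
    }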

Integration with ARCHITECTURE.md

# In unified_skill_builder.py
def _generate_architecture_overview(self):
    # ... existing sections 1-8 (f and c3_data assumed to be in scope here) ...

    # Section 9: Code Quality Metrics (NEW)
    # Written only when metrics were collected
    if c3_data.get('code_metrics'):
        f.write("## 9. Code Quality Metrics\n\n")
        self._write_code_metrics(f, c3_data['code_metrics'])
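
The body of `_write_code_metrics` is not shown in this issue. As an illustration only, a standalone version might look like the sketch below; the dictionary keys follow the return value of `analyze_code_metrics`, and `max_nesting_depth` as a key is an assumption:

```python
import io

def write_code_metrics(f, metrics):
    """Write each metric that is present; skip any that were not computed."""
    if "avg_complexity" in metrics:
        f.write("**Cyclomatic Complexity:**\n")
        f.write(f"- Average: {metrics['avg_complexity']:.1f}\n\n")
    if "avg_function_length" in metrics:
        f.write("**Code Size Metrics:**\n")
        f.write(f"- Average function length: {metrics['avg_function_length']:.0f} lines\n\n")
    if "max_nesting_depth" in metrics:  # hypothetical key
        f.write("**Nesting Depth:**\n")
        f.write(f"- Max: {metrics['max_nesting_depth']} levels\n\n")

# Usage: render into an in-memory buffer
buffer = io.StringIO()
write_code_metrics(buffer, {"avg_complexity": 4.2, "avg_function_length": 12.0})
```

Skipping absent keys keeps Section 9 valid when only some metrics are available, for example for languages handled by heuristics.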

Supported Languages

Phase 1 (AST-based):

  • Python (using ast module)

Phase 2 (Heuristic-based):

  • JavaScript/TypeScript (regex + indentation counting)
  • Java, C++, Go (regex patterns)
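
For the Phase 2 languages, a heuristic for nesting depth could count braces after stripping string literals and line comments. This is a sketch of that idea, not a parser: `brace_nesting_depth` is a hypothetical name, block comments and template literals are not handled, and every brace pair counts as a level (including function bodies and object literals):

```python
import re

# Double-quoted strings, single-quoted strings, and // line comments
STRING_OR_COMMENT = re.compile(
    r'"(?:\\.|[^"\\])*"'
    r"|'(?:\\.|[^'\\])*'"
    r"|//[^\n]*"
)

def brace_nesting_depth(source):
    """Return the maximum depth of nested { } pairs in brace-delimited code."""
    cleaned = STRING_OR_COMMENT.sub("", source)
    depth = deepest = 0
    for ch in cleaned:
        if ch == "{":
            depth += 1
            deepest = max(deepest, depth)
        elif ch == "}":
            # Clamp at zero so unbalanced input does not go negative
            depth = max(depth - 1, 0)
    return deepest
```

Because the function body itself counts as one level, results from this heuristic are offset by one relative to an AST-based count that ignores function bodies.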

Success Criteria

  • ✅ Calculates all 5 metrics for Python codebases
  • ✅ Performance: < 500ms for 500 files
  • ✅ Accuracy: Within 10% of established tools (pylint, radon)
  • ✅ Section 9 adds 20-30 lines to ARCHITECTURE.md
  • ✅ Helps developers identify refactoring targets

Priority

Medium - Nice-to-have enhancement, not critical


Labels: enhancement (New feature or request), roadmap:H1.3 (Roadmap task H1.3: Create example project folder)
