[Feature] Service Dependency Graph & Correlation Analysis #40

@Polliog

Feature Description

Visualize service dependencies based on log co-occurrence and distributed trace data to help teams understand their system architecture and identify bottlenecks. The feature will automatically analyze which services communicate with each other and display them in an interactive graph.

Problem/Use Case

Problem:

  • Teams struggle to understand complex microservice architectures
  • Service dependencies are undocumented or outdated
  • Debugging cross-service issues requires manual investigation
  • No visibility into which services communicate frequently

Use Cases:

  • Architecture Understanding: New team members need to quickly grasp system structure
  • Debugging: When service A fails, identify all dependent services that might be affected
  • Performance Analysis: Find bottlenecks by identifying services with high call volumes
  • Migration Planning: Understand service relationships before refactoring

Proposed Solution

Build an automated service dependency visualization that:

  1. Dependency Analysis:

    • Analyze log co-occurrence (services that log within the same time window)
    • Extract service calls from distributed traces (span → span relationships)
    • Build directed graph: service A → service B
    • Calculate edge weights based on call frequency
  2. Interactive Service Map UI:

    • D3.js force-directed graph visualization
    • Node size represents log volume per service
    • Edge thickness represents call frequency
    • Color coding for service health (green = healthy, red = errors)
    • Click a service to show its details and filter logs
  3. Correlation Features:

    • Click service A → highlight all dependent services
    • Show correlation score (how often services log together)
    • Time-based view (dependencies for specific time range)
    • Export graph as image or JSON
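
The dependency analysis and correlation steps above could be prototyped roughly as follows. This is a minimal Python sketch, not LogWard's implementation: the `Span` and `LogEvent` shapes, all function names, the 5-second window, and the Jaccard-style correlation score are assumptions made for illustration.

```python
import json
from collections import Counter, defaultdict
from dataclasses import dataclass
from itertools import combinations
from typing import Optional


@dataclass(frozen=True)
class Span:
    """Hypothetical trace span; the field names are assumptions."""
    span_id: str
    parent_id: Optional[str]
    service: str


@dataclass(frozen=True)
class LogEvent:
    """Hypothetical log record: service name plus epoch timestamp in seconds."""
    service: str
    timestamp: float


def edges_from_spans(spans):
    """Count caller -> callee edges from parent/child span relationships."""
    service_of = {s.span_id: s.service for s in spans}
    edges = Counter()
    for s in spans:
        parent = service_of.get(s.parent_id)
        # Skip root spans, unknown parents, and calls within one service.
        if parent is not None and parent != s.service:
            edges[(parent, s.service)] += 1
    return edges


def cooccurrence_scores(logs, window_seconds=5.0):
    """Assumed score: windows where both services logged / windows where either did."""
    windows = defaultdict(set)  # service -> indices of windows it logged in
    for e in logs:
        windows[e.service].add(int(e.timestamp // window_seconds))
    scores = {}
    for a, b in combinations(sorted(windows), 2):
        union = windows[a] | windows[b]
        scores[(a, b)] = len(windows[a] & windows[b]) / len(union)
    return scores


def dependents_of(edges, service):
    """All services that call `service`, directly or transitively."""
    callers = defaultdict(set)
    for caller, callee in edges:
        callers[callee].add(caller)
    seen, stack = set(), [service]
    while stack:
        for c in callers.get(stack.pop(), ()):
            if c not in seen and c != service:
                seen.add(c)
                stack.append(c)
    return seen


def to_json(edges):
    """Export the graph as a nodes/links document (layout is an assumption)."""
    nodes = sorted({svc for edge in edges for svc in edge})
    links = [{"source": a, "target": b, "weight": n}
             for (a, b), n in sorted(edges.items())]
    return json.dumps({"nodes": [{"id": n} for n in nodes], "links": links})


# Made-up sample data, for illustration only.
spans = [
    Span("1", None, "gateway"),
    Span("2", "1", "orders"),
    Span("3", "2", "payments"),
    Span("4", "1", "orders"),
]
logs = [
    LogEvent("gateway", 0.5), LogEvent("orders", 1.0),
    LogEvent("gateway", 7.0), LogEvent("payments", 12.0),
    LogEvent("orders", 13.0),
]
edges = edges_from_spans(spans)
scores = cooccurrence_scores(logs)
print(dict(edges))
print(scores)
print(dependents_of(edges, "payments"))
print(to_json(edges))
```

In this sketch only the trace-derived edges carry direction. Log co-occurrence is symmetric and can only suggest that two services are related, so it may be more useful as a correlation score shown alongside an edge than as a source of edges on its own.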

Alternatives Considered

  1. Manual Documentation: Requires constant maintenance, quickly becomes outdated
  2. Static Architecture Diagrams: Don't reflect actual runtime behavior
  3. Third-party APM Tools: Expensive ($500-2000/month), and data leaves your infrastructure
  4. Graph Database (Neo4j): Adds complexity, overkill for most use cases

Why our approach is better:

  • Automated analysis based on actual runtime data (traces + logs)
  • No manual configuration required
  • Free and integrated directly into LogWard
  • Privacy-first (data stays in your infrastructure)

Implementation Details (Optional)

Priority

  • Critical - Blocking my usage of LogWard
  • High - Would significantly improve my workflow
  • Medium - Nice to have
  • Low - Minor enhancement

Justification: Valuable for debugging and understanding systems, but not blocking core functionality.

Target Users

  • DevOps Engineers
  • Developers
  • Security/SIEM Users
  • System Administrators
  • All Users

Additional Context

Contribution

  • I would like to work on implementing this feature
