Skip to content

[Feature] Log Pattern Detection & Auto-Clustering #21

@Polliog

Description

@Polliog

Automatically detect and cluster similar log messages to identify recurring patterns and anomalies.

Acceptance Criteria:

Pattern Detection:

  • Background job: analyze logs every hour (configurable interval)
  • Detect similar log messages using edit distance or TF-IDF
  • Group logs into patterns (e.g., "API error 500" pattern)
  • Store patterns in log_patterns table with frequency count
  • Alert on new patterns (previously unseen error patterns)

Pattern Explorer UI:

  • List all detected patterns, sorted by frequency
  • Show pattern template (e.g., "User {id} failed login attempt")
  • Click pattern → drill down to matching logs
  • Chart: pattern frequency over time
  • Filter: show only new patterns (last 24h)

Configuration:

  • Environment variable: PATTERN_DETECTION_ENABLED (default: false)
  • Environment variable: PATTERN_DETECTION_INTERVAL (default: 3600 seconds)
  • Admin UI: enable/disable pattern detection per project

Technical Notes:

  • Use Levenshtein distance or TF-IDF for similarity detection
  • Consider using existing libraries like string-similarity or natural
  • Limit to 1000 most recent logs per analysis to avoid performance issues

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions