[research] ML-Based Recommendation of Krkn Chaos Scenarios #76

@rh-rahulshetty

ML-Based Recommendation of Krkn Chaos Scenarios

Description:

Kubernetes clusters continuously emit telemetry data that captures system behavior under varying workloads and conditions. While this data is commonly used for monitoring and alerting, it is rarely leveraged to proactively guide chaos engineering experiments.

This issue explores building a machine learning–based recommendation system that analyzes historical and recent telemetry data to suggest which Krkn chaos scenarios are most likely to be impactful for a given system state. Krkn provides a rich set of chaos scenarios, but selecting the right one often relies on human intuition and prior experience.

By learning patterns from telemetry data, an ML model can pick up subtle signals and correlations that point to latent weaknesses and that are difficult for humans to detect, and then recommend targeted chaos scenarios accordingly. These recommendations can then be passed to Krkn-AI, which can automatically execute or simulate the suggested experiments.
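To make "learning patterns from telemetry data" more concrete, here is a minimal sketch of one possible first step: turning raw metric time series into fixed-size features over a recent window. The metric names (`cpu_usage`, `pod_restarts`), the window size, and the `window_features` helper are all assumptions for illustration; the issue does not prescribe a feature set.

```python
from statistics import mean, pstdev


def window_features(samples, window):
    """Summarize the most recent `window` samples of each metric.

    `samples` maps a metric name to a list of values ordered oldest to newest.
    """
    features = {}
    for metric, values in samples.items():
        recent = values[-window:]
        features[f"{metric}_mean"] = mean(recent)
        features[f"{metric}_std"] = pstdev(recent)
        # Trend: change between the first and last sample in the window.
        features[f"{metric}_delta"] = recent[-1] - recent[0]
    return features


# Hypothetical telemetry; real data would come from the cluster's metrics store.
telemetry = {
    "cpu_usage": [0.40, 0.42, 0.55, 0.70, 0.85],
    "pod_restarts": [0, 0, 1, 1, 3],
}
features = window_features(telemetry, window=3)
```

The resulting dictionary could be flattened into a vector and fed to whatever model is eventually chosen; mean, standard deviation, and delta are only a starting point.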

Expected Outcome

  • Train an ML-based model that consumes cluster telemetry data over time.
  • Generate ranked recommendations of Krkn chaos scenarios based on current system behavior.
  • Enable integration with Krkn-AI to execute or evaluate the suggested scenarios.
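As a rough illustration of what "ranked recommendations" could look like, the sketch below scores each scenario by how similar the current telemetry is to past states where that scenario was run, weighted by the impact observed back then. This is a plain similarity baseline standing in for a trained model, not the proposed approach itself. The scenario labels, feature layout, impact scores, and the `rank_scenarios` helper are hypothetical and are not taken from Krkn.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_scenarios(current, history):
    """Rank scenarios for the `current` feature vector.

    `history` is a list of (feature_vector, scenario_label, impact) tuples.
    A scenario's score is the mean of similarity * impact over its records.
    """
    per_scenario = {}
    for vector, scenario, impact in history:
        similarity = max(cosine(current, vector), 0.0)
        per_scenario.setdefault(scenario, []).append(similarity * impact)
    scores = {s: sum(v) / len(v) for s, v in per_scenario.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical feature layout: [cpu_mean, memory_mean, network_latency]
history = [
    ([0.9, 0.3, 0.1], "cpu_hog", 0.8),
    ([0.2, 0.9, 0.1], "memory_hog", 0.7),
    ([0.1, 0.2, 0.9], "network_latency", 0.9),
    ([0.8, 0.4, 0.2], "cpu_hog", 0.6),
]
ranked = rank_scenarios([0.85, 0.35, 0.15], history)
for scenario, score in ranked:
    print(f"{scenario}: {score:.3f}")
```

With this toy data the CPU-related label ranks first because the current state is CPU-heavy. A learned model (for example one built with sklearn or PyTorch, as listed under the recommended skills) would replace the similarity score, while the ranked-list output shape could stay the same.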

Why This Matters

Selecting the right chaos scenarios is critical to uncovering real system risks, yet it remains largely manual and intuition-driven. An ML-based recommendation system can make chaos engineering more proactive, data-driven, and scalable by continuously adapting to system behavior. This approach reduces guesswork and increases experiment relevance. It also opens up new research opportunities, especially when combined with language-model-based predictors inspired by recent work on using regression language models to simulate and reason about large systems.

Recommended Skills:

Python, data science, machine learning, and deep learning (pandas, NumPy, scikit-learn, PyTorch)

Mentors:
