GREP: kubectl-grove CLI Plugin #373

@athreesh

Summary

This GREP proposes kubectl-grove, a kubectl plugin for inspecting and managing Grove workloads on Kubernetes. The CLI bridges the gap between raw kubectl commands and the hierarchical structure of Grove resources (PodCliqueSets, PodGangs, PodCliques, Pods), offering both command-line tools and an interactive Terminal User Interface (TUI) called Arborist.

Motivation

Managing distributed AI/ML workloads with Grove requires understanding complex resource hierarchies and placement topologies. Today, users:

  1. Must run multiple kubectl get commands to understand the state of their deployment
  2. Must manually correlate PodCliqueSets → PodGangs → PodCliques → Pods relationships
  3. Lack visibility into GPU allocation, topology placement, and fragmentation
  4. Have no intuitive way to visualize how pods are distributed across racks and nodes
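
To make the manual correlation step concrete, here is a minimal Python sketch of the tree a tool could build from flat resource listings. The resource names, the (kind, name, parent) tuple shape, and the helpers build_tree and render are all hypothetical; how Grove objects actually reference their parents (owner references, labels, or otherwise) is not specified in this proposal, and the sketch assumes names are unique across kinds.

```python
from collections import defaultdict

# Hypothetical flat listing of (kind, name, parent name).
# The parent links are an assumption for illustration only.
RESOURCES = [
    ("PodCliqueSet", "inference", None),
    ("PodGang", "inference-0", "inference"),
    ("PodClique", "inference-0-prefill", "inference-0"),
    ("PodClique", "inference-0-decode", "inference-0"),
    ("Pod", "inference-0-prefill-a", "inference-0-prefill"),
    ("Pod", "inference-0-decode-a", "inference-0-decode"),
]


def build_tree(resources):
    """Group resources by parent name so children can be looked up quickly."""
    children = defaultdict(list)
    for kind, name, parent in resources:
        children[parent].append((kind, name))
    return children


def render(children, parent=None, depth=0):
    """Return indented lines, one per resource, walking the tree depth-first."""
    lines = []
    for kind, name in children.get(parent, []):
        lines.append("  " * depth + f"{kind}/{name}")
        lines.extend(render(children, name, depth + 1))
    return lines


if __name__ == "__main__":
    print("\n".join(render(build_tree(RESOURCES))))
```

A single command that prints this kind of tree would replace the several kubectl get calls and the mental join described in items 1 and 2.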

Proposed Features

Critical (Must Have)

  • Arborist TUI (kubectl grove tui) - Hierarchical navigation with real-time refresh and embedded topology view
  • Topology Command (kubectl grove topology) - Rack → Node → Pod visualization with GPU allocation bars
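
As an illustration of the Rack → Node → Pod view with GPU allocation bars, the following Python sketch renders a small hand-written topology as text. The data layout, the rack, node, and pod names, and the functions gpu_bar and render_topology are assumptions made for this example; the proposal does not define the actual output format.

```python
# Hypothetical topology: rack -> node -> GPU capacity and (pod, GPUs used) pairs.
TOPOLOGY = {
    "rack-a": {
        "node-1": {"gpus": 8, "pods": [("prefill-a", 4), ("decode-a", 2)]},
        "node-2": {"gpus": 8, "pods": []},
    },
}


def gpu_bar(used, total, width=8):
    """Render a fixed-width bar: '#' for allocated GPUs, '.' for free ones."""
    filled = round(width * used / total) if total else 0
    return "[" + "#" * filled + "." * (width - filled) + f"] {used}/{total}"


def render_topology(topology):
    """Return indented lines for each rack, node (with bar), and pod."""
    lines = []
    for rack, nodes in topology.items():
        lines.append(rack)
        for node, info in nodes.items():
            used = sum(gpus for _, gpus in info["pods"])
            lines.append(f"  {node} {gpu_bar(used, info['gpus'])}")
            for pod, gpus in info["pods"]:
                lines.append(f"    {pod} ({gpus} GPU)")
    return lines


if __name__ == "__main__":
    print("\n".join(render_topology(TOPOLOGY)))
```

Showing used and free GPUs per node side by side is what makes fragmentation visible: a rack with many partially filled bars may be unable to host a pod that needs a full node.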

High Priority

  • kubectl grove status - PodCliqueSet status with progress visualization
  • kubectl grove health - Gang-aware health dashboard
  • kubectl grove diagnostics - Comprehensive diagnostic data collection
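
One possible reading of "gang-aware" health is that a gang counts as healthy only when every clique in it meets a minimum number of ready pods, rather than judging pods one by one. The Python sketch below follows that assumption; the min_ready threshold, the clique names, and the gang_health function are hypothetical and not taken from Grove's API.

```python
def gang_health(gang):
    """Evaluate a gang given as {clique name: (ready pods, minimum ready pods)}.

    Returns (healthy, reasons): healthy is True only if every clique meets
    its minimum; reasons lists each clique that falls short.
    """
    reasons = [
        f"{clique}: {ready}/{min_ready} ready"
        for clique, (ready, min_ready) in gang.items()
        if ready < min_ready
    ]
    return (not reasons, reasons)


if __name__ == "__main__":
    # Hypothetical gang: decode has only 1 of the 3 pods it needs.
    healthy, reasons = gang_health({"prefill": (2, 2), "decode": (1, 3)})
    print("healthy" if healthy else "unhealthy: " + "; ".join(reasons))
```

Under this assumption, a dashboard would flag the whole gang as unhealthy even though some of its pods are running, which matches how gang-scheduled workloads make progress only as a unit.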

Medium Priority

  • Lifecycle commands: rollout, scale, update, restart, apply
  • kubectl grove metrics - Live metrics from pod endpoints

/kind feature

Labels: enhancement, kind/enhancement
