@athreesh
What type of PR is this?

/kind feature
/kind documentation

What this PR does / why we need it:

Introduces GREP-373, a proposal for kubectl-grove, a kubectl plugin that provides a rich interaction layer for Grove workloads on Kubernetes.

The CLI bridges the gap between raw kubectl commands and the complex, hierarchical nature of Grove resources (PodCliqueSets, PodGangs, PodCliques, Pods), offering both command-line tools and an interactive Terminal User Interface (TUI) called Arborist.
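
For context, kubectl treats any executable on `PATH` whose name starts with `kubectl-` as a plugin, so `kubectl grove status` would run an executable named `kubectl-grove` with `status` as its first argument. The following is a minimal shell sketch of that dispatch step only. The function name `grove_dispatch` and the placeholder output are hypothetical, and the subcommand names are taken from the feature list in this proposal, not from an existing implementation.

```shell
#!/bin/sh
# Hypothetical dispatcher for a kubectl-grove plugin executable.
# kubectl passes everything after "grove" as positional arguments.
grove_dispatch() {
  subcommand="${1:-}"
  case "$subcommand" in
    tui|topology|status|health|diagnostics|metrics)
      # Placeholder: a real plugin would call its implementation here.
      echo "would run: $subcommand"
      ;;
    "")
      echo "usage: kubectl grove <subcommand>" >&2
      return 2
      ;;
    *)
      echo "unknown subcommand: $subcommand" >&2
      return 1
      ;;
  esac
}
```

If this were saved as an executable file named `kubectl-grove` on `PATH`, with `grove_dispatch "$@"` as its last line, `kubectl plugin list` should report it as an available plugin.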

Proposed Features

Critical (Must Have)

  • Arborist TUI (kubectl grove tui) - Hierarchical navigation with real-time refresh and embedded topology view
  • Topology Command (kubectl grove topology) - Rack → Node → Pod visualization with GPU allocation bars
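
To make the "GPU allocation bars" idea concrete, here is a small sketch of how one such bar might be rendered as text. The function name `gpu_bar`, the `#`/`-` characters, and the `used/total` suffix are illustrative assumptions; the proposal does not specify the actual rendering.

```shell
# Render a fixed-width allocation bar, e.g. for GPUs in use on a node.
# Arguments: used total [width]. The output format is an assumption.
gpu_bar() {
  used=$1
  total=$2
  width=${3:-8}
  if [ "$total" -le 0 ]; then
    echo "gpu_bar: total must be positive" >&2
    return 1
  fi
  # Integer scaling rounds down, so a partially filled cell shows as free.
  filled=$(( used * width / total ))
  bar=""
  i=0
  while [ "$i" -lt "$width" ]; do
    if [ "$i" -lt "$filled" ]; then bar="${bar}#"; else bar="${bar}-"; fi
    i=$(( i + 1 ))
  done
  printf '[%s] %s/%s\n' "$bar" "$used" "$total"
}
```

For example, `gpu_bar 4 8` prints `[####----] 4/8`. A topology view could emit one such bar per node, indented under its rack.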

High Priority

  • kubectl grove status - PodCliqueSet status with progress visualization
  • kubectl grove health - Gang-aware health dashboard
  • kubectl grove diagnostics - Comprehensive diagnostic data collection
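
"Gang-aware" is not defined in this description. One plausible reading is that health is judged for the gang as a unit, since a gang-scheduled workload is only useful when enough of its members run together. The sketch below illustrates that reading only; the function name `gang_health`, the minimum-required threshold, and the three status labels are all hypothetical.

```shell
# Hypothetical gang-level health check: compares ready pods against the
# minimum the gang needs to be useful. Labels are illustrative only.
# Arguments: ready min_required total
gang_health() {
  ready=$1
  min_required=$2
  total=$3
  if [ "$ready" -ge "$total" ]; then
    echo "Healthy"
  elif [ "$ready" -ge "$min_required" ]; then
    echo "Degraded"
  else
    # Below the threshold the whole gang is treated as down, even
    # though some individual pods may still be Ready.
    echo "Unavailable"
  fi
}
```

The point of the design is the last branch: a per-pod view would show two Ready pods as partially fine, while a gang-level view reports the group as unavailable.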

Medium Priority

  • Lifecycle commands: rollout, scale, update, restart, apply
  • kubectl grove metrics - Live metrics from pod endpoints

Which issue(s) this PR fixes:

Fixes #373

Special notes for your reviewer:

This is a proposal document (GREP) following the template from #362. Looking for feedback on:

  • Command priorities and phasing
  • TUI design and navigation model
  • Any missing use cases

Does this PR introduce an API change?

NONE

Additional documentation:

Adds GREP-373 proposal for kubectl-grove CLI plugin.

🤖 Generated with Claude Code

athreesh and others added 2 commits October 19, 2025 15:05
## Motivation
During hands-on testing of the Grove installation process, several critical
usability issues were discovered that would block new users from successfully
deploying Grove. Additionally, the README was too verbose and didn't quickly
communicate the core value proposition to developers evaluating the project.

## Changes Made

### installation.md - Fixed Critical Blockers

**Working Directory Confusion**
- Added explicit "Navigate to operator directory" instructions
- Impact: Users can now follow the guide linearly without trial-and-error

**KUBECONFIG Setup Broken**
- kind-up script has a bug and doesn't export KUBECONFIG properly
- Added manual workaround using `kind get kubeconfig`
- Impact: Users can now successfully deploy after creating kind cluster
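
The manual workaround might look roughly like the sketch below, which writes the cluster's kubeconfig to a file and exports `KUBECONFIG` to point at it. The function name `use_kind_kubeconfig`, the default cluster name, and the default file path are assumptions for illustration; they are not taken from the kind-up script or from installation.md.

```shell
# Write a kind cluster's kubeconfig to a file and point KUBECONFIG at it.
# Cluster name and file path are caller-supplied; the defaults here are
# assumptions, not values taken from the Grove kind-up script.
use_kind_kubeconfig() {
  cluster="${1:-kind}"
  target="${2:-$HOME/.kube/kind-$cluster.yaml}"
  mkdir -p "$(dirname "$target")" || return 1
  kind get kubeconfig --name "$cluster" > "$target" || return 1
  export KUBECONFIG="$target"
}
```

The function has to be defined and called in the current shell (for example by sourcing the file that contains it) so that the exported variable persists. Afterwards, `kubectl cluster-info` is a quick way to confirm that kubectl reaches the kind cluster.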

**Wrong Resource Names**
- Fixed: simple1-0-pcsg → simple1-0-sga (actual resource name)
- Impact: Scaling examples now work as documented

**Added Troubleshooting Section**
- Covers deployment issues, runtime issues, and community resources
- Impact: Users can self-serve when encountering common issues

### README.md - Refocused on Problem → Solution → Action

**Shortened from ~80 lines to ~40 lines of core content**

New structure:
1. Problem First: What's broken in K8s for AI inference
2. Solution: Grove's one-liner positioning
3. Quick Start: 4 commands to deploy in 5 minutes
4. What Grove Solves: Table mapping scenarios to capabilities
5. How It Works: Simplified concept explanations

Roadmap simplified to Q4 2025 / Q1 2026 (removed specific outdated dates)

Impact: Users understand the value proposition in 30 seconds and can start immediately

### quickstart.md - New 10-Minute Tutorial

- Explains the 4-component example architecture
- Step-by-step deployment with expected outputs
- Demonstrates both PCSG and PCS scaling
- Includes hierarchy visualization
- Kind-specific troubleshooting tips

Impact: New users get a successful first deployment within 10 minutes

## Testing Performed
All changes validated through fresh kind cluster deployment on macOS,
following installation.md step-by-step, and verifying all examples work.

Co-authored-by: Claude <[email protected]>

Introduces a proposal for kubectl-grove, a kubectl plugin providing:
- Arborist TUI for hierarchical resource navigation
- Topology visualization command
- Status, health, and diagnostics commands
- Lifecycle management commands (rollout, scale, etc.)

Fixes ai-dynamo#373

Co-Authored-By: Claude Opus 4.5 <[email protected]>
@athreesh
Author

@gflarity for visibility

Development

Successfully merging this pull request may close these issues.

GREP: kubectl-grove CLI Plugin