feat: enhance random run with attach/detach, status, log-dir, and abort #128
NETIZEN-11 wants to merge 5 commits into krkn-chaos:main
Introduce --detach / -d flag on 'krknctl random run'. Default is foreground (attached). When --detach is set the binary re-executes itself as a background child process via os.Executable + exec.Command, strips --detach from the forwarded argv in all token forms (--detach, -d, --detach=true, -d=true), and returns immediately printing the PID.

- Extract runRandomPlan() as the shared execution core for both modes
- Add runDetached() with correct argv stripping and no unused parameters
- Register --detach / -d flag in root.go
- Thread logDir through RunGraph interface and CommonRunGraph so the log-dir feature works end-to-end without a second interface change
- Update docker, podman, graph command, and test call sites for the new 9-argument RunGraph signature

Signed-off-by: Nitesh <nitesh@example.com>
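The argv stripping and re-exec described in this commit can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: `stripDetachFlag` and `spawnDetached` are hypothetical names, and the sketch assumes the flag only ever appears in the four token forms listed above (combined short flags such as `-dv` are not handled).

```go
package main

import (
	"os"
	"os/exec"
	"strings"
)

// stripDetachFlag (hypothetical helper) returns a copy of args without any
// --detach / -d token, including the "=value" forms such as --detach=true.
func stripDetachFlag(args []string) []string {
	out := make([]string, 0, len(args))
	for _, a := range args {
		if a == "--detach" || a == "-d" ||
			strings.HasPrefix(a, "--detach=") || strings.HasPrefix(a, "-d=") {
			continue
		}
		out = append(out, a)
	}
	return out
}

// spawnDetached (hypothetical helper) re-executes the current binary with the
// filtered arguments and returns the child's PID without waiting for it.
func spawnDetached(args []string) (int, error) {
	self, err := os.Executable()
	if err != nil {
		return 0, err
	}
	cmd := exec.Command(self, stripDetachFlag(args)...)
	if err := cmd.Start(); err != nil {
		return 0, err
	}
	pid := cmd.Process.Pid
	// Release the handle so the parent never has to Wait on the child.
	if err := cmd.Process.Release(); err != nil {
		return pid, err
	}
	return pid, nil
}
```

A caller would typically pass `os.Args[1:]` to `spawnDetached` and print the returned PID. Removing the flag before forwarding is what keeps the child from detaching again and spawning processes indefinitely.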
Introduce 'krknctl random status' to inspect a running chaos plan.

- Add pkg/randomstate with SaveState / LoadState / ClearState helpers that persist a JSON state file at os.TempDir()/.krknctl/random_run.json (portable across Linux, macOS, Windows; no hardcoded /tmp path)
- State written by runRandomPlan at startup, cleared on exit via defer
- isProcessAlive() uses syscall.Signal(0) to probe liveness without delivering an actual signal; os.Signal(nil) is a nil interface and must not be used here
- Status output: running (true/false), scenario name, plan-file path, PID, start time, log-dir when set
- Stale state (process dead) is detected and auto-cleared
- ScenarioName stores filepath.Base(planFile) for a clean display name

Signed-off-by: Nitesh <nitesh@example.com>
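A rough sketch of the state helpers and the liveness probe is shown below. The function names and the file location follow the commit message. The `State` field names, JSON tags, and file permissions are assumptions made for illustration and may differ from pkg/randomstate.

```go
package main

import (
	"encoding/json"
	"os"
	"path/filepath"
	"syscall"
	"time"
)

// State is a hypothetical shape for the persisted run record.
type State struct {
	PID          int       `json:"pid"`
	ScenarioName string    `json:"scenario_name"`
	PlanFile     string    `json:"plan_file"`
	LogDir       string    `json:"log_dir,omitempty"`
	StartTime    time.Time `json:"start_time"`
}

// statePath builds the location under the OS temp dir instead of hardcoding /tmp.
func statePath() string {
	return filepath.Join(os.TempDir(), ".krknctl", "random_run.json")
}

// SaveState writes the record as JSON, creating the parent directory if needed.
func SaveState(s State) error {
	if err := os.MkdirAll(filepath.Dir(statePath()), 0o750); err != nil {
		return err
	}
	data, err := json.MarshalIndent(s, "", "  ")
	if err != nil {
		return err
	}
	return os.WriteFile(statePath(), data, 0o600)
}

// LoadState reads the record back; a missing file surfaces as an error.
func LoadState() (State, error) {
	var s State
	data, err := os.ReadFile(statePath())
	if err != nil {
		return s, err
	}
	err = json.Unmarshal(data, &s)
	return s, err
}

// ClearState removes the file and treats "already gone" as success.
func ClearState() error {
	err := os.Remove(statePath())
	if os.IsNotExist(err) {
		return nil
	}
	return err
}

// isProcessAlive probes the PID with signal 0, which delivers nothing.
// In this sketch any error (including a permission error) counts as "not alive".
func isProcessAlive(pid int) bool {
	p, err := os.FindProcess(pid)
	if err != nil {
		return false
	}
	return p.Signal(syscall.Signal(0)) == nil
}
```

Note that the signal-0 probe is a Unix technique. On Windows, `Process.Signal` in the Go standard library only supports `os.Kill`, so a portable status command would likely need a different liveness check there.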
Add --log-dir flag to 'krknctl random run'.

- Directory is created automatically with os.MkdirAll if it does not exist (permissions 0750)
- CommonRunGraph writes each scenario log file into logDir via path.Join(logDir, containerName+'.log') instead of the current working directory when the flag is provided
- The full resolved log file path is reported in error messages and stored in state so 'random status' can display it accurately
- When --log-dir is omitted behaviour is identical to before

Signed-off-by: Nitesh <nitesh@example.com>
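The log path resolution could look roughly like the sketch below. `resolveLogFile` is a hypothetical helper name. The sketch also swaps in `filepath.Join` where the commit mentions `path.Join`, on the assumption that OS-native separators are preferable for filesystem paths.

```go
package main

import (
	"os"
	"path/filepath"
)

// resolveLogFile (hypothetical helper) returns where a scenario's log should go.
// With an empty logDir it falls back to a file in the current working directory,
// which mirrors the "identical to before" behaviour described above.
func resolveLogFile(logDir, containerName string) (string, error) {
	name := containerName + ".log"
	if logDir == "" {
		return name, nil
	}
	// Create the directory (and any missing parents) with 0750 permissions.
	if err := os.MkdirAll(logDir, 0o750); err != nil {
		return "", err
	}
	return filepath.Join(logDir, name), nil
}
```

Returning the resolved path, instead of only opening the file, makes it easy to reuse the same string in error messages and in the persisted state.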
Introduce 'krknctl random abort' to immediately stop a running chaos plan.

- Reads PID from the state file written by runRandomPlan
- Calls process.Kill() on the stored PID
- Clears the state file regardless of kill outcome so no stale state is left behind
- Prints confirmation with PID and scenario name on success
- Gracefully handles the case where no plan is running (no state file)

Signed-off-by: Nitesh <nitesh@example.com>
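The "clear the state regardless of the kill outcome" ordering can be sketched as follows. `abortRun` is a hypothetical name, and the `clear` callback stands in for whatever ClearState does in the PR. `errors.Join` requires Go 1.20 or newer.

```go
package main

import (
	"errors"
	"os"
)

// abortRun (hypothetical helper) kills the process with the given PID and then
// always invokes clear, so a failed kill cannot leave stale state behind.
func abortRun(pid int, clear func() error) error {
	var killErr error
	proc, err := os.FindProcess(pid)
	if err != nil {
		killErr = err
	} else {
		// Kill terminates the process immediately; it does not wait for exit.
		killErr = proc.Kill()
	}
	// Run the cleanup unconditionally and report both errors if both failed.
	if clearErr := clear(); clearErr != nil {
		return errors.Join(killErr, clearErr)
	}
	return killErr
}
```

Passing the cleanup as a function keeps the ordering testable without touching a real state file.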
Review Summary by Qodo
Add lifecycle management for random chaos runs: detach, status, abort, log-dir
Walkthroughs
Description
• Add detach/attach mode for background chaos execution with PID tracking
• Introduce status command to inspect running chaos plan details
• Add abort command to terminate running chaos scenarios cleanly
• Implement custom log directory support with automatic creation
• Create randomstate package for cross-process state persistence
Diagram
flowchart LR
A["krknctl random run"] -->|--detach| B["runDetached"]
B -->|re-exec without --detach| C["Child Process"]
C -->|runRandomPlan| D["SaveState"]
D -->|persist to tmpdir| E["randomstate"]
A -->|foreground| F["runRandomPlan"]
F -->|execute chaos| G["RunGraph"]
G -->|logDir| H["Custom Log Directory"]
I["krknctl random status"] -->|LoadState| E
J["krknctl random abort"] -->|LoadState| E
J -->|kill PID| K["Terminate Process"]
K -->|ClearState| E
File Changes
1. cmd/random.go
Implement client-side validation mode for 'krknctl run' that validates scenario configuration without requiring cluster access, kubeconfig, or any Kubernetes API calls.

Changes:
- cmd/dryrun.go: new file with parseDryRunFlag(), validateScenarioLocally(), and DryRunResult with coloured Print() output
- cmd/dryrun_test.go: 13 unit tests covering flag parsing, required fields, type validation, global fields, nil safety, and multiple errors
- cmd/run.go: PreRunE skips registry fetch in dry-run mode; RunE fetches scenario metadata from image registry only (no cluster calls), runs validateScenarioLocally(), prints results, returns error on failure
- cmd/root.go: register --dry-run flag on runCmd with help text
- cmd/random.go: fix RunGraph call to pass logDir (9-arg signature), fix runDetached() signature, fix isProcessAlive() to use syscall.Signal(0) instead of os.Signal(nil)

Exit codes: 0 valid, non-zero on validation errors
Output format matches issue spec (checkmark/cross symbols)

Signed-off-by: Nitesh <nitesh@example.com>
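To illustrate what cluster-free validation with multiple collected errors might look like, here is a minimal sketch. Only the `DryRunResult` name comes from the commit message. `FieldSpec`, `validateLocally`, the `Valid` method, and the three type names are hypothetical, and the real scenario metadata model in krknctl is likely richer.

```go
package main

import (
	"fmt"
	"strconv"
)

// FieldSpec is a hypothetical description of one scenario input field.
type FieldSpec struct {
	Name     string
	Type     string // "string", "number", or "boolean" in this sketch
	Required bool
}

// DryRunResult collects every validation error instead of stopping at the first.
type DryRunResult struct {
	Errors []string
}

// Valid reports whether validation produced no errors.
func (r DryRunResult) Valid() bool { return len(r.Errors) == 0 }

// validateLocally checks required fields and basic types using only local data.
// Reading from a nil map is safe in Go, so a nil values map is handled as "all missing".
func validateLocally(specs []FieldSpec, values map[string]string) DryRunResult {
	var res DryRunResult
	for _, spec := range specs {
		val, ok := values[spec.Name]
		if !ok || val == "" {
			if spec.Required {
				res.Errors = append(res.Errors, fmt.Sprintf("%s: required field is missing", spec.Name))
			}
			continue
		}
		switch spec.Type {
		case "number":
			if _, err := strconv.ParseFloat(val, 64); err != nil {
				res.Errors = append(res.Errors, fmt.Sprintf("%s: %q is not a number", spec.Name, val))
			}
		case "boolean":
			if _, err := strconv.ParseBool(val); err != nil {
				res.Errors = append(res.Errors, fmt.Sprintf("%s: %q is not a boolean", spec.Name, val))
			}
		}
	}
	return res
}
```

A command wrapper would print each entry in `Errors` and return a non-nil error when `Valid()` is false, which yields the non-zero exit code described above.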
Related Issue
Closes #126
Summary
Adds full lifecycle management capabilities to krknctl random run <plan.json>, improving control, observability, and flexibility for chaos scenarios.
Changes
1. Attach / Detach Support
- --detach (-d) flag to run chaos scenarios in the background
- RunGraph interface updated to include the logDir parameter across docker, podman, and tests
2. Status Command
Introduced new command:
Added new package pkg/randomstate for state management:
- SaveState
- LoadState
- ClearState
State stored at:
Displays:
3. Custom Log Directory
- --log-dir <path> flag
4. Abort Command
Introduced:
- Terminates running chaos scenario using PID from state file
- Cleans up state file after termination
Why this change is needed
These enhancements provide:
Testing
Checklist