Skip to content

Commit dd58ab4

Browse files
docs: initialize project
Comprehensive fix pass on Go operator rewrite (PR #274) — addressing all 9 critical, ~25 major, and 50+ minor issues from 8 expert reviewers, plus repo restructuring.
1 parent f5d9593 commit dd58ab4

File tree

1 file changed

+111
-0
lines changed

1 file changed

+111
-0
lines changed

.planning/PROJECT.md

Lines changed: 111 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,111 @@
1+
# Locust K8s Operator — Go Rewrite Polish
2+
3+
## What This Is
4+
5+
A comprehensive fix pass on the Go rewrite of the Locust Kubernetes Operator (PR #274, `feat/go` -> `master`). The operator manages distributed Locust load tests on Kubernetes via a LocustTest CRD. Eight expert reviewers identified 9 critical, ~25 major, and 50+ minor issues. This project addresses every single one, moves the Go code from `locust-k8s-operator-go/` to repo root, and removes the legacy Java operator code — making the PR merge-ready and production-perfect.
6+
7+
## Core Value
8+
9+
Every API field the operator exposes must actually work, and the operator must never crash or silently ignore user configuration.
10+
11+
## Requirements
12+
13+
### Validated
14+
15+
- ✓ Kubernetes operator using controller-runtime v0.21.0 — existing
16+
- ✓ LocustTest CRD with v2 (storage) and v1 (deprecated) API versions — existing
17+
- ✓ Conversion webhooks for v1 <-> v2 — existing
18+
- ✓ Validation webhooks for CR creation/update — existing
19+
- ✓ Master/worker Job-based test execution — existing
20+
- ✓ Status subresource with phase tracking (Pending/Running/Succeeded/Failed) — existing
21+
- ✓ Owner references for automatic garbage collection — existing
22+
- ✓ OpenTelemetry support for native Locust metrics export — existing
23+
- ✓ Volume mounting with target filtering — existing
24+
- ✓ Environment variable and secret injection — existing
25+
- ✓ Helm chart with webhook support and security hardening — existing
26+
- ✓ Comprehensive test pyramid (~150 unit, ~20 integration, ~30 E2E) — existing
27+
- ✓ CI/CD pipelines for build, test, and release — existing
28+
- ✓ Multi-arch container images via ko — existing
29+
- ✓ Documentation with migration guide, API reference, and developer docs — existing
30+
31+
### Active
32+
33+
- [ ] Wire `extraArgs` into master/worker command builders (C1)
34+
- [ ] Wire CR-level `resources` with fallback to operator config (C2)
35+
- [ ] Replace `resource.MustParse()` with safe parsing and error handling (C3)
36+
- [ ] Unify volume name constants across packages (C4)
37+
- [ ] Fix `ko build` path in release pipeline (C5)
38+
- [ ] Remove fragile `Generation > 1` reconciliation guard (C6)
39+
- [ ] Fix Helm CRD conversion webhook architecture (C7)
40+
- [ ] Verify/fix `readOnlyRootFilesystem` with webhook cert writes (C8)
41+
- [ ] Keep `failurePolicy: Fail` but ensure operator availability during upgrades (C9)
42+
- [ ] Remove dead code: `isJobComplete()`/`isJobFailed()` (M1)
43+
- [ ] Fix mount path trailing slash inconsistency (M2)
44+
- [ ] Fix error message typo + duplicate logging in main (M3)
45+
- [ ] Align `ko` action versions across workflows (M4)
46+
- [ ] Add `timeout-minutes` to all CI/CD jobs (M5)
47+
- [ ] Fix `ct.yaml` timeout (600s -> reasonable value) (M6)
48+
- [ ] Address chart-releaser fork supply chain risk (M7)
49+
- [ ] Add `permissions:` block to `go-test-e2e.yml` (M8)
50+
- [ ] Pin Kind download with checksum verification (M9)
51+
- [ ] Replace `go mod tidy` mutation with verification check (M10)
52+
- [ ] Remove nested `.github/workflows/` under Go subdir (M11)
53+
- [ ] Fix `imagePullSecrets` format in v2 doc examples (M12)
54+
- [ ] Add missing `image` field in getting started example (M13)
55+
- [ ] Fix status example with non-existent fields (M14)
56+
- [ ] Fix operator CPU limit mismatch in migration docs (M15)
57+
- [ ] Fix wrong working directory in dev docs (M16)
58+
- [ ] Fix namespace inconsistency across docs (M17)
59+
- [ ] Add CRD conversion webhook config to Helm chart (M18)
60+
- [ ] Tighten Helm RBAC (remove unnecessary ConfigMap/Secret CRUD) (M19)
61+
- [ ] Add recovery when owned resources are externally deleted (M20)
62+
- [ ] Fix backward compatibility helpers in Helm templates (M21)
63+
- [ ] Remove or wire `crd.install` value (M22)
64+
- [ ] Fix `replicaCount` default (2 -> 1 for backward compat) (M23)
65+
- [ ] Fix CRD symlink for `helm package` in CI (M24)
66+
- [ ] Suppress Prometheus scrape annotations when OTel enabled (M25)
67+
- [ ] Fix E2E conversion script label selectors (M26)
68+
- [ ] Fix `os.Chdir()` global side effect in test utils (M27)
69+
- [ ] All 56 minor issues from reviewer feedback (see pr_review.md Minor Issues section)
70+
- [ ] Move Go code from `locust-k8s-operator-go/` to repo root
71+
- [ ] Remove legacy Java operator code
72+
73+
### Out of Scope
74+
75+
- New features beyond what reviewers identified — this is a polish pass, not feature work
76+
- Kubernetes version upgrades or dependency bumps beyond what's needed for fixes
77+
- Redesigning the operator architecture — it's already approved by reviewers
78+
- Python locust test files or example load test scripts
79+
80+
## Context
81+
82+
- **Origin**: Complete rewrite from Java (Micronaut + JOSDK) to Go (controller-runtime)
83+
- **PR #274**: `feat/go` -> `master`, 238 files changed, +56,350 / -959 lines
84+
- **Review team**: 8 expert reviewers (Architecture, Go, GitHub Actions, Documentation, Kubernetes, Locust, Helm, Testing)
85+
- **Aggregate verdict**: REQUEST CHANGES — architecturally sound, targeted fixes needed
86+
- **Existing test coverage**: ~150 unit, ~20 integration, ~30 E2E scenarios
87+
- **Codebase location**: Currently in `locust-k8s-operator-go/` subdirectory, moving to repo root
88+
- **Individual reviews**: Available in `reviews/` directory for detailed context per domain
89+
- **Compiled review**: `pr_review.md` has the full prioritized issue list
90+
91+
## Constraints
92+
93+
- **Branch**: All work on `feat/go` branch — PR must stay mergeable to `master`
94+
- **No breaking API changes**: v2 CRD API surface is established; wire existing fields, don't redesign
95+
- **Webhook policy**: Keep `failurePolicy: Fail` per maintainer decision — ensure operator HA instead
96+
- **Backward compatibility**: v1 API must continue to work through conversion webhooks
97+
- **Test pyramid**: Maintain existing test coverage; add tests for new behavior (wired fields, error handling)
98+
- **Review alignment**: Fixes must match reviewer recommendations — individual review files in `reviews/` are the source of truth
99+
100+
## Key Decisions
101+
102+
| Decision | Rationale | Outcome |
103+
|----------|-----------|---------|
104+
| Wire `extraArgs` and `resources` (not remove) | Users expect API fields to work; removing would be a breaking change | — Pending |
105+
| Keep `failurePolicy: Fail` for webhooks | Strict validation is more important than availability during operator downtime | — Pending |
106+
| Move Go code to repo root | Operator is the primary codebase now; subdirectory adds unnecessary complexity | — Pending |
107+
| Remove Java code | Go rewrite replaces it entirely; keeping it adds confusion | — Pending |
108+
| All review items in one pass | Comprehensive fix before merge; no tech debt carried forward | — Pending |
109+
110+
---
111+
*Last updated: 2026-02-06 after initialization*

0 commit comments

Comments
 (0)