Address all 86 review items from Go operator rewrite PR #275

Merged
AbdelrhmanHamouda merged 58 commits into feat/go from 001-go-operator-manual-test-plan
Feb 10, 2026

@AbdelrhmanHamouda
Owner

Summary

Comprehensive fix pass addressing every issue identified by 8 expert reviewers on the Go operator rewrite (PR #274). This PR resolves all 9 critical, 25 major, and 52 minor issues across 7 phases — making the operator merge-ready and production-correct.

Phase 1: Critical API Wiring

  • Wire extraArgs into master/worker command builders with conflict detection
  • Wire CR-level resources with 3-level precedence (CR > role-specific Helm > unified Helm)
  • Replace resource.MustParse() with safe parsing and startup validation

Phase 2: Kubernetes Resources & Webhooks

  • Remove incorrect MutatingWebhookConfiguration for CRD conversion
  • Add emptyDir /tmp volume for readOnlyRootFilesystem compatibility
  • Tighten RBAC to least privilege, fix backward compatibility helpers
  • Add PodDisruptionBudget template, fix CRD symlink, fix replicaCount default

Phase 3: Core Operator Stability

  • Unify volume name constants, fix mount path trailing slashes
  • Remove Generation > 1 guard (phase-based reconciliation)
  • Suppress Prometheus annotations when OTel enabled
  • Add self-healing recovery for externally deleted owned resources

Phase 4: CI/CD Security & Reliability

  • Fix ko build path, align action versions, add timeouts to all jobs
  • Replace chart-releaser fork with upstream helm/chart-releaser-action
  • Pin Kind with helm/kind-action, add go mod tidy verification
  • Parallelize CI jobs, upload artifacts on failure

Phase 5: Documentation Accuracy

  • Fix imagePullSecrets format, status examples, CPU limit mismatch
  • Standardize namespaces, fix working directory instructions
  • Update OTel exporter config, copyright years, framework references

Phase 6: Testing & Polish

  • Add typed Phase enum, ObservedGeneration tracking, phase transition events
  • Fix os.Chdir() side effect, E2E label selectors, add webhook/boundary tests
  • Add gosec, errorlint, exhaustive linters; wrap reconciler errors
  • Add Helm schema validation, NOTES.txt, health probes, Kafka gating
  • Refactor main.go, pin Dockerfile, add .dockerignore, clean up RBAC

Phase 7: Repository Restructure

  • Remove legacy Java operator code (archived to archive/java-operator-v1 branch)
  • Move Go operator from locust-k8s-operator-go/ to repository root
  • Update all CI/CD workflows, Makefile, config manifests, E2E tests
  • Add MIGRATION.md, SECURITY.md, .editorconfig

Stats

  • 86/86 review items addressed
  • 295 files changed (+8,123 / -36,759 lines)
  • 51 commits across 7 phases
  • Test coverage: config 100%, controller 94.9%, resources 97.4%

Test plan

  • go build ./cmd/main.go succeeds
  • make test — all unit/integration tests pass
  • helm lint charts/locust-k8s-operator passes
  • golangci-lint run — no blocking issues
  • E2E tests pass on Kind cluster
  • CRD v1 <-> v2 conversion works end-to-end
  • Helm install with default values succeeds

Add .windsurf/, .planning/, .specify/, specs/, CLAUDE.md, and GEMINI.md to gitignore to exclude AI-assisted development workflow files and specifications from version control.
Comprehensive fix pass on Go operator rewrite (PR #274) — addressing all 9 critical, ~25 major, and 50+ minor issues from 8 expert reviewers, plus repo restructuring.
…ction

- Add tests for BuildMasterCommand with extraArgs
- Add tests for BuildWorkerCommand with extraArgs
- Add tests for detectFlagConflicts function
- Add tests for conflict warning logging
- Update existing tests to pass new logger parameter
- Tests currently fail (RED phase) - functions not yet implemented
- Add operatorManagedFlags registry containing all operator-controlled flags
- Implement detectFlagConflicts function to identify conflicting extraArgs
- Wire extraArgs into BuildMasterCommand with conflict warning logging
- Wire extraArgs into BuildWorkerCommand with conflict warning logging
- Update BuildMasterJob and BuildWorkerJob to accept logger parameter
- Update controller to pass logger to job builders
- Update all test calls to pass logger parameter
- ExtraArgs are appended after operator-managed flags (POSIX last-wins)
- All tests now pass (GREEN phase complete)
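
A minimal sketch of the conflict detection described above, assuming the registry is a simple set of flag names. The flags listed in `operatorManagedFlags` are an illustrative subset, `buildCommand` is a hypothetical helper, and the real function signatures in the operator (which also take a logger) may differ. Flags are compared by exact name, so a flag that merely shares a prefix with a managed flag is not reported.

```go
package main

import (
	"fmt"
	"strings"
)

// operatorManagedFlags is an illustrative subset; the operator's actual
// registry may contain different flags.
var operatorManagedFlags = map[string]bool{
	"--master":      true,
	"--worker":      true,
	"--master-host": true,
	"--master-port": true,
}

// detectFlagConflicts returns the names of extraArgs entries that exactly
// match an operator-managed flag. "--flag=value" is compared by name only.
func detectFlagConflicts(extraArgs []string) []string {
	var conflicts []string
	for _, arg := range extraArgs {
		name, _, _ := strings.Cut(arg, "=")
		if operatorManagedFlags[name] {
			conflicts = append(conflicts, name)
		}
	}
	return conflicts
}

// buildCommand (hypothetical helper) appends extraArgs after the
// operator-managed arguments, so a last-wins parser honors the user's value.
func buildCommand(managed, extraArgs []string) []string {
	cmd := make([]string, 0, len(managed)+len(extraArgs))
	cmd = append(cmd, managed...)
	return append(cmd, extraArgs...)
}

func main() {
	extra := []string{"--master-port=6000", "--master-hostname", "--csv=results"}
	for _, c := range detectFlagConflicts(extra) {
		fmt.Printf("warning: extraArgs overrides operator-managed flag %s\n", c)
	}
	fmt.Println(buildCommand([]string{"--master", "--master-port=5557"}, extra))
}
```

In this sketch a conflict only produces a warning; the user's argument is still appended and wins under last-wins parsing.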
RED Phase - Add tests for:
- LoadConfig() returns error for invalid resource quantities
- Multiple invalid values are collected and reported
- Valid resource quantities pass validation
- Empty resource strings are treated as optional

All existing LoadConfig() call sites updated to handle error return.
GREEN Phase - Implementation:

1. Config validation (config.go):
   - LoadConfig() now returns (*OperatorConfig, error)
   - validateResourceQuantities() checks all resource env vars
   - Invalid values produce clear error messages
   - Multiple invalid values collected and reported together
   - Empty strings treated as optional (not validated)

2. Safe parsing (job.go):
   - Replace resource.MustParse() with resource.ParseQuantity()
   - Safe because values pre-validated at startup
   - buildResourceList() now uses ParseQuantity with error ignored

3. Resource precedence (job.go):
   - buildResourceRequirementsWithPrecedence() implements CR → defaults
   - hasResourcesSpecified() helper distinguishes nil vs empty
   - CR resources are complete override (not partial merge)
   - buildLocustContainer() uses new precedence function

4. Main error handling (main.go):
   - Handles LoadConfig error, logs and exits with code 1
   - Clear error messages on startup failure

5. Test updates (suite_test.go):
   - Controller test suite handles new LoadConfig error return

All tests pass. No MustParse in production code.
- Add masterResources and workerResources to values.yaml (default empty)
- Add 12 helper templates in _helpers.tpl for role-specific resources
- Add conditional env vars in deployment.yaml (via envVars helper)
- Empty resources means 'use unified resources' (backward compatible)
- Helm lint passes, default rendering excludes role-specific env vars
- Add 7 integration tests for BuildMasterJob/BuildWorkerJob
- Test extraArgs appear in generated Job command
- Test CR resources override operator defaults
- Test empty CR resources fall back to defaults
- Test master/worker resources are independent
- All tests pass, verify full Phase 1 wiring works end-to-end
Add tests for 3-level resource precedence (CR > role-specific > unified):
- TestLoadConfig_RoleSpecificResourceDefaults: Verify 12 fields default to empty
- TestLoadConfig_RoleSpecificResourceOverrides: Verify env vars are loaded
- TestLoadConfig_InvalidRoleSpecificResource: Verify validation errors
- TestBuildMasterJob_WithHelmMasterResources: Verify master-specific overrides unified
- TestBuildWorkerJob_WithHelmWorkerResources: Verify worker-specific overrides unified
- TestBuildMasterJob_CROverridesHelmRoleSpecific: Verify CR wins over role-specific
- TestBuildMasterJob_HelmRoleSpecific_PrecedenceOverUnified: Verify field-level fallback

Tests fail to compile (RED phase) - fields don't exist yet.
…edence

Complete the three-level resource precedence chain:
- Level 1: CR-level resources (complete override, highest precedence)
- Level 2: Role-specific operator config (from Helm masterResources/workerResources)
- Level 3: Unified operator defaults (from Helm resources)

Config changes:
- Add 12 role-specific fields to OperatorConfig (6 master, 6 worker)
- LoadConfig reads MASTER_POD_* and WORKER_POD_* env vars with empty defaults
- validateResourceQuantities validates non-empty role-specific values

Job changes:
- buildResourceRequirementsWithPrecedence implements 3-level precedence
- Field-level fallback: empty role-specific → unified (backward compatible)
- CR resources remain complete override (not partial merge)

All tests pass (GREEN phase complete).
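
The precedence chain can be illustrated with plain strings. The `Resources` type and `resolveResources` function below are hypothetical simplifications of `corev1.ResourceRequirements` and `buildResourceRequirementsWithPrecedence`, and the sketch assumes a nil or all-empty CR block counts as "not specified".

```go
package main

import "fmt"

// Resources is a simplified stand-in for corev1.ResourceRequirements;
// an empty string means "not set".
type Resources struct {
	CPURequest, MemRequest, CPULimit, MemLimit string
}

// firstNonEmpty returns a if it is set, otherwise b.
func firstNonEmpty(a, b string) string {
	if a != "" {
		return a
	}
	return b
}

// resolveResources applies the three levels:
//  1. CR resources, used as a complete override (no merging),
//  2. role-specific operator config, field by field,
//  3. unified operator defaults for any field the role config leaves empty.
func resolveResources(cr *Resources, role, unified Resources) Resources {
	if cr != nil && *cr != (Resources{}) {
		return *cr
	}
	return Resources{
		CPURequest: firstNonEmpty(role.CPURequest, unified.CPURequest),
		MemRequest: firstNonEmpty(role.MemRequest, unified.MemRequest),
		CPULimit:   firstNonEmpty(role.CPULimit, unified.CPULimit),
		MemLimit:   firstNonEmpty(role.MemLimit, unified.MemLimit),
	}
}

func main() {
	unified := Resources{CPURequest: "250m", MemRequest: "128Mi", CPULimit: "1000m", MemLimit: "1024Mi"}
	masterRole := Resources{CPULimit: "2000m"} // only one field overridden

	// No CR resources: role-specific wins per field, the rest falls back.
	fmt.Printf("%+v\n", resolveResources(nil, masterRole, unified))

	// CR resources present: complete override, unset fields stay unset.
	fmt.Printf("%+v\n", resolveResources(&Resources{CPURequest: "4"}, masterRole, unified))
}
```

The asymmetry is intentional in this model: operator-level settings merge field by field for backward compatibility, while a CR that specifies resources is taken exactly as written.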
- Remove MutatingWebhookConfiguration block (lines 60-90)
- CRD conversion uses spec.conversion.webhook on CRD, not admission webhook
- Keep ValidatingWebhookConfiguration for CR validation (correct)
- Keep webhook Service for routing API server traffic
- Update comments to clarify conversion is handled by CRD spec
- controller-runtime serves /convert endpoint automatically
- Add tmp emptyDir volume at /tmp for controller-runtime temp files
- Webhook certs mount overlays at subdirectory /tmp/k8s-webhook-server/serving-certs
- readOnlyRootFilesystem: true now works with webhook server
- Change replicaCount default from 2 to 1 (backward compatible with Java operator)
- Update comment to recommend 2+ replicas with leader election for HA
- No volumes/mounts when webhooks disabled (clean default install)
- ConfigMaps/Secrets reduced to read-only (get, list, watch)
- Jobs reduced to immutable pattern (get, list, watch, create, delete)
- Services reduced to create/delete lifecycle (no update/patch)
- Leases conditional on leaderElection.enabled flag
- Leases verbs reduced (removed delete, kept get/list/watch/create/update/patch)
- Updated header comments to reflect read-only and immutable patterns
- Changed all helpers from checking parent keys to checking full path with leaf values
- Pod resources: Check .Values.locustPods.resources.requests.cpu exists (not just .Values.locustPods)
- Boolean helpers (affinity, tolerations): Use hasKey to handle false values correctly
- TTL helper: Check actual ttlSecondsAfterFinished value exists
- Metrics exporter: Check full nested path for image, port, pullPolicy, and resources
- Kafka: Check bootstrapServers and security.enabled with full paths
- Updated header comment to explain leaf-value checking precedence
- Enables true backward compat: old config paths work when new paths are absent
…stall

- Add PDB template (disabled by default per user decision)
- Add podDisruptionBudget config section to values.yaml
- Remove crd.install no-op value (Helm crds/ always installs unconditionally)
- PDB supports HA deployments with replicaCount >= 2
…patibility

- Remove symlink to ../../locust-k8s-operator-go/config/crd/bases
- Copy locust.io_locusttests.yaml directly into charts/crds/
- Fixes helm package in CI environments where symlink target may not resolve
…ng slash

- Changed libVolumeName from "locust-lib" to "lib" in webhook to match resources package
- Removed trailing slash from DefaultMountPath ("/lotest/src/" → "/lotest/src")
- Updated test to validate "lib" instead of "locust-lib"
- Both packages now use "lib" as the canonical lib volume name
…iliation

- Remove Generation > 1 NO-OP guard from reconciler
- Phase-based state machine now drives all reconciliation
- Pending phase creates resources regardless of generation number
- Add V(1) info message for generation > 1 (informational only)
- Rewrite tests to verify phase-based behavior
- Fixes operator-restart edge case where modified-but-unprocessed CRs would never get resources created
- Add IsOTelEnabled check to BuildAnnotations for master pods
- When OTel is enabled, Locust exports metrics natively via OTLP
- No sidecar container or Prometheus scrape annotations needed with OTel
- Pattern matches existing OTel suppression in service.go and job.go
- Add test TestBuildAnnotations_Master_NoPrometheusWhenOTelEnabled
- Rename existing test for clarity
- Check for missing Service, master Job, and worker Job during reconcileStatus
- Reset Phase to Pending when resources are externally deleted
- Emit Warning event with descriptive message
- Pending phase triggers createResources on next reconcile (self-healing loop)
- Standard controller-runtime requeue handles exponential backoff
- TestReconcile_ExternalDeletion_MasterService: verify Service deletion triggers recovery
- TestReconcile_ExternalDeletion_MasterJob: verify master Job deletion triggers recovery
- TestReconcile_ExternalDeletion_WorkerJob: verify worker Job deletion triggers recovery
- All tests verify Warning event emission, Phase reset to Pending, and self-healing recreation
- Full test suite passes (21/21 controller integration tests)
…, timeout

- Fix ko build path from ./cmd/main.go to ./cmd (CICD-01)
- Align ko version to @v0.7 matching ci.yaml (CICD-02)
- Remove unnecessary packages: write permission (CICD-12)
- Add timeout-minutes: 30 for job reliability (CICD-03)
- Replace askcloudarchitech fork with upstream helm/chart-releaser-action@v1 (CICD-05)
- Add needs: [publish-image] to helm-chart-release for proper ordering
- Add timeout-minutes: 15 to helm-chart-release (CICD-03)
- Add timeout-minutes: 15 to docs-release (CICD-03)
- Remove stale comments about upstream PR
- Keep docs-release independent for parallel execution
- Remove needs: dependencies from lint-test-helm and docs-test jobs
- Jobs now run in parallel (build-go, lint-test-helm, docs-test)
- Add timeout-minutes: 30 to build-go and lint-test-helm jobs
- Add timeout-minutes: 15 to docs-test job
- Increase ct.yaml helm timeout from 120s to 300s for operator deployments
- Upload Go test artifacts (cover.out) on build-go failure
- Upload kind cluster logs on lint-test-helm failure
- Upload docs build output (site/) on docs-test failure
- All artifacts retained for 7 days
- Add permissions: read-all for least privilege (CICD-06)
- Replace manual Kind download with helm/kind-action@v1.12.0 (CICD-07)
- Add go mod tidy verification check on go.mod and go.sum (CICD-08)
- Add timeout-minutes: 30 to prevent runaway jobs (CICD-03)
- Add artifact upload on failure for debugging (CICD-11)
- Remove locust-k8s-operator-go/.github/workflows/lint.yml
- Remove locust-k8s-operator-go/.github/workflows/test.yml
- Remove locust-k8s-operator-go/.github/workflows/test-e2e.yml

These workflows are dead code - GitHub Actions only reads from repo root
.github/workflows/. The nested workflows were leftover from when the Go
operator was a standalone repository. All functionality is covered by
root-level ci.yaml, go-test-e2e.yml, and release.yaml.
- Fix imagePullSecrets to use LocalObjectReference format (- name: gcr-secret)
- Reframe Kafka section to explain two-level config model
- Document operator-level centralized configuration approach
- Document per-test override capability
- Remove deprecated framing from Kafka documentation
- Correct metrics.secure comment to state default is false
- Replace deprecated jaeger exporter with otlphttp in OTel collector config
- Update exporter endpoint to http://jaeger-collector:4318 (OTLP HTTP)
- Change "Operator SDK" to "controller-runtime" in comparison table
- Removes ambiguity (Operator SDK could mean CLI tool or framework)
- controller-runtime is the actual framework used (verified in go.mod)
… limit

- helm_deploy.md: add --namespace locust-system --create-namespace to all install examples
- getting_started.md: add missing image field to v2 lib configmap example
- migration.md: fix operator CPU limit from 100m to 500m (matches values.yaml)

Fixes: DOCS-02, DOCS-04, DOCS-06
…uctions

- Remove all locust-k8s-operator-go/ subdirectory references across docs
- Update local-development.md with note about dev vs production namespaces
- Update integration-testing.md directory tree to show repo root structure
- Update pull-request-process.md to remove subdirectory path reference
- Simplify Kind cluster name in integration-testing.md

Fixes: DOCS-05, DOCS-12
- Remove os.Chdir from test/utils/utils.go (global side effect causing test interference)
- Fix E2E conversion script to use correct label selectors (performance-test-pod-name)
- Replace app=locust-master/worker with performance-test-pod-name=<name>-master/worker
…sition events

- Define Phase as typed enum (type Phase string) for compile-time safety
- Add ObservedGeneration field to LocustTestStatus for controller progress tracking
- Update derivePhaseFromJob to return typed Phase
- Add phase transition event recording (TestStarted, TestCompleted, TestFailed)
- Update ObservedGeneration on all status updates in controller
- Document ConnectedWorkers as approximation from Job.Status.Active
- Remove unused RestartPolicyNever constant (code uses corev1.RestartPolicyNever)
- Remove unused LocustContainerName constant
- Add Chart.yaml metadata: kubeVersion, home, sources, maintainers
- Add fullnameOverride and nameOverride support in values and _helpers.tpl
- Add extraEnv support for custom operator environment variables
- Add terminationGracePeriodSeconds configuration
- Add configurable podSecurityContext and containerSecurityContext
- Add health_check extension to OTel Collector config
- Create values.schema.json for Helm value validation
- Create NOTES.txt with post-install instructions and examples
…ability

- Add terminationGracePeriodSeconds to deployment from values
- Use configurable podSecurityContext and containerSecurityContext
- Add extraEnv support to deployment env section
- Gate Kafka environment variables behind kafka.enabled condition
- Add ConfigMap checksum annotation to OTel Collector for auto-rollout
- Add liveness and readiness HTTP probes on port 13133 to OTel Collector
- Add health port (13133) to OTel Collector container ports
- Add TestValidateUpdate_Invalid for webhook update validation (secret mounts, volumes, OTel)
- Add TestValidateCreate_LongCRName boundary test for CR name length validation
- Add validateCRName function to enforce 63-char limit on generated resource names
- Create locusttest-with-scheduling.yaml E2E fixture with affinity and tolerations
- Add LoadV2Fixture/MustLoadV2Fixture functions for v2 test fixtures
- Create locusttest_v2_full.json comprehensive v2 fixture
…type in tests

- Add TestUpdateStatusFromJobs_FullStateMachine covering all phase transitions
- Add TestUpdateStatusFromJobs_PhaseTransitionEvents verifying event emission
- Add TestUpdateStatusFromJobs_NoEventOnSamePhase for idempotency verification
- Add TestUpdateStatusFromJobs_ObservedGeneration verifying generation tracking
- Add TestDerivePhaseFromJob_TypeSafety verifying typed Phase return
- Add TestUpdateStatusFromJobs_WorkersConnectedCondition for worker tracking
- Fix existing tests to use typed Phase (string conversion for comparisons)
- Update CRD manifests for ObservedGeneration field
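
The typed phase and the no-event-on-same-phase behavior can be sketched as below. The event reasons come from the commit messages above; the phase names other than Pending and the `transitionEvent` helper are assumptions.

```go
package main

import "fmt"

// Phase is a named string type, so plain strings cannot be passed where a
// Phase is expected without an explicit conversion.
type Phase string

const (
	PhasePending   Phase = "Pending"
	PhaseRunning   Phase = "Running"
	PhaseSucceeded Phase = "Succeeded"
	PhaseFailed    Phase = "Failed"
)

// transitionEvent (hypothetical helper) returns the event reason to record
// when moving from old to next. It reports false when the phase is unchanged
// or when entering Pending, so repeated reconciles stay idempotent.
func transitionEvent(old, next Phase) (string, bool) {
	if old == next {
		return "", false
	}
	// Every Phase value is listed so an exhaustiveness linter can flag a
	// newly added phase that is not handled here.
	switch next {
	case PhaseRunning:
		return "TestStarted", true
	case PhaseSucceeded:
		return "TestCompleted", true
	case PhaseFailed:
		return "TestFailed", true
	case PhasePending:
		return "", false
	}
	return "", false
}

func main() {
	steps := []Phase{PhasePending, PhaseRunning, PhaseRunning, PhaseSucceeded}
	for i := 1; i < len(steps); i++ {
		if reason, ok := transitionEvent(steps[i-1], steps[i]); ok {
			fmt.Printf("%s -> %s: event %s\n", steps[i-1], steps[i], reason)
		}
	}
}
```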
… tests

- Remove unused logf import and blank identifier assignment in api/v1/locusttest_webhook.go
- Add comprehensive OTel integration tests in internal/resources/env_test.go
  - TestBuildEnvVars_WithOTel_Enabled: verifies OTel env vars correctly merged with Kafka and user vars
  - TestBuildEnvVars_WithOTel_Disabled: verifies no OTel vars when disabled
  - TestBuildEnvVars_WithOTel_NoObservabilityConfig: verifies no OTel vars when config absent
  - TestBuildEnvVars_OTel_HTTPProtocol: verifies HTTP protocol support
  - TestBuildEnvVars_OTel_EnvVarOrder: verifies correct precedence (Kafka, OTel, user)

Note: PDB template was already implemented in a previous phase
…iler errors

- Add gosec, errorlint, exhaustive linters to .golangci.yml
- Configure errorlint with all checks enabled
- Configure exhaustive with strict mode (default-signifies-exhaustive: false)
- Add error wrapping with %w to all reconciler error returns
- Add PhasePending case to switch statement (exhaustive requirement)
- Fix gofmt formatting in config.go
- Add justified nolint:gosec directives for test code with known safe inputs
…string helpers

- Move BuildKafkaEnvVars function from job.go to env.go for logical grouping
- Move strconv import from job.go to env.go (only used by BuildKafkaEnvVars)
- Replace custom contains()/containsHelper() with stdlib strings.Contains()
- Add strings import to controller unit test
- Extract parseFlags helper for command-line flag parsing
- Extract configureTLS helper for TLS options
- Extract setupWebhookServer helper for webhook configuration
- Extract setupMetricsServer helper for metrics configuration
- Extract setupControllers helper for controller registration
- Extract setupHealthChecks helper for health probes
- Remove nolint:gocyclo directive (no longer needed)
- Change leader election ID to locust-k8s-operator.locust.io (CORE-27)
- Update PROJECT file to track v2 API (CORE-21)
- Pin Dockerfile golang version to 1.24.0 (CORE-23)
- Add comprehensive .dockerignore for build optimization (CORE-18)
- Add cover.out, coverage.out, venv/ to .gitignore (CORE-20)
- Remove create/delete verbs from locusttests RBAC (CORE-28)
- Remove finalizers permission from RBAC (CORE-29)
- Update kubebuilder RBAC markers in controller
- Add map ordering comment to buildNodeSelector (CORE-15)
- Remove all Java source code (src/)
- Remove Gradle build system (build.gradle, gradle/, gradlew, settings.gradle)
- Remove Java tooling config (lombok.config, micronaut-cli.yml)
- Remove legacy Kubernetes manifests (kube/)
- Remove old integration scripts (scripts/)
- Remove legacy planning directories (issue-analysis/, v2/)
- Remove build artifacts (build/, .gradle/)

Archive branch archive/java-operator-v1 preserves Java code from master.
Go operator in locust-k8s-operator-go/ remains intact.
…y root

Move all Go operator files from locust-k8s-operator-go/ to repository root using git mv to preserve file history. The Go operator is now the primary codebase and belongs at the root level.

Files moved:
- Go module (go.mod, go.sum)
- Source code (cmd/, api/, internal/, test/)
- Build system (Makefile, Dockerfile, PROJECT)
- Config (config/, hack/, .golangci.yml, .dockerignore)
- DevContainer (.devcontainer/)

Files removed:
- locust-k8s-operator-go/README.md (root README is canonical)
- locust-k8s-operator-go/.gitignore (will merge in Plan 03)
- Build artifacts (bin/, venv/, *.out)
…e for root-level structure

- Remove working-directory from build-go job in ci.yaml
- Update go-version-file paths from locust-k8s-operator-go/go.mod to go.mod
- Remove working-directory from golangci-lint and ko build steps
- Update artifact paths to reference root-level cover.out
- Update E2E workflow path filters to trigger on root-level Go directories
- Change Makefile IMAGE_TAG_BASE to io/locust-k8s-operator (remove -go)
- Change Makefile KIND_CLUSTER to locust-k8s-operator-test-e2e
- Update buildx builder name to locust-k8s-operator-builder
- Change PROJECT projectName to locust-k8s-operator
- Replace root .gitignore with merged Go and project entries
…-k8s-operator

- Change namespace from locust-k8s-operator-go-system to locust-k8s-operator-system
- Update namePrefix in config/default/kustomization.yaml
- Update all resource names, labels, and references across config/ manifests
- Update E2E test constants: namespace, serviceAccountName, metricsServiceName
- Update projectImage reference in e2e_suite_test.go
- Update clusterrole name in metrics test
- Change CSV filename reference in config/manifests/kustomization.yaml
- All config and test files now use standard locust-k8s-operator naming
- Create MIGRATION.md documenting Java-to-Go transition with archive branch reference
- Create SECURITY.md with vulnerability reporting policy
- Create .editorconfig with Go and YAML formatting standards
- Update README.md to reflect Go operator (remove Java usage instructions)
- Update .pre-commit-config.yaml to remove Java-specific hooks (gradle-check, gradle-spotless)
- Update .cz.yaml to remove build.gradle reference and set version to 2.0.0
- Remove .devcontainer/devcontainer.json
- Remove .devcontainer/post-install.sh
… code clarity

- Change Requeue: true to RequeueAfter: time.Second for resource deletion detection
- Update unit tests to assert RequeueAfter > 0 instead of Requeue == true
- Preallocate formatErrors slice with capacity for better performance
- Replace if-else chain with switch statement in buildResourceRequirementsWithPrecedence
- Move github.com/go-logr/logr from indirect to direct dependency
…metrics setup

- Wrap metrics-cert-name flag description to comply with line length limits
- Add nolint:lll directive to setupMetricsServer function signature
- Rename lib volume from "lib" to "locust-lib" for clarity
- Add cluster_name to E2E workflow Kind cluster creation
- Fix StartTime to only set on first creation (preserve across reconciles)
- Add Ready=false status when test fails
- Add WorkersConnected=false condition when workers missing
- Fix flag conflict detection to match exact flags (not just prefixes)
- Update all tests to reflect locust-lib volume name
- Change testdata path from "test/e2e/testdata" to "testdata" in all E2E tests
- Update locusttest_e2e_test.go, otel_e2e_test.go, v1_compatibility_test.go, and validation_e2e_test.go
@AbdelrhmanHamouda AbdelrhmanHamouda merged commit 38c14a5 into feat/go Feb 10, 2026
6 of 8 checks passed
@AbdelrhmanHamouda AbdelrhmanHamouda deleted the 001-go-operator-manual-test-plan branch February 10, 2026 11:46
AbdelrhmanHamouda added a commit that referenced this pull request Feb 13, 2026
* chore: prep work for v2

* feat: scaffold Go operator with Operator SDK v1.42.0 (#261)

* feat: implement v1 API types matching Java CRD schema (#262)

* feat: add phase 1 v1 API types implementation plan

* feat: implement v1 API types matching Java CRD schema

* feat: implement configuration system for Go operator (#263)

* feat: implement configuration system for Go operator

- Add OperatorConfig struct with 26 fields matching Java SysConfig
- Implement LoadConfig() with environment variable loading
- Add helper functions: getEnv, getEnvBool, getEnvInt32, getEnvInt32Ptr
- Achieve 100% test coverage for config package
- Update Phase 2 notes and checklist

* feat: add Go workflows to root .github/workflows for PR execution

* fix: rename workflow jobs and fix invalid ubuntu-slim runner

* chore: cleanup readme

* feat: implement Phase 3 resource builders for Jobs and Services

* feat: implement core reconciler for LocustTest CR (#265)

* feat: implement core reconciler for LocustTest CR

- Add Reconcile() with NO-OP on updates (generation > 1)
- Add createResources() to build and create Service + Jobs
- Add createResource() helper with owner references
- Wire config and event recorder in main.go
- Add RBAC for jobs, services, and events
- Configure controller with Owns() and GenerationChangedPredicate

* style: align inline comments in SetupWithManager method

* docs: update phase 4 status to complete

* test: add tests (#266)

* test: add comprehensive unit tests for controller and resources (Phase 5)

- Add 17 controller unit tests using fake client for 77.3% coverage
- Add 10 edge case tests for resources package (97.7% coverage)
- Create test fixtures in internal/testdata/ for reusable test data
- Add LoadLocustTest/MustLoadLocustTest fixture helpers
- All coverage targets exceeded (config 100%, controller 77%, resources 98%)

* test: add error path tests for controller achieving 97.7% coverage

- Add tests for API Get error handling
- Add tests for resource creation failures (Service, Master Job, Worker Job)
- Add test for SetControllerReference error
- Controller coverage improved from 77.3% to 97.7%
- Total project coverage now at 98.0%

* test: add controller integration tests using envtest framework

* feat: implement v2 API types with grouped configuration and new features (#267)

* feat: implement v2 API types with grouped configuration and new features

* chore: fix lint

* feat: implement v1↔v2 conversion webhook with E2E tests (#268)

* feat: implement v1↔v2 conversion webhook with E2E tests

- Add Hub() marker to v2 LocustTest
- Implement ConvertTo/ConvertFrom in v1
- Mark v2 as storage version
- Add deprecation warning to v1
- Add conversion unit tests (15 tests)
- Add E2E tests for webhook in Kind cluster (7 tests)
- Configure cert-manager and webhook kustomize
- Generate webhook configuration

* fix: resolve golangci-lint warnings in conversion code

* test: fix tests

* feat: implement status subresource and migrate to v2 API (#269)

* feat: implement status subresource and migrate to v2 API

- Migrate controller and resource builders from v1 to v2 API
- Add status tracking with phase, conditions, and worker counts
- Create status.go with helper functions for status management
- Update all test files to use v2 API types
- Simplify test infrastructure by removing v1-only test CRD
- Integration tests now use main CRD with v2 as storage version

* feat: implement environment and secret injection (Issue #149)

* fix: return nil from BuildEnvFrom when no refs are configured

* fix: refetch LocustTest before status update to prevent conflicts

- Add Get() call before status update to obtain latest resource version
- Wrap spec updates in Eventually() to handle concurrent status updates
- Add status initialization waits before attempting spec updates
- Prevents "the object has been modified" errors in integration tests

* feat: implement volume mounting with target filtering (Issue #252) (#270)

* feat: add OpenTelemetry support for native Locust metrics export (#271)

* feat: rewrite Helm chart for Go operator v2.0.0 and revamp ci and tests (#272)

* feat: rewrite Helm chart for Go operator v2.0.0

* feat: rewrite CI/CD pipeline for Go operator

- Replace Java/Gradle build with Go build/lint/test in ci.yaml
- Replace Jib with ko for multi-arch Docker builds in release.yaml
- Add ci and ci-coverage Makefile targets
- Delete redundant go-lint.yml, go-test.yml, integration-test.yml

* test: add comprehensive E2E tests for LocustTest lifecycle

* fix: update Helm chart CRD symlink to Go operator CRD

* fix: change CI trigger from pull_request_target to pull_request

* fix: correct testdata paths in E2E tests and resolve integration test conflict

* fix: deploy operator in E2E BeforeSuite to create namespace and CRDs

* fix: add time unit to Helm timeout in chart-testing config

* fix: revert operator deployment from BeforeSuite to avoid conflict with e2e_test.go

* fix: consolidate operator deployment to BeforeSuite to resolve test conflicts

Moved operator deployment from e2e_test.go BeforeAll to e2e_suite_test.go
BeforeSuite to avoid duplicate deployments. The make deploy command includes
CRDs, namespace creation, and operator deployment, which was causing conflicts
when run multiple times across different test files.

* fix: use ko to build operator image with dynamic version for helm CI

- Install ko build tool in helm CI job
- Extract appVersion from Chart.yaml dynamically
- Build operator image with ko using chart's appVersion as tag
- Load image into kind cluster to fix image pull failures
- Set image.pullPolicy=Never to use local image
- Steps only run when chart changes detected

* fix: correct v2 testdata image field to string instead of object

* fix: use full Go import path for ko build in helm CI

- Change ko build from ./cmd/main.go to full import path
- Extract appVersion from Chart.yaml dynamically
- Build operator image with ko using chart's appVersion as tag
- Load image into kind cluster to fix image pull failures
- Set image.pullPolicy=Never to use local image

* fix: use ko to build operator image with correct package path for helm CI

- Install ko build tool in helm CI job
- Extract appVersion from Chart.yaml dynamically
- Build operator image with ko using ./cmd package directory
- Load image into kind cluster to fix image pull failures
- Set image.pullPolicy=Never to use local image

* fix: specify kind cluster name for image loading in helm CI

Add --name chart-testing flag to kind load command to ensure operator image is loaded into the correct cluster created by chart-testing tool.

* fix: correct container names in e2e tests

Updated e2e tests to use actual container names (crName+"-master"/"-worker")
instead of hardcoded "locust" when querying Job container properties via
JSONPath. Container names are dynamically generated as nodeName in the
controller, not a static "locust" value.

* fix: update helm chart for ko-built images compatibility

- Remove hardcoded /manager command override
- Add explicit UID/GID 65532 for distroless compatibility
- Document ko's automatic entrypoint handling

* fix: add ENABLE_WEBHOOKS environment variable to helm chart

- Pass webhook.enabled value as ENABLE_WEBHOOKS env var
- Prevents operator from registering webhooks when disabled
- Fixes crash when webhook certs are not available

* fix: configure kustomize deployment for ko and webhooks

- Remove hardcoded /manager command from base deployment
- Add webhook-cert-path argument to webhook patch
- Enables v1 to v2 conversion webhook for E2E tests

* fix: enable metrics server in kustomize webhook deployment

- Add metrics-bind-address argument to webhook patch
- Fixes E2E test expecting metrics server to start
- Webhook patch now includes all required arguments

* docs: add v2.0 migration guide and update documentation for new API

- Add v2.0 announcement section to README with performance improvements table
- Document new v2 features: OpenTelemetry, secret injection, volume mounting
- Update advanced topics with v2 API examples alongside v1 (deprecated) examples
- Add separate resource specs documentation for master/worker pods
- Document OpenTelemetry integration with environment variables and configuration
- Add environment and secret injection examples (

* docs: add anchor links to v2 feature sections in advanced topics

- Add HTML anchor IDs to OpenTelemetry, Environment Injection, Volume Mounting, and Separate Resource Specs sections
- Update index.md to link to specific sections instead of generic advanced_topics.md
- Move v2.0 announcement from admonition to grid cards layout on homepage
- Remove duplicate "What's New" section from migration guide

* docs: fix and improve 

* docs: add node selector documentation and fix OTel extraEnvVars format in v2 API

- Add node selector section to advanced_topics.md with v2 API examples
- Document node selector vs affinity differences
- Fix OTel extraEnvVars from []corev1.EnvVar to map[string]string format
- Update OTel protocol documentation to include http/protobuf option
- Add node selector to sample CR v2 example
- Implement buildNodeSelector() in job.go to apply node selector to pods
- Fix getting_started.md v1 API example

* docs: update helm chart documentation for service account and metrics exporter resources

- Clarify serviceAccount.name behavior when empty vs create=false
- Add ephemeralStorage resource fields for metrics exporter (requests/limits)
- Update default serviceAccount.name value from "default" to empty string

* docs: add Locust test metrics documentation and implement configurable autostart/autoquit

- Add ServiceMonitor examples for Prometheus Operator (HTTP and HTTPS)
- Document operator metrics queries (reconciliation rate, errors, duration)
- Add comprehensive Locust test metrics section with exporter sidecar details
- Document available Locust metrics and PromQL query examples
- Add integration examples for Prometheus, Grafana, NewRelic, DataDog
- Clarify mutual exclusivity between Prometheus exporter and Open

* docs: add Kafka anchor link and expand v2 migration documentation

- Add anchor link to Kafka & AWS MSK configuration section
- Clarify operator resource defaults are for controller, not test pods
- Add v2-only fields to migration mapping table (resources, extraArgs, autostart, autoquit)
- Add affinity conversion warning footnote about lossy v2→v1 transformation
- Document complete list of v2-only fields lost during v1 conversion
- Group lossy conversion details by category (Master/Worker, Test Files, Scheduling

* docs: improve

* chore: Address all 86 review items from Go operator rewrite PR (#275)

* chore: add agentic workflow directories to gitignore

Add .windsurf/, .planning/, .specify/, specs/, CLAUDE.md, and GEMINI.md to gitignore to exclude AI-assisted development workflow files and specifications from version control.

* docs: initialize project

Comprehensive fix pass on Go operator rewrite (PR #274) — addressing all 9 critical, ~25 major, and 50+ minor issues from 8 expert reviewers, plus repo restructuring.

* chore: remove tracked planning docs (now gitignored)

* test(01-01): add failing tests for extraArgs wiring and conflict detection

- Add tests for BuildMasterCommand with extraArgs
- Add tests for BuildWorkerCommand with extraArgs
- Add tests for detectFlagConflicts function
- Add tests for conflict warning logging
- Update existing tests to pass new logger parameter
- Tests currently fail (RED phase) - functions not yet implemented

* feat(01-01): implement extraArgs wiring with conflict detection

- Add operatorManagedFlags registry containing all operator-controlled flags
- Implement detectFlagConflicts function to identify conflicting extraArgs
- Wire extraArgs into BuildMasterCommand with conflict warning logging
- Wire extraArgs into BuildWorkerCommand with conflict warning logging
- Update BuildMasterJob and BuildWorkerJob to accept logger parameter
- Update controller to pass logger to job builders
- Update all test calls to pass logger parameter
- ExtraArgs are appended after operator-managed flags (last occurrence wins when a flag repeats)
- All tests now pass (GREEN phase complete)
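
As a rough illustration of the conflict detection described above, here is a minimal sketch. The contents of `operatorManagedFlags` are an assumption chosen for the example (the real registry lives in the operator), and the function only reports conflicts so that a caller can log a warning.

```go
package main

import (
	"fmt"
	"strings"
)

// operatorManagedFlags is an illustrative registry, not the operator's actual list.
var operatorManagedFlags = map[string]bool{
	"--master":      true,
	"--worker":      true,
	"--master-host": true,
	"--master-port": true,
}

// detectFlagConflicts returns the names of extraArgs entries that collide
// with an operator-managed flag. "--flag=value" is reduced to "--flag"
// before the lookup, and only exact name matches count as conflicts.
func detectFlagConflicts(extraArgs []string) []string {
	var conflicts []string
	for _, arg := range extraArgs {
		name, _, _ := strings.Cut(arg, "=")
		if operatorManagedFlags[name] {
			conflicts = append(conflicts, name)
		}
	}
	return conflicts
}

func main() {
	extra := []string{"--master-port=6000", "--csv", "results", "--master-hosts"}
	// "--master-hosts" is not reported: it merely shares a prefix with a managed flag.
	fmt.Println(detectFlagConflicts(extra))
}
```

Matching on the exact flag name avoids false positives for user flags that only share a prefix with a managed one.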

* test(01-02): add failing tests for safe resource parsing

RED Phase - Add tests for:
- LoadConfig() returns error for invalid resource quantities
- Multiple invalid values are collected and reported
- Valid resource quantities pass validation
- Empty resource strings are treated as optional

All existing LoadConfig() call sites updated to handle error return.

* feat(01-02): implement safe resource parsing and CR precedence

GREEN Phase - Implementation:

1. Config validation (config.go):
   - LoadConfig() now returns (*OperatorConfig, error)
   - validateResourceQuantities() checks all resource env vars
   - Invalid values produce clear error messages
   - Multiple invalid values collected and reported together
   - Empty strings treated as optional (not validated)

2. Safe parsing (job.go):
   - Replace resource.MustParse() with resource.ParseQuantity()
   - Safe because values pre-validated at startup
   - buildResourceList() now uses ParseQuantity with error ignored

3. Resource precedence (job.go):
   - buildResourceRequirementsWithPrecedence() implements CR → defaults
   - hasResourcesSpecified() helper distinguishes nil vs empty
   - CR resources are a complete override (not a partial merge)
   - buildLocustContainer() uses new precedence function

4. Main error handling (main.go):
   - Handles LoadConfig error, logs and exits with code 1
   - Clear error messages on startup failure

5. Test updates (suite_test.go):
   - Controller test suite handles new LoadConfig error return

All tests pass. No MustParse in production code.

* feat(01-03): add Helm masterResources and workerResources fields

- Add masterResources and workerResources to values.yaml (default empty)
- Add 12 helper templates in _helpers.tpl for role-specific resources
- Add conditional env vars in deployment.yaml (via envVars helper)
- Empty resources means 'use unified resources' (backward compatible)
- Helm lint passes, default rendering excludes role-specific env vars

* test(01-03): add integration tests for extraArgs and resource precedence

- Add 7 integration tests for BuildMasterJob/BuildWorkerJob
- Test extraArgs appear in generated Job command
- Test CR resources override operator defaults
- Test empty CR resources fall back to defaults
- Test master/worker resources are independent
- All tests pass, verify full Phase 1 wiring works end-to-end

* test(01-04): add failing tests for role-specific resource precedence

Add tests for 3-level resource precedence (CR > role-specific > unified):
- TestLoadConfig_RoleSpecificResourceDefaults: Verify 12 fields default to empty
- TestLoadConfig_RoleSpecificResourceOverrides: Verify env vars are loaded
- TestLoadConfig_InvalidRoleSpecificResource: Verify validation errors
- TestBuildMasterJob_WithHelmMasterResources: Verify master-specific overrides unified
- TestBuildWorkerJob_WithHelmWorkerResources: Verify worker-specific overrides unified
- TestBuildMasterJob_CROverridesHelmRoleSpecific: Verify CR wins over role-specific
- TestBuildMasterJob_HelmRoleSpecific_PrecedenceOverUnified: Verify field-level fallback

Tests fail to compile (RED phase) - fields don't exist yet.

* feat(01-04): implement role-specific config and 3-level resource precedence

Complete the three-level resource precedence chain:
- Level 1: CR-level resources (complete override, highest precedence)
- Level 2: Role-specific operator config (from Helm masterResources/workerResources)
- Level 3: Unified operator defaults (from Helm resources)

Config changes:
- Add 12 role-specific fields to OperatorConfig (6 master, 6 worker)
- LoadConfig reads MASTER_POD_* and WORKER_POD_* env vars with empty defaults
- validateResourceQuantities validates non-empty role-specific values

Job changes:
- buildResourceRequirementsWithPrecedence implements 3-level precedence
- Field-level fallback: empty role-specific → unified (backward compatible)
- CR resources remain a complete override (not a partial merge)

All tests pass (GREEN phase complete).
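
The precedence chain above could be sketched as follows. This is a simplified model, not the operator's code: `Resources` is a hypothetical stand-in for `corev1.ResourceRequirements` with plain strings, and a nil pointer stands for "the CR did not specify resources".

```go
package main

import "fmt"

// Resources is a simplified, hypothetical stand-in for corev1.ResourceRequirements.
type Resources struct {
	CPURequest string
	MemRequest string
}

// resolve applies the three levels: CR > role-specific > unified.
// A non-nil CR value replaces everything (no partial merge); otherwise each
// empty role-specific field falls back to the unified default.
func resolve(cr *Resources, roleSpecific, unified Resources) Resources {
	if cr != nil {
		return *cr
	}
	out := unified
	if roleSpecific.CPURequest != "" {
		out.CPURequest = roleSpecific.CPURequest
	}
	if roleSpecific.MemRequest != "" {
		out.MemRequest = roleSpecific.MemRequest
	}
	return out
}

func main() {
	unified := Resources{CPURequest: "250m", MemRequest: "128Mi"}
	master := Resources{CPURequest: "500m"} // memory left empty on purpose

	// No CR resources: CPU comes from the role-specific value, memory from unified.
	fmt.Printf("%+v\n", resolve(nil, master, unified))

	// CR resources set: used as-is, even though MemRequest is empty.
	fmt.Printf("%+v\n", resolve(&Resources{CPURequest: "1"}, master, unified))
}
```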

* fix(02-01): remove MutatingWebhookConfiguration for CRD conversion

- Remove MutatingWebhookConfiguration block (lines 60-90)
- CRD conversion uses spec.conversion.webhook on CRD, not admission webhook
- Keep ValidatingWebhookConfiguration for CR validation (correct)
- Keep webhook Service for routing API server traffic
- Update comments to clarify conversion is handled by CRD spec
- controller-runtime serves /convert endpoint automatically

* fix(02-01): add emptyDir /tmp volume and fix replicaCount default

- Add tmp emptyDir volume at /tmp for controller-runtime temp files
- Webhook certs mount overlays at subdirectory /tmp/k8s-webhook-server/serving-certs
- readOnlyRootFilesystem: true now works with webhook server
- Change replicaCount default from 2 to 1 (backward compatible with Java operator)
- Update comment to recommend 2+ replicas with leader election for HA
- No volumes/mounts when webhooks disabled (clean default install)

* feat(02-02): tighten RBAC to least privilege with conditional Leases

- ConfigMaps/Secrets reduced to read-only (get, list, watch)
- Jobs reduced to immutable pattern (get, list, watch, create, delete)
- Services reduced to create/delete lifecycle (no update/patch)
- Leases conditional on leaderElection.enabled flag
- Leases verbs reduced (removed delete, kept get/list/watch/create/update/patch)
- Updated header comments to reflect read-only and immutable patterns

* feat(02-02): fix backward compatibility helpers to check leaf values

- Changed all helpers from checking parent keys to checking full path with leaf values
- Pod resources: Check .Values.locustPods.resources.requests.cpu exists (not just .Values.locustPods)
- Boolean helpers (affinity, tolerations): Use hasKey to handle false values correctly
- TTL helper: Check actual ttlSecondsAfterFinished value exists
- Metrics exporter: Check full nested path for image, port, pullPolicy, and resources
- Kafka: Check bootstrapServers and security.enabled with full paths
- Updated header comment to explain leaf-value checking precedence
- Enables true backward compat: old config paths work when new paths are absent

* feat(02-03): add optional PodDisruptionBudget and remove no-op crd.install

- Add PDB template (disabled by default per user decision)
- Add podDisruptionBudget config section to values.yaml
- Remove crd.install no-op value (Helm crds/ always installs unconditionally)
- PDB supports HA deployments with replicaCount >= 2

* fix(02-03): replace CRD symlink with actual file for helm package compatibility

- Remove symlink to ../../locust-k8s-operator-go/config/crd/bases
- Copy locust.io_locusttests.yaml directly into charts/crds/
- Fixes helm package in CI environments where symlink target may not resolve

* refactor(03-01): unify volume name constant and fix mount path trailing slash

- Changed libVolumeName from "locust-lib" to "lib" in webhook to match resources package
- Removed trailing slash from DefaultMountPath ("/lotest/src/" → "/lotest/src")
- Updated test to validate "lib" instead of "locust-lib"
- Both packages now use "lib" as the canonical lib volume name

* feat(03-02): remove generation guard and implement phase-based reconciliation

- Remove Generation > 1 NO-OP guard from reconciler
- Phase-based state machine now drives all reconciliation
- Pending phase creates resources regardless of generation number
- Add V(1) info message for generation > 1 (informational only)
- Rewrite tests to verify phase-based behavior
- Fixes operator-restart edge case where modified-but-unprocessed CRs would never get resources created

* feat(03-02): suppress Prometheus annotations when OTel is enabled

- Add IsOTelEnabled check to BuildAnnotations for master pods
- When OTel is enabled, Locust exports metrics natively via OTLP
- No sidecar container or Prometheus scrape annotations needed with OTel
- Pattern matches existing OTel suppression in service.go and job.go
- Add test TestBuildAnnotations_Master_NoPrometheusWhenOTelEnabled
- Rename existing test for clarity

* feat(03-03): detect externally deleted resources and self-heal

- Check for missing Service, master Job, and worker Job during reconcileStatus
- Reset Phase to Pending when resources are externally deleted
- Emit Warning event with descriptive message
- Pending phase triggers createResources on next reconcile (self-healing loop)
- Standard controller-runtime requeue handles exponential backoff

* test(03-03): add external deletion detection and recovery tests

- TestReconcile_ExternalDeletion_MasterService: verify Service deletion triggers recovery
- TestReconcile_ExternalDeletion_MasterJob: verify master Job deletion triggers recovery
- TestReconcile_ExternalDeletion_WorkerJob: verify worker Job deletion triggers recovery
- All tests verify Warning event emission, Phase reset to Pending, and self-healing recreation
- Full test suite passes (21/21 controller integration tests)

* fix(04-01): correct publish-image job — ko path, version, permissions, timeout

- Fix ko build path from ./cmd/main.go to ./cmd (CICD-01)
- Align ko version to @v0.7 matching ci.yaml (CICD-02)
- Remove unnecessary packages: write permission (CICD-12)
- Add timeout-minutes: 30 for job reliability (CICD-03)

* feat(04-01): enforce release ordering and replace chart-releaser fork

- Replace askcloudarchitech fork with upstream helm/chart-releaser-action@v1 (CICD-05)
- Add needs: [publish-image] to helm-chart-release for proper ordering
- Add timeout-minutes: 15 to helm-chart-release (CICD-03)
- Add timeout-minutes: 15 to docs-release (CICD-03)
- Remove stale comments about upstream PR
- Keep docs-release independent for parallel execution

* feat(04-02): parallelize CI jobs, add timeouts, fix ct.yaml timeout

- Remove needs: dependencies from lint-test-helm and docs-test jobs
- Jobs now run in parallel (build-go, lint-test-helm, docs-test)
- Add timeout-minutes: 30 to build-go and lint-test-helm jobs
- Add timeout-minutes: 15 to docs-test job
- Increase ct.yaml helm timeout from 120s to 300s for operator deployments

* feat(04-02): add artifact uploads on CI job failures

- Upload Go test artifacts (cover.out) on build-go failure
- Upload kind cluster logs on lint-test-helm failure
- Upload docs build output (site/) on docs-test failure
- All artifacts retained for 7 days

* feat(04-03): overhaul E2E workflow with security and reliability fixes

- Add permissions: read-all for least privilege (CICD-06)
- Replace manual Kind download with helm/kind-action@v1.12.0 (CICD-07)
- Add go mod tidy verification check on go.mod and go.sum (CICD-08)
- Add timeout-minutes: 30 to prevent runaway jobs (CICD-03)
- Add artifact upload on failure for debugging (CICD-11)

* chore(04-03): remove dead nested workflow files

- Remove locust-k8s-operator-go/.github/workflows/lint.yml
- Remove locust-k8s-operator-go/.github/workflows/test.yml
- Remove locust-k8s-operator-go/.github/workflows/test-e2e.yml

These workflows are dead code - GitHub Actions only reads from repo root
.github/workflows/. The nested workflows were leftover from when the Go
operator was a standalone repository. All functionality is covered by
root-level ci.yaml, go-test-e2e.yml, and release.yaml.

* fix(05-01): fix imagePullSecrets format and reframe Kafka docs

- Fix imagePullSecrets to use LocalObjectReference format (- name: gcr-secret)
- Reframe Kafka section to explain two-level config model
- Document operator-level centralized configuration approach
- Document per-test override capability
- Remove deprecated framing from Kafka documentation

* fix(05-01): fix metrics.secure comment and replace jaeger with otlphttp

- Correct metrics.secure comment to state default is false
- Replace deprecated jaeger exporter with otlphttp in OTel collector config
- Update exporter endpoint to http://jaeger-collector:4318 (OTLP HTTP)

* docs(05-02): fix status example and update copyright/roadmap

- Replace non-existent masterJob/workerJob fields with actual status fields
- Add expectedWorkers, connectedWorkers, startTime, conditions
- Update copyright year from 2025 to 2026
- Delete stale roadmap.md file
- Remove roadmap from mkdocs.yml navigation

* docs(05-02): fix README framework reference

- Change "Operator SDK" to "controller-runtime" in comparison table
- Removes ambiguity (Operator SDK could mean CLI tool or framework)
- controller-runtime is the actual framework used (verified in go.mod)

* docs(05-03): standardize namespaces, add missing image field, fix CPU limit

- helm_deploy.md: add --namespace locust-system --create-namespace to all install examples
- getting_started.md: add missing image field to v2 lib configmap example
- migration.md: fix operator CPU limit from 100m to 500m (matches values.yaml)

Fixes: DOCS-02, DOCS-04, DOCS-06

* docs(05-03): remove subdirectory references and fix working dir instructions

- Remove all locust-k8s-operator-go/ subdirectory references across docs
- Update local-development.md with note about dev vs production namespaces
- Update integration-testing.md directory tree to show repo root structure
- Update pull-request-process.md to remove subdirectory path reference
- Simplify Kind cluster name in integration-testing.md

Fixes: DOCS-05, DOCS-12

* fix(06-02): remove os.Chdir side effect and fix E2E label selectors

- Remove os.Chdir from test/utils/utils.go (global side effect causing test interference)
- Fix E2E conversion script to use correct label selectors (performance-test-pod-name)
- Replace app=locust-master/worker with performance-test-pod-name=<name>-master/worker

* feat(06-01): add typed Phase enum, ObservedGeneration, and phase transition events

- Define Phase as typed enum (type Phase string) for compile-time safety
- Add ObservedGeneration field to LocustTestStatus for controller progress tracking
- Update derivePhaseFromJob to return typed Phase
- Add phase transition event recording (TestStarted, TestCompleted, TestFailed)
- Update ObservedGeneration on all status updates in controller
- Document ConnectedWorkers as approximation from Job.Status.Active
- Remove unused RestartPolicyNever constant (code uses corev1.RestartPolicyNever)
- Remove unused LocustContainerName constant
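
A minimal sketch of the typed Phase enum and phase transition events follows. Only `PhasePending` and the event reasons are named in the commit log. The other phase names, the `jobStatus` struct, and the mapping rules are assumptions for illustration.

```go
package main

import "fmt"

// Phase is a typed string, so an arbitrary string cannot be assigned by accident.
type Phase string

const (
	PhasePending   Phase = "Pending"
	PhaseRunning   Phase = "Running"   // assumed name
	PhaseSucceeded Phase = "Succeeded" // assumed name
	PhaseFailed    Phase = "Failed"    // assumed name
)

// jobStatus is a hypothetical reduction of batchv1.JobStatus.
type jobStatus struct {
	Active, Succeeded, Failed int32
}

// derivePhase maps job counters to a typed Phase.
func derivePhase(s jobStatus) Phase {
	switch {
	case s.Failed > 0:
		return PhaseFailed
	case s.Succeeded > 0:
		return PhaseSucceeded
	case s.Active > 0:
		return PhaseRunning
	default:
		return PhasePending
	}
}

// transitionEvent returns the event reason for a phase change, or "" when
// the phase is unchanged or no event applies.
func transitionEvent(prev, next Phase) string {
	if prev == next {
		return ""
	}
	switch next {
	case PhaseRunning:
		return "TestStarted"
	case PhaseSucceeded:
		return "TestCompleted"
	case PhaseFailed:
		return "TestFailed"
	case PhasePending:
		return ""
	}
	return ""
}

func main() {
	next := derivePhase(jobStatus{Active: 1})
	fmt.Println(next, transitionEvent(PhasePending, next))
}
```

Listing every constant in the switch, including `PhasePending`, is what a strict exhaustive linter would expect.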

* feat(06-04): add Helm chart metadata, schema validation, and NOTES.txt

- Add Chart.yaml metadata: kubeVersion, home, sources, maintainers
- Add fullnameOverride and nameOverride support in values and _helpers.tpl
- Add extraEnv support for custom operator environment variables
- Add terminationGracePeriodSeconds configuration
- Add configurable podSecurityContext and containerSecurityContext
- Add health_check extension to OTel Collector config
- Create values.schema.json for Helm value validation
- Create NOTES.txt with post-install instructions and examples

* feat(06-04): add health probes, Kafka gating, and deployment configurability

- Add terminationGracePeriodSeconds to deployment from values
- Use configurable podSecurityContext and containerSecurityContext
- Add extraEnv support to deployment env section
- Gate Kafka environment variables behind kafka.enabled condition
- Add ConfigMap checksum annotation to OTel Collector for auto-rollout
- Add liveness and readiness HTTP probes on port 13133 to OTel Collector
- Add health port (13133) to OTel Collector container ports

* test(06-02): add webhook update tests, boundary tests, and v2 fixtures

- Add TestValidateUpdate_Invalid for webhook update validation (secret mounts, volumes, OTel)
- Add TestValidateCreate_LongCRName boundary test for CR name length validation
- Add validateCRName function to enforce 63-char limit on generated resource names
- Create locusttest-with-scheduling.yaml E2E fixture with affinity and tolerations
- Add LoadV2Fixture/MustLoadV2Fixture functions for v2 test fixtures
- Create locusttest_v2_full.json comprehensive v2 fixture

* test(06-01): add comprehensive status transition tests and fix Phase type in tests

- Add TestUpdateStatusFromJobs_FullStateMachine covering all phase transitions
- Add TestUpdateStatusFromJobs_PhaseTransitionEvents verifying event emission
- Add TestUpdateStatusFromJobs_NoEventOnSamePhase for idempotency verification
- Add TestUpdateStatusFromJobs_ObservedGeneration verifying generation tracking
- Add TestDerivePhaseFromJob_TypeSafety verifying typed Phase return
- Add TestUpdateStatusFromJobs_WorkersConnectedCondition for worker tracking
- Fix existing tests to use typed Phase (string conversion for comparisons)
- Update CRD manifests for ObservedGeneration field

* test(06-06): remove v1 webhook unused logger and add OTel integration tests

- Remove unused logf import and blank identifier assignment in api/v1/locusttest_webhook.go
- Add comprehensive OTel integration tests in internal/resources/env_test.go
  - TestBuildEnvVars_WithOTel_Enabled: verifies OTel env vars correctly merged with Kafka and user vars
  - TestBuildEnvVars_WithOTel_Disabled: verifies no OTel vars when disabled
  - TestBuildEnvVars_WithOTel_NoObservabilityConfig: verifies no OTel vars when config absent
  - TestBuildEnvVars_OTel_HTTPProtocol: verifies HTTP protocol support
  - TestBuildEnvVars_OTel_EnvVarOrder: verifies correct precedence (Kafka, OTel, user)

Note: PDB template was already implemented in a previous phase

* feat(06-03): add gosec, errorlint, exhaustive linters and wrap reconciler errors

- Add gosec, errorlint, exhaustive linters to .golangci.yml
- Configure errorlint with all checks enabled
- Configure exhaustive with strict mode (default-signifies-exhaustive: false)
- Add error wrapping with %w to all reconciler error returns
- Add PhasePending case to switch statement (exhaustive requirement)
- Fix gofmt formatting in config.go
- Add justified nolint:gosec directives for test code with known safe inputs

* refactor(06-03): move BuildKafkaEnvVars to env.go and replace custom string helpers

- Move BuildKafkaEnvVars function from job.go to env.go for logical grouping
- Move strconv import from job.go to env.go (only used by BuildKafkaEnvVars)
- Replace custom contains()/containsHelper() with stdlib strings.Contains()
- Add strings import to controller unit test

* refactor(06-05): extract main.go helpers and update leader election ID

- Extract parseFlags helper for command-line flag parsing
- Extract configureTLS helper for TLS options
- Extract setupWebhookServer helper for webhook configuration
- Extract setupMetricsServer helper for metrics configuration
- Extract setupControllers helper for controller registration
- Extract setupHealthChecks helper for health probes
- Remove nolint:gocyclo directive (no longer needed)
- Change leader election ID to locust-k8s-operator.locust.io (CORE-27)

* chore(06-05): update project metadata, dockerfile, build files, and RBAC

- Update PROJECT file to track v2 API (CORE-21)
- Pin Dockerfile golang version to 1.24.0 (CORE-23)
- Add comprehensive .dockerignore for build optimization (CORE-18)
- Add cover.out, coverage.out, venv/ to .gitignore (CORE-20)
- Remove create/delete verbs from locusttests RBAC (CORE-28)
- Remove finalizers permission from RBAC (CORE-29)
- Update kubebuilder RBAC markers in controller
- Add map ordering comment to buildNodeSelector (CORE-15)

* chore(07-01): remove legacy Java operator code and build files

- Remove all Java source code (src/)
- Remove Gradle build system (build.gradle, gradle/, gradlew, settings.gradle)
- Remove Java tooling config (lombok.config, micronaut-cli.yml)
- Remove legacy Kubernetes manifests (kube/)
- Remove old integration scripts (scripts/)
- Remove legacy planning directories (issue-analysis/, v2/)
- Remove build artifacts (build/, .gradle/)

Archive branch archive/java-operator-v1 preserves Java code from master.
Go operator in locust-k8s-operator-go/ remains intact.

* refactor(07-02): move Go operator code from subdirectory to repository root

Move all Go operator files from locust-k8s-operator-go/ to repository root using git mv to preserve file history. The Go operator is now the primary codebase and belongs at the root level.

Files moved:
- Go module (go.mod, go.sum)
- Source code (cmd/, api/, internal/, test/)
- Build system (Makefile, Dockerfile, PROJECT)
- Config (config/, hack/, .golangci.yml, .dockerignore)
- DevContainer (.devcontainer/)

Files removed:
- locust-k8s-operator-go/README.md (root README is canonical)
- locust-k8s-operator-go/.gitignore (will merge in Plan 03)
- Build artifacts (bin/, venv/, *.out)

* chore(07-03): update CI/CD workflows, Makefile, PROJECT, and gitignore for root-level structure

- Remove working-directory from build-go job in ci.yaml
- Update go-version-file paths from locust-k8s-operator-go/go.mod to go.mod
- Remove working-directory from golangci-lint and ko build steps
- Update artifact paths to reference root-level cover.out
- Update E2E workflow path filters to trigger on root-level Go directories
- Change Makefile IMAGE_TAG_BASE to io/locust-k8s-operator (remove -go)
- Change Makefile KIND_CLUSTER to locust-k8s-operator-test-e2e
- Update buildx builder name to locust-k8s-operator-builder
- Change PROJECT projectName to locust-k8s-operator
- Replace root .gitignore with merged Go and project entries

* chore(07-03): update all config manifests and E2E tests to use locust-k8s-operator

- Change namespace from locust-k8s-operator-go-system to locust-k8s-operator-system
- Update namePrefix in config/default/kustomization.yaml
- Update all resource names, labels, and references across config/ manifests
- Update E2E test constants: namespace, serviceAccountName, metricsServiceName
- Update projectImage reference in e2e_suite_test.go
- Update clusterrole name in metrics test
- Change CSV filename reference in config/manifests/kustomization.yaml
- All config and test files now use standard locust-k8s-operator naming

* docs(07-04): add MIGRATION.md, SECURITY.md, .editorconfig, update README

- Create MIGRATION.md documenting Java-to-Go transition with archive branch reference
- Create SECURITY.md with vulnerability reporting policy
- Create .editorconfig with Go and YAML formatting standards
- Update README.md to reflect Go operator (remove Java usage instructions)
- Update .pre-commit-config.yaml to remove Java-specific hooks (gradle-check, gradle-spotless)
- Update .cz.yaml to remove build.gradle reference and set version to 2.0.0

* chore: fix struct alignment and update test CRD with observedGeneration

* chore(07-04): remove DevContainer configuration files

- Remove .devcontainer/devcontainer.json
- Remove .devcontainer/post-install.sh

* chore(06-05): optimize requeue timing, preallocate slice, and improve code clarity

- Change Requeue: true to RequeueAfter: time.Second for resource deletion detection
- Update unit tests to assert RequeueAfter > 0 instead of Requeue == true
- Preallocate formatErrors slice with capacity for better performance
- Replace if-else chain with switch statement in buildResourceRequirementsWithPrecedence
- Move github.com/go-logr/logr from indirect to direct dependency

* chore(06-05): fix line length formatting in main.go flag parsing and metrics setup

- Wrap metrics-cert-name flag description to comply with line length limits
- Add nolint:lll directive to setupMetricsServer function signature

* test(06-05): add nolint directive for intentional type assertion in status test

* chore: exclude goconst linter from test files in golangci-lint config

* chore: standardize lib volume name and improve status/flag handling

- Rename lib volume from "lib" to "locust-lib" for clarity
- Add cluster_name to E2E workflow Kind cluster creation
- Fix StartTime to only set on first creation (preserve across reconciles)
- Add Ready=false status when test fails
- Add WorkersConnected=false condition when workers missing
- Fix flag conflict detection to match exact flags (not just prefixes)
- Update all tests to reflect locust-lib volume name

* test(e2e): fix testdata directory path to use relative path

- Change testdata path from "test/e2e/testdata" to "testdata" in all E2E tests
- Update locusttest_e2e_test.go, otel_e2e_test.go, v1_compatibility_test.go, and validation_e2e_test.go

* fix: prevent autoquit timeout=0 override and protect operator-critical labels

- Change autoquit timeout validation from > 0 to >= 0 to allow explicit 0 value
- Add label protection to prevent user labels from overriding LabelPodName and LabelManagedBy
- Ensures operator-critical labels cannot be accidentally overwritten by user-defined labels
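
One way to implement the label protection is to skip protected keys while copying user labels and then set the operator values. In this sketch the key used for `LabelManagedBy` and both label values are assumptions. `performance-test-pod-name` is the label key mentioned elsewhere in this log.

```go
package main

import "fmt"

const (
	LabelPodName   = "performance-test-pod-name"
	LabelManagedBy = "managed-by" // assumed key, for illustration only
)

// buildLabels copies user labels but never lets them override the two
// operator-critical labels, which are always set by the operator.
func buildLabels(userLabels map[string]string, podName string) map[string]string {
	labels := make(map[string]string, len(userLabels)+2)
	for k, v := range userLabels {
		if k == LabelPodName || k == LabelManagedBy {
			continue // protected key: ignore the user-supplied value
		}
		labels[k] = v
	}
	labels[LabelPodName] = podName
	labels[LabelManagedBy] = "locust-k8s-operator" // assumed value
	return labels
}

func main() {
	user := map[string]string{"team": "perf", LabelPodName: "hijacked"}
	fmt.Println(buildLabels(user, "demo-master"))
}
```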

* 1 pr review fixes (#276)

* fix(08-01): update NOTES.txt to v2 API and fix sample CR command format

- Replace v1alpha1 example in NOTES.txt with valid v2 example
- Remove non-existent fields (masterCommand, workerCommand, workers)
- Fix command format in all v2 sample CRs (remove 'locust' prefix)
- Add port-forward instructions for web UI access to NOTES.txt

* docs(08-01): add web UI access section to getting started guide

- Add Step 5.2 explaining how to access Locust web UI
- Include kubectl port-forward instructions and examples
- Add tip about autostart mode and web UI monitoring

* docs(09-03): document immutable test model with delete+recreate guidance

- Expanded Immutable Tests section in how_does_it_work.md
- Added SpecDrifted condition documentation
- Added kubectl delete and kubectl apply example commands
- Added warning admonition in getting_started.md before first CR deployment
- Cross-referenced how_does_it_work.md for detailed explanation

* feat(09-01): add SpecDrifted condition and re-fetch before status updates

- Add ConditionTypeSpecDrifted and ReasonSpecChangeIgnored constants
- Re-fetch CR before all Status().Update() calls to prevent conflicts
- Set SpecDrifted condition when Generation > 1 on non-Pending tests
- Replace _ = oldPhase with phase transition logging
- Add re-fetch logic in reconcileStatus for all 3 recovery paths

* fix(09-02): set production logger default and add env var parse warnings

- Change Development: false in zap.Options (was true)
- Add log.Printf warnings in getEnvBool for unparseable values
- Add log.Printf warnings in getEnvInt32 for unparseable values
- Add log.Printf warnings in getEnvInt32Ptr for unparseable values

* test(09-01): add tests for SpecDrifted condition and phase logging

- Add TestUpdateStatusFromJobs_SpecDriftedCondition: verifies SpecDrifted is set when Generation > 1
- Add TestUpdateStatusFromJobs_NoSpecDriftedOnGeneration1: verifies no SpecDrifted when Generation == 1
- Existing TestUpdateStatusFromJobs_PhaseTransitionEvents validates phase transition logging implicitly

* test(09-02): add tests verifying warning logs for invalid env var values

- Add captureLogOutput helper to capture log.Printf output
- Add TestGetEnvBool_WarnsOnInvalidValue
- Add TestGetEnvInt32_WarnsOnInvalidValue
- Add TestGetEnvInt32Ptr_WarnsOnInvalidValue

* feat(10-02): set HA defaults and increase memory limit to 256Mi

- Changed replicaCount from 1 to 2 (HA default with leader election)
- Enabled PodDisruptionBudget by default (ensures webhook availability)
- Increased manager memory limit from 128Mi to 256Mi in both Helm and kustomize
- Updated comments to reflect HA default and controller-runtime justification

* fix(10-01): remove fragile join-then-split pattern from command builder

- Change BuildMasterCommand and BuildWorkerCommand to return cmdParts directly
- Split command seed using strings.Fields at append time (not at return)
- Eliminates join-then-split round-trip that would break args with spaces
- All existing command_test.go tests pass

* feat(10-02): sync RBAC permissions between kustomize and Helm

- Added configmaps and secrets read permissions to kustomize role.yaml
- Removed unused finalizers permissions from Helm (operator doesn't use finalizers)
- Removed create/delete verbs for locusttests in Helm (only users create CRs)
- Updated Helm RBAC header comment to reflect accurate permissions
- Both deployment methods now have identical RBAC rules

* feat(10-03): add conditional CRD conversion webhook strategy

- Move CRD from crds/ to templates/ for template rendering
- Add conversion.strategy: Webhook when webhook.enabled=true
- Add cert-manager CA injection annotation when certManager enabled
- CRD now supports multi-version conversion via /convert endpoint

* feat(10-01): add SecurityContext to Locust test pods

- Add PodSecurityContext with runAsNonRoot: true and seccompProfile: RuntimeDefault
- Apply to both master and worker pods in buildJob function
- Add boolPtr helper function for SecurityContext pointer fields
- Add TestBuildMasterJob_HasSecurityContext and TestBuildWorkerJob_HasSecurityContext
- No readOnlyRootFilesystem per decision (Locust needs writable pip cache)
- All 171 internal/resources tests pass

* docs(11-01): fix Go version references and Helm docs accuracy

- Update Go 1.23+ to Go 1.24+ in local-development.md, contribute.md, integration-testing.md
- Update operator memory limit from 128Mi to 256Mi in helm_deploy.md
- Remove phantom crd.install parameter from helm_deploy.md

* docs(11-01): delete non-functional v1 sample CR stub

- Delete config/samples/locust_v1_locusttest.yaml (non-functional stub with TODO comment)
- Update kustomization.yaml to reference locust_v2_locusttest.yaml
- Ensures kustomize build produces valid sample CR

* docs(11-02): rewrite CONTRIBUTING.md for Go/Make/Kind workflow

- Replace stale Java/Gradle/Jib references with Go tooling
- Update local testing from Minikube to Kind
- Add Go 1.24+ prerequisites and make targets
- Document make build/test/lint/manifests/ci commands
- Keep original sections (bug reporting, preface, PR process) unchanged

* docs(11-02): add webhook warning and CRD upgrade step to migration guide

- Add danger admonition about webhook requirement for v1 CR compatibility
- Update upgrade command to include --set webhook.enabled=true
- Add CRD Upgrade note explaining automatic Helm CRD handling
- Update operator memory limit to 256Mi (from 128Mi)

* chore(rbac): remove unused configmaps and secrets permissions from ClusterRole

Remove read-only permissions for configmaps and secrets from manager-role. These permissions are not required by the operator's current functionality.

* chore: improve Dockerfile build, Helm chart UX, and add logging utilities

- Change Dockerfile to copy entire cmd/ directory instead of single main.go file
- Update go build command to use ./cmd/ package path for better compatibility
- Add chart icon URL to Chart.yaml for Helm repository display
- Simplify NOTES.txt with concise emoji-based formatting (115 lines → 35 lines)
- Set fullnameOverride default to "locust-operator" for predictable resource names
- Add finalizers subresource permissions

* fix(10-04): add webhook cert polling and mark secret optional in Helm

- Add polling loop in main.go to wait for cert-manager to issue webhook certificates
- Poll cert/key files every 1s until both exist before starting certwatcher
- Mark webhook-certs secret as optional: true in Helm deployment
- Prevents operator crash when webhook enabled but certs not yet issued
- Fixes race condition during initial deployment with cert-manager

* feat(12-01): replace .Owns with .Watches for pod health monitoring

- Remove .Owns(&corev1.Pod{}) which never triggered (pods owned by Jobs)
- Add .Watches() with custom mapPodToLocustTest function
- mapPodToLocustTest traverses Pod→Job→LocustTest owner reference chain
- Add required imports: handler, reconcile, types
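
The owner-chain traversal can be illustrated without Kubernetes dependencies. In the sketch below, `Object`, `OwnerRef`, and `JobLookup` are hypothetical stand-ins for the Pod/Job objects, their owner references, and a client lookup; the real function works with controller-runtime types and returns reconcile requests, which this example does not attempt to reproduce.

```go
package main

import "fmt"

// OwnerRef is a simplified stand-in for an owner reference.
type OwnerRef struct{ Kind, Name string }

// Object is a simplified stand-in for a Pod or Job.
type Object struct {
	Namespace, Name string
	Owners          []OwnerRef
}

// JobLookup is a stand-in for fetching a Job; ok is false if it is gone.
type JobLookup func(namespace, name string) (job Object, ok bool)

// mapPodToLocustTest walks Pod -> Job -> LocustTest and returns the
// namespaced name of the owning LocustTest, if there is one.
func mapPodToLocustTest(pod Object, getJob JobLookup) (string, bool) {
	for _, ref := range pod.Owners {
		if ref.Kind != "Job" {
			continue // pod owned by something other than a Job
		}
		job, ok := getJob(pod.Namespace, ref.Name)
		if !ok {
			continue // Job already deleted: nothing to enqueue
		}
		for _, jobRef := range job.Owners {
			if jobRef.Kind == "LocustTest" {
				return pod.Namespace + "/" + jobRef.Name, true
			}
		}
	}
	return "", false
}

func main() {
	jobs := map[string]Object{
		"load/demo-master": {
			Namespace: "load", Name: "demo-master",
			Owners: []OwnerRef{{Kind: "LocustTest", Name: "demo"}},
		},
	}
	getJob := func(ns, name string) (Object, bool) {
		j, ok := jobs[ns+"/"+name]
		return j, ok
	}
	pod := Object{
		Namespace: "load", Name: "demo-master-abc12",
		Owners: []OwnerRef{{Kind: "Job", Name: "demo-master"}},
	}
	fmt.Println(mapPodToLocustTest(pod, getJob))
}
```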

* fix(12-01): fix broken tests for updateStatusFromJobs signature change

- Add healthyPodStatus() helper for tests not testing pod health
- Update TestInitializeStatus to expect 4 conditions (including PodsHealthy)
- Fix all updateStatusFromJobs calls to include PodHealthStatus parameter
- All existing tests now pass

* test(12-02): add unit tests for mapPodToLocustTest mapping function

- Test valid Pod→Job→LocustTest owner chain
- Test pod without owner references
- Test pod owned by non-Job resource
- Test pod owned by Job without LocustTest owner
- Test pod owned by deleted Job (graceful failure)
- Test mapping in different namespace
- Test multiple owner references

All 7 test cases pass

* test(12-02): add unit tests for pod health integration in status updates

- Test unhealthy pods transition phase to Failed
- Test terminal states (Succeeded) are not overridden by pod failures
- Test ConfigError, ImagePullError, CrashLoop reasons surface correctly
- Test grace period keeps phase Running with PodsStarting reason
- Verify PodsHealthy condition set correctly in all scenarios
- Verify both TestFailed and PodFailure events emitted on pod health failure

All 5 test cases pass, full suite passes (21 integration specs)

* feat: pod health

* test(13-01): add envtest integration test for pod health monitoring

- New "Pod Health Monitoring" Describe block in integration_test.go
- Test creates LocustTest CR, waits for Job creation, then simulates pod with CrashLoopBackOff status
- Verifies controller watches pods and updates PodsHealthy condition
- Test validates grace period logic: condition shows PodsStarting (True) during 2-minute grace period
- Proves full reconcile loop: pod state change -> pod watch triggers reconcile -> status updated
- All 22 integration tests pass (21 existing + 1 new)

* feat(13-02): add detailed logging and graceful retry for external resource deletion recovery

- Add structured logging before/after status updates in all 3 recovery paths (Service, master Job, worker Job)
- Log current phase, generation, observedGeneration before retry attempts
- Log re-fetch details (resourceVersion, phase) at V(1) verbosity level
- Log errors during re-fetch and status update with resource context
- Change exhausted retry behavior from error return to 5s requeue
- Add success logging after

* fix: resolve CI pipeline failures for linter and go mod checks

Remove unused functions causing golangci-lint failures and fix go.mod dependency classification to pass E2E tests.

Changes:
- Remove unused categorizePodFailures function and sort import from pod_health.go
- Remove unused boolPtr function from job.go
- Move go.uber.org/zap from indirect to direct dependencies in go.mod
- Update generated webhook patch with metrics port configuration

Fixes linter errors and go mod tidy check in CI pipeline.

* fix(e2e): stabilize metrics endpoint test and cert-manager setup

Fixes E2E test failures related to metrics endpoint verification and
cert-manager webhook initialization.

Changes:
- Fixed metrics server log message check to match actual controller-runtime
  output ("Starting metrics server" instead of "Serving metrics server")
- Added idempotent ClusterRoleBinding creation with cleanup before/after
- Added robust cert-manager webhook CA bundle injection wait to prevent
  "x509: certificate signed by unknown authority" errors during deployment
- Added cleanup of test resources (curl-metrics pod, ClusterRoleBinding)

The metrics port (8443) declaration was already added to the webhook patch
in a previous commit, making the port exposure explicit.

All 26 E2E tests now pass successfully.

* fix: fix Makefile

* chore(14-01): add mermaid diagram support to mkdocs.yml

- Configure pymdownx.superfences with custom_fences for mermaid rendering
- Enables state diagrams and flow charts in documentation

* docs(14-01): expand status field documentation with lifecycle and CI/CD integration

- Add observedGeneration to status fields table
- Add mermaid state diagram showing phase transitions
- Document all 5 condition types with all status/reason combinations
- Add comprehensive kubectl status checking commands
- Include CI/CD integration examples (GitHub Actions, GitLab CI, shell script)
- Document 2-minute pod startup grace period behavior
- Add notes about connectedWorkers approximation and SpecDrifted condition

* docs(14-02): add comprehensive FAQ documentation

- 18 questions across 6 sections (Test Lifecycle, Scaling, Debugging, Migration, Configuration, Observability)
- Covers immutability pattern with delete-and-recreate examples
- Debugging guidance for common issues (Pending, Failed, worker connectivity, ConfigMap errors)
- kubectl command examples for all troubleshooting scenarios
- Cross-references to how_does_it_work.md, api_reference.md, advanced_topics.md, migration.md

* docs(14-02): add security best practices guide

- RBAC permissions table matching actual Helm template (11 resource types documented)
- Namespace-scoped vs cluster-scoped guidance
- User RBAC examples (test creator and viewer roles)
- Three secret injection approaches with examples
- Secret rotation process documentation
- External Secrets Operator integration example (AWS Secrets Manager)
- Pod security defaults (restricted profile)
- NetworkPolicy example for master-worker isolation
- Image security guidance (private registries, scanning)

* docs(14-02): add FAQ and Security to MkDocs navigation

- Added Security entry after Metrics & Dashboards (operations area)
- Added FAQ entry after API Reference (user reference area)
- Preserved mermaid config from Plan 14-01
- Navigation structure remains valid

* feat(15-01): rewrite architecture documentation with Mermaid diagrams

- Add comprehensive v2 Go operator architecture explanation
- Include Mermaid flowchart showing resource creation and ownership
- Add state diagram for phase-based reconciliation lifecycle
- Document validation webhooks and what they catch
- Explain pod health monitoring with 2-minute grace period
- Cover leader election and HA behavior for production deployments
- Preserve existing Key Design Decisions and Demo sections
- Conversational tone for platform engineers and K8s admins

* feat(15-01): add production-ready LocustTest example CR

- Demonstrate resource limits for master (500m/512Mi req, 1000m/1Gi limit) and worker (250m/256Mi req, 500m/512Mi limit)
- Include pod anti-affinity to spread 10 workers across nodes for HA
- Enable autoquit with 60s timeout for clean CI/CD integration
- Rich inline comments explaining WHY each setting matters
- Production-appropriate labels and annotations for observability
- Uses specific image version (locustio/locust:2.43.1) for reproducibility
- Excludes security-sensitive examples (secrets) and optional features (OTel)

* feat(16-01): restructure navigation and add SEO metadata

- Group navigation into logical sections (Getting Started, User Guide, Architecture, Operations, Migration, Contributing)
- Update site_description to SEO-optimized production-ready description
- Add site_keywords field with target search terms
- Add schema-org.js to extra_javascript for structured data

* feat(16-02): add comparison page for operator vs alternatives

- Feature comparison table covering operator, Helm chart, and manual deployment
- Decision guide with admonitions for each approach
- Migration paths from Manual -> Operator, Helm -> Operator, v1 -> v2
- Links to getting_started.md and migration.md
- SEO-optimized front matter for comparison search queries

* feat(16-01): add Schema.org JSON-LD structured data script

- Create schema-org.js for SEO rich snippets
- Inject SoftwareApplication JSON-LD on DOMContentLoaded
- Include author, keywords, pricing (free), and repository metadata
- Enable enhanced search engine results with structured data

* feat(16-03): add comparison.md to mkdocs.yml navigation

- Add Comparison entry under Architecture section
- Makes comparison.md discoverable through logical navigation browsing

* docs: reorganize Demo section to top of how_does_it_work.md

Move Demo section from bottom to top for better visibility. No content changes.

* Docs/improve (#277)

* feat(17-02): add typography CSS custom properties and readability rules

- Add CSS custom properties for consistent typography scale
- Set content max-width to 75ch for readability (article only)
- Apply 1.7 line-height to body text, 1.5 to code, 1.3 to headings
- Configure code blocks with 14px font and 1.25rem padding
- Remove old conflicting h2/h3/p rules (replaced by custom properties)

* feat(17-01): enable instant navigation, prefetch, progress, and GitHub buttons

- Add navigation.instant for SPA-like behavior
- Add navigation.instant.prefetch for predictive loading on hover
- Add navigation.instant.progress for progress bar on slow connections
- Add content.action.edit and content.action.view for GitHub buttons
- Add site_url (required for instant navigation)
- Add edit_uri pointing to docs/improve branch

* feat(17-03): enhance comparison page with four alternatives and benchmarks

- Add Official Locust Operator (locustio/k8s-operator) to feature matrix
- Add k6 Operator (Grafana) as alternative testing tool
- Add Manual Deployment as fourth option
- Create "Why Choose This Operator" section with battle-tested data (65 stars, production since 2022, 20+ docs)
- Add Performance Benchmarks section with verified data (image: 75 vs 325 MB, memory: 64 vs 256 MB, startup: <1s vs ~60s)
- Expand Decision Guide to cover all four alternatives
- Use objective facts only throughout, acknowledge official operator respectfully
- Update metadata description to mention all alternatives

* feat(17-01): configure enhanced syntax highlighting and minify plugin

- Update pymdownx.highlight with anchor_linenums, line_spans, and pygments_lang_class
- Add mkdocs-minify-plugin with HTML/JS/CSS minification
- Configure htmlmin_opts for comment and space removal
- Enable cache_safe for cache busting
- Remove invalid site_keywords field (blocking build --strict)

* docs(18-02): add documentation URL changes section to migration guide

- Add "Documentation Structure Changes" section with Divio framework intro
- Include URL mapping table for all documentation pages
- Document current 2-level navigation hierarchy compliance
- Add "Automatic Redirects" notice for user reassurance
- Provide "What You Need to Do" instructions for bookmarks

* feat(18-01): add mkdocs-redirects plugin with Phase 19 redirect_maps

- Add redirects plugin between privacy and tags (before minify for correct processing order)
- Configure empty redirect_maps with Phase 19 activation comments
- Document mkdocs-redirects and mkdocs-minify-plugin as pip dependencies
- Build succeeds with redirects plugin loaded

* feat(18-01): add mkdocs-redirects and mkdocs-minify-plugin to CI workflows

- Update ci.yaml to install both plugins in docs-test job
- Update release.yaml to install both plugins in docs-release job
- Update docs-preview.yml to install both plugins in build-preview job
- CI builds will now succeed with redirects and minify plugins configured in mkdocs.yml

* feat(19-01): restructure navigation to Divio taxonomy

- Rename "Introduction" to "Home" (standard docs convention)
- Move comparison.md from Architecture to Getting Started (evaluator path)
- Promote api_reference.md from User Guide to top-level Reference section
- Rename "User Guide" to "How-To Guides" (Divio taxonomy)
- Rename "Architecture" to "Explanation" (Divio taxonomy)
- Move security.md from Operations to Explanation
- Move metrics_and_dashboards.md from Operations to Reference
- Move faq.md from User Guide to Reference
- Dissolve Operations section (content distributed)
- Shorten "Contributing & Development" to "Contributing"
- Maintain maximum 2-level nav depth

* feat(19-02): redesign homepage with hub model and persona entry points

- Add "Find Your Path" section with 3 persona-based cards
- Evaluator: Compare alternatives in <30 seconds (comparison.md)
- New User: Quick start in 5 minutes (getting_started.md)
- Power User: 1-click API access (api_reference.md)
- Update hero buttons to be more persona-targeted
- Preserve all existing sections (v2.0 features, capabilities, teams)

* docs(19-01): update migration guide with finalized Divio navigation

- Change admonition from "Upcoming" to "Navigation Restructured" (success)
- Update Current Navigation table to reflect new Divio structure
- Replace "Planned URL Changes" with "Navigation Changes" heading
- Change all "may move" conditional language to definitive statements
- Document all pages kept their URLs (nav-only changes)
- Update "Automatic Redirects" note to clarify no URLs changed
- Simplify "What You Need to Do" section (no URL changes)

* feat(19-03): enable navigation tabs and path, remove GitHub edit buttons

- Add navigation.tabs for top-level section tabs
- Add navigation.path for breadcrumb trail
- Remove content.action.edit and content.action.view (no edit_uri)
- Remove edit_uri from config (docs/improve branch no longer needed)
- Remove "Documentation Structure Changes" section from migration guide (nav finalized)

* feat(19-04): remove content width limit and simplify comparison page

- Remove --content-max-width CSS variable and .md-content__inner max-width rule
- Replace "65 GitHub stars" with "Battle-Tested" in comparison page
- Simplify "20+ documentation pages" to "Comprehensive documentation"
- Remove "Choose Official Locust Operator when..." decision guide section

* feat(20-01): create Quick Start and Tutorial landing pages

- Add 5-minute Quick Start tutorial with copy-paste ready commands
- Add Tutorial landing page with 3-tutorial learning path
- All commands use real values (httpbin.org, locustio/locust:2.20.0)
- Time commitments stated in titles

* feat(20-02): create CI/CD Integration tutorial (15 minutes)

- Complete GitHub Actions workflow with scheduled and manual triggers
- Unique test naming with timestamp for traceability
- Idempotent ConfigMap creation with dry-run + apply pattern
- Result collection and artifact upload
- Performance regression detection with configurable thresholds
- GitLab CI alternative in collapsed section
- Links to Tutorial 1 and Tutorial 3

* feat(20-01): create Tutorial 1 - Your First Load Test

- Add 10-minute tutorial with realistic e-commerce scenario
- 100 users across 5 workers, 5-minute test duration
- Inline code explanations and task weighting
- Step-by-step progression from Quick Start
- Links to CI/CD tutorial and how-to guides

* feat(20-04): create Observability domain how-to guides

- Configure OpenTelemetry integration: native OTel support with troubleshooting
- Monitor test status and health: phase tracking, conditions, CI/CD integration

* feat(20-02): create Production Deployment tutorial (20 minutes)

- Enhanced test script with authentication, weighted tasks, realistic behavior
- Resource sizing guidelines (master: 512Mi, workers: 256Mi, 20 replicas for 1000 users)
- Node affinity and tolerations for dedicated nodes
- OpenTelemetry native integration configuration
- Complete production-ready CR combining all features
- Monitoring and verification steps with status conditions
- Links to how-to guides and API reference

* feat(20-03): create How-To landing page and Configuration domain guides

- Create docs/how-to-guides/index.md with 4 domain sections
- Create 5 Configuration guides:
  - configure-resources.md: Resource limits, requests, per-CR config
  - use-private-registry.md: Image pull secrets and pull policies
  - mount-volumes.md: PVC, ConfigMap, Secret, EmptyDir volumes
  - configure-kafka.md: Two-level Kafka config (operator + per-test)
  - configure-ttl.md: Automatic job cleanup with TTL

All guides use:
- v2 API only (no v1 examples)
- Imperative, verb-first titles
- Complete end-to-end workflows with verification steps
- Inline code comments
- Realistic examples with actual values
- Cross-links to related guides

* feat(20-04): create Security domain how-to guides

- Inject secrets and configuration: 4 methods (ConfigMap, Secret, individual vars, file mounts) with verification
- Configure pod security settings: default contexts, RBAC best practices, network policies

* feat(20-03): create Scaling domain guides

- Create 4 Scaling guides:
  - scale-workers.md: Worker sizing formula, resource implications
  - use-node-affinity.md: Target specific nodes with labels and affinity
  - configure-tolerations.md: Schedule on tainted nodes with tolerations
  - use-node-selector.md: Simple label matching for node selection

All guides include:
- v2 API only
- Imperative, verb-first titles
- Complete end-to-end workflows
- Verification steps
- Troubleshooting sections
- Cross-links to related guides

* feat(20-05): transform features.md and advanced_topics.md for Divio taxonomy

- Transform features.md into scannable overview with links to how-to guides
- Remove all links to advanced_topics.md from feature cards
- Add subtitle: "Everything the Locust Kubernetes Operator can do. Click any feature to learn how."
- Replace advanced_topics.md with redirect page mapping old sections to new locations
- Redirect table includes 11 how-to guides across 4 domains
- Preserves URL (advanced_topics.md stays) so bookmarks still work

* feat(20-05): update mkdocs.yml with complete Divio taxonomy navigation

- Add Tutorials section with 3 tutorials (time commitments in titles)
- Add How-To Guides section with 4 domain subsections (13 total guides)
  - Configuration: 5 guides (resources, private registry, volumes, Kafka, TTL)
  - Observability: 2 guides (OpenTelemetry, test status)
  - Scaling: 4 guides (workers, node affinity, tolerations, node selector)
  - Security: 2 guides (inject secrets, pod security)
- Reorganize Reference section to include Features Overview
- Move Advanced Topics to Explanation section
- Add redirect from getting_started.md to getting_started/index.md
- Complete integration of all Phase 20 content into site navigation

* docs(20-06): fix How-To Guides landing page links for Observability and Security

- Replace advanced_topics.md link with observability/configure-opentelemetry.md
- Add observability/monitor-test-status.md (previously missing)
- Replace advanced_topics.md link with security/inject-secrets.md
- Add security/configure-pod-security.md (previously missing)

* docs(20-06): fix cross-references in Configuration guides

- configure-ttl.md: point to observability/configure-opentelemetry.md
- use-private-registry.md: point to security/inject-secrets.md
- configure-kafka.md: point to security/inject-secrets.md
- mount-volumes.md: point to security/inject-secrets.md

* docs(20-06): update all internal links to use getting_started/index.md

- Update comparison.md, faq.md, helm_deploy.md, index.md links
- Update metrics_and_dashboards.md and migration.md links
- Replace advanced_topics.md references with domain-specific how-to guides
- Remove deprecated getting_started.md file
- Fix API reference link in getting_started/index.md

* docs(20-06): simplify OpenTelemetry feature title on landing page

- Remove "Native" prefix from OpenTelemetry feature title for consistency

* chore(20-06): remove mkdocs-redirects plugin from CI workflows and documentation

- Remove mkdocs-redirects from ci.yaml, docs-preview.yml, and release.yaml
- Update mkdocs.yml comment to reflect removed dependency

* docs(20-06): remove GitLab CI examples from tutorials and guides

- Remove GitLab CI examples from api_reference.md and monitor-test-status.md
- Remove "Alternative: GitLab CI" section from ci-cd-integration.md tutorial
- Update tutorial description and tags to focus on GitHub Actions only

* feat(20-06): update documentation with corrected labels, endpoints, and security contexts

- Replace deprecated `locust.io/*` labels with `performance-test-*` labels throughout guides
- Add URL schemes to OpenTelemetry endpoints (http:// or https://) for SDK compatibility
- Update Kafka credentials to use secretName/key pattern instead of direct values
- Change default pod security context from "restricted" to "baseline" profile
- Remove `runAsNonRoot: true` from default security context documentation

* docs(helm): clarify webhook configuration as optional with use case

- Add "(optional)" to Webhook Configuration heading
- Document webhook purpose: conversion between old and new CRDs
- Preserve existing webhook parameter table

* docs(how-to-guides): fix technical inaccuracies and improve clarity across configuration and scaling guides

- Fix column header from EXPECTED to WORKERS in getting started output
- Add deprecation warning for operator-level Kafka configuration
- Document KAFKA_SASL_JAAS_CONFIG environment variable and kafka-python parameter naming
- Add ephemeral storage to resource configuration examples
- Document 3-level resource precedence chain (CR > role-specific > unified)
- Update worker memory recommendation
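
The 3-level precedence chain mentioned above (CR > role-specific > unified) amounts to a first-non-empty lookup. The sketch below assumes the fallback is applied per value and uses plain strings; `resolveResource` is a hypothetical helper, and the operator's actual granularity and types are not shown in this PR.

```go
package main

import "fmt"

// resolveResource returns the first non-empty value in precedence order:
// CR-level setting, then role-specific Helm value, then unified Helm value.
func resolveResource(cr, roleSpecific, unified string) string {
	for _, v := range []string{cr, roleSpecific, unified} {
		if v != "" {
			return v
		}
	}
	return "" // nothing configured at any level
}

func main() {
	// Illustrative memory values only.
	fmt.Println(resolveResource("1Gi", "512Mi", "256Mi")) // prints "1Gi"
	fmt.Println(resolveResource("", "512Mi", "256Mi"))    // prints "512Mi"
	fmt.Println(resolveResource("", "", "256Mi"))         // prints "256Mi"
}
```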