form3tech-oss
diff --git a/.golangci.yml b/.golangci.yml
Lines changed: 4 additions & 1 deletion
diff --git a/docs/CODEBASE_REVIEW.md b/docs/CODEBASE_REVIEW.md
Lines changed: 351 additions & 0 deletions
@@ -82,9 +82,12 @@ linters:
       # Test files - relaxed rules for readability
       - linters: [dupword, lll, unparam, wrapcheck]
         path: _test\.go
+      # cobra.AddTemplateFunc modifies a global map; sync.Once is required
+      - linters: [gochecknoglobals]
+        path: internal/run/help\.go
       # t.Setenv is incompatible with t.Parallel (Go constraint)
       - linters: [paralleltest, tparallel]
-        path: options_settings_test\.go
+        path: options_test\.go
       - linters: [staticcheck]
         path: _test\.go
         text: ST1003
 
@@ -0,0 +1,351 @@
+# F1 Load Testing Library — Codebase Review
+
+**Review date:** March 11, 2025  
+**Scope:** v3 branch, pre-release preparation  
+**Focus:** Improvements suitable for v3 vs deferred
+
+---
+
+## 1. Package Structure
+
+### Current State
+
+| Location | Purpose |
+|----------|---------|
+| `pkg/f1` | Core API: `New`, `AddScenario`, `Run`, `Execute`, root cmd, profiling, completions |
+| `pkg/f1/f1testing` | `T`, `ScenarioFn`, `RunFn`, test lifecycle helpers |
+| `pkg/f1/scenarios` | Scenario registry, `Scenario` struct, `scenarios` CLI command |
+| `internal/run` | Run orchestration, test runner, result, views, scenario logger |
+| `internal/workers` | Pool manager, trigger pool, continuous pool, active scenario |
+| `internal/trigger` | Trigger builders (constant, staged, ramp, gaussian, users, file) |
+| `internal/trigger/api` | API types, iteration worker, distribution |
+| `internal/options` | RunOptions functional options |
+| `internal/log` | slog config, attrs, logger factory |
+| `internal/ui` | Output, printer, messages |
+| `internal/metrics` | Prometheus metrics |
+| `internal/envsettings` | Environment-based settings |
+| `internal/raterun` | Progress runner with adaptive intervals |
+| `internal/progress` | Stats, snapshot aggregation |
+| `internal/xcontext` | `Detach` for teardown context |
+| `internal/xtime` | `NanoTime` for timing |
+| `internal/triggerflags` | Shared flag constants |
+| `benchcmd` | Benchmark binary (main package) |
+
+**Organization:** Clear separation between public API (`pkg/`) and internal implementation. `internal/` is well-scoped; no leakage of internal types into public API.
+
+### Suggested Improvements
+
+| Priority | Issue | Rationale |
+|----------|-------|-----------|
+| ~~**Medium**~~ | ~~`pkg/f1/scenarios` has no tests (0% coverage)~~ Done | Scenario registry is used by all runs; unit tests should cover `AddScenario`, `GetScenario`, `GetScenarioNames` |
+| **Low** | Consider `internal/triggerflags` → `internal/trigger/flags` | Co-locate with trigger package; minor refactor |
+| **Low** | `benchcmd` in repo root | Could move to `cmd/bench` for consistency with Go conventions |
+
+---
+
+## 2. API Design
+
+### Current State
+
+- **F1:** `New(opts ...Option)` with unified configuration: `WithSettings(Settings)`, `WithLogger`, `WithStaticMetrics`, and fine-grained overrides (`WithPrometheusPushGateway`, `WithPrometheusNamespace`, `WithPrometheusLabelID`, `WithLogFilePath`, `WithLogLevel(slog.Level)`, `WithLogFormat(LogFormat)`); `AddScenario(name, fn, ...ScenarioOption)`; `Run(ctx, args) error`; `Execute()`.
+- **Public types:** `Settings`, `PrometheusSettings`, `LoggingSettings`, `LogFormat` (strongly typed, no string-key API). `DefaultSettings()` loads from env vars.
+- **Scenarios:** `WithDescription`, `WithParameter`; `Scenario.RunFn` populated by framework during setup.
+- **f1testing.T:** `NewTWithOptions(scenarioName, ...TOption)`; `WithLogger`, `WithIteration`, `WithVUID`; `Error`/`Fatal`/`Log` with `args ...any` (testing.T compatible).
+- **Options:** All consolidated in `options.go`; types in `settings.go`. Precedence: programmatic options > env vars > defaults. `WithLogger` takes precedence over log level/format options.
+- **Design decisions:** No `Config` struct (bundles unrelated concerns), no `SettingsProvider` (lazy eval adds no benefit). `WithSettings(Settings{})` replaces `WithoutEnvSettings()` — explicit, no order-dependence. Typed logging APIs (`slog.Level`, `LogFormat`) instead of strings.
+
+### Suggested Improvements
+
+| Priority | Issue | Rationale |
+|----------|-------|-----------|
+| **Medium** | `AddScenario` returns `*F1`, but no documented usage chains it into `Run` or `Execute` | Fluent chaining is inconsistent; either document `AddScenario(...).Execute()` or drop the return value |
+| **Low** | `Scenario.RunFn` is populated by framework; not obvious from struct | V3_PLAN §8: Document that `RunFn` is set during setup; consider making it private or renaming to clarify |
+| **Low** | `GetScenarios` returns `*scenarios.Scenarios`; internal type exposed | Consider `ScenarioNames() []string` or keep `GetScenarios` for `scenarios` command |
+| **Deferred** | Scenario as interface | V3_PLAN: optional; current struct is sufficient |
+
+---
+
+## 3. Error Handling
+
+### Current State
+
+- **Wrapping:** `fmt.Errorf("... %w", err)` used consistently for wrapped errors.
+- **Return paths:** `Run(ctx, args) error` returns; `Execute()` exits on error.
+- **User-facing:** `result.Error()` for explicit errors; `result.Failed()` for failure conditions; `"load test failed - see log for details"` when failed but no explicit error.
+- **Result errors:** `errors.Join` used in `Result.Error()` for multiple errors.
+
+### Bug
+
+**File:** `pkg/f1/f1.go`, lines 166–174
+
+```go
+err = rootCmd.ExecuteContext(execCtx)
+profilingErr := f.profiling.stop()
+errs := errors.Join(err, profilingErr)
+
+if errs != nil {
+    return fmt.Errorf("command execution: %w", err)  // BUG: should be errs
+}
+```
+
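+The guard checks the joined `errs`, but the return wraps only `err`. When both `err` and `profilingErr` are non-nil, the profiling error is dropped. When only `profilingErr` is non-nil, `%w` receives a nil `err`, so the message renders as `command execution: %!w(<nil>)` and the profiling error is lost entirely. **Fix:** wrap `errs` instead of `err`.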
+
+### Suggested Improvements
+
+| Priority | Issue | Rationale |
+|----------|-------|-----------|
+| **High** | Fix `err` → `errs` in `f1.go` | Bug: profiling errors are dropped |
+| **Medium** | `run_cmd.go:211`: `errors.New("load test failed - see log for details")` loses context | Consider wrapping with `err` if `result.Error() != nil` |
+| **Low** | `scenario_logger.Open` returns `""` on error but displays ErrorMessage | Silently falls back; consider returning error or documenting behavior |
+| **Low** | `setEnvs`/`unsetEnvs` in file trigger display the error but continue | May be intentional; document or consider failing fast |
+
+---
+
+## 4. Logging
+
+### Current State
+
+- **slog:** All logging uses `log/slog`; handlers support text and JSON.
+- **Structured attrs:** `log.IterationAttr`, `log.VUIDAttr`, `log.ScenarioAttr`, `log.ErrorAttr`, `log.StackTraceAttr`, `log.IterationStatsGroup`.
+- **Levels:** `Error` for failures; `Info` for progress; `Warn` for warnings; `Debug` via `F1_LOG_LEVEL`.
+- **T:** `Error`/`Fatal` → ERROR; `Log`/`Logf` → INFO.
+- **Output:** Interactive vs non-interactive; `Outputable` interface with `Print`/`Log`.
+
+### Suggested Improvements
+
+| Priority | Issue | Rationale |
+|----------|-------|-----------|
+| **Low** | `T.Error`/`T.Fatal` format their arguments with `fmt.Sprintf`/`fmt.Sprintln` and log the result as a single string | slog favors key-value attributes; consider `logger.Error("msg", slog.Any("args", args))` for complex cases |
+| **Low** | No trace-level logging in core | Could add for debugging iteration lifecycle |
+| **Deferred** | Log sampling for high-volume runs | Avoid log spam at high iteration rates |
+
+---
+
+## 5. Testing
+
+### Current State
+
+| Package | Coverage | Notes |
+|---------|----------|-------|
+| `internal/xtime` | 100% | Excellent |
+| `internal/run/views` | 95.7% | Strong |
+| `internal/raterun` | 88.9% | Strong |
+| `pkg/f1/f1testing` | 82.5% | Good |
+| `internal/metrics` | 72.7% | Good |
+| `internal/trigger/api` | 67.1% | Adequate |
+| `internal/trigger/file` | 66.5% | Adequate |
+| `pkg/f1` | 65.0% | Adequate |
+| `internal/run` | 58.7% | Adequate |
+| `internal/trigger/staged` | 47.6% | Low |
+| `internal/trigger/gaussian` | 37.7% | Low |
+
+**0% coverage:** `envsettings`, `gaussian`, `log`, `logutils`, `options`, `progress`, `trigger`, `constant`, `ramp`, `rate`, `users`, `triggerflags`, `ui`, `workers`, `xcontext`, `benchcmd`. ~~`scenarios`~~ — now has unit tests.
+
+**Patterns:** Stage-style tests (`RunTestStage`, `newF1Stage`); `given/when/then`; integration tests with fake Prometheus; signal tests for SIGTERM/SIGINT.
+
+### Suggested Improvements
+
+| Priority | Issue | Rationale |
+|----------|-------|-----------|
+| ~~**Medium**~~ | ~~Add unit tests for `scenarios`~~ Done | Scenario registry is used by all runs |
+| **Medium** | Add unit tests for `trigger/rate`, `trigger/constant` | Low-hanging fruit; parsing logic is critical |
+| **Medium** | Test `workers` package | Pool logic is core; currently only integration tests |
+| **Low** | Test `xcontext.Detach` | Simple but important for teardown |
+| **Low** | Test `log` package | Config, attrs |
+| **Deferred** | `envsettings`, `options` | Low risk; can defer |
+
+---
+
+## 6. Documentation
+
+### Current State
+
+- **doc.go:** `pkg/f1` and `pkg/f1/f1testing` have package docs with examples.
+- **README:** Usage, trigger modes, environment variables, output format.
+- **V3_PLAN.md:** Migration checklist, open issues.
+- **MIGRATION.md:** v2 → v3 migration guide.
+
+### Gaps (from V3_PLAN §11)
+
+| Item | Status |
+|------|--------|
+| Installation instructions | Missing |
+| Prometheus push gateway | Done (README: env vars + programmatic options + precedence) |
+| Output mechanisms (Prometheus, logs) | Partial |
+| Grafana dashboard example | Missing |
+| Screenshots/screencast | Missing |
+| `--wait-for-completion-timeout` and run flags | Not documented |
+
+### Suggested Improvements
+
+| Priority | Issue | Rationale |
+|----------|-------|-----------|
+| **High** | Add installation (build from source) | #245 |
+| ~~**Medium**~~ | ~~Document Prometheus env vars and programmatic options~~ Done | #4 |
+| **Medium** | Document run flags (e.g. `--wait-for-completion-timeout`) | #297 |
+| **Low** | Add Grafana dashboard example | #149 |
+| **Low** | Screenshots/screencast | #16 |
+
+---
+
+## 7. CLI
+
+### Current State
+
+- **Cobra:** Root cmd with `run`, `scenarios`, `completions`; run has trigger subcommands.
+- **Flags:** Grouped (Output, Duration & limits, Concurrency, Failure handling, Shutdown, Trigger options); kebab-case for long flags.
+- **Help:** `groupedFlagUsages` custom template; `Long` for short-flag meanings.
+
+### Suggested Improvements
+
+| Priority | Issue | Rationale |
+|----------|-------|-----------|
+| ~~**Medium**~~ | ~~`scenarios ls` has no `Short`/`Long`~~ Done | Help text could describe purpose |
+| ~~**Low**~~ | ~~`scenarios ls` prints to stdout~~ Done | Now uses `cmd.OutOrStdout()` for testability; unit tests verify ls output ordering and help text |
+| **Deferred** | `--interactive` for verbose output | #280 |
+
+---
+
+## 8. Concurrency
+
+### Current State
+
+- **Context:** `ctx` propagated from `Run` → `newSignalContext` → `execCtx` → `run.Do` → workers and triggers.
+- **Workers:** `TriggerPool` (rate-based) and `ContinuousPool` (users mode); both use `atomic.Bool` for stop to avoid mutex in hot path.
+- **Teardown:** `xcontext.Detach(ctx)` for teardown so it runs even when parent is cancelled.
+- **WaitForCompletion:** `poolManager.WaitForCompletion()` with timeout; `WaitGroup` for worker shutdown.
+
+### Suggested Improvements
+
+| Priority | Issue | Rationale |
+|----------|-------|-----------|
+| **Low** | `ContinuousPool` busy-loop on `!stopWorkers.Load()` | Acceptable; could add `time.Sleep` for very short iterations to reduce CPU |
+| **Deferred** | Worker pool size tuning | No evidence of issues; monitor |
+| **Deferred** | Context cancellation propagation in file trigger stages | `stageCtx` used; `ctx.Done()` checked between stages |
+
+---
+
+## 9. Code Quality
+
+### Duplication
+
+| Location | Pattern | Suggestion |
+|----------|---------|------------|
+| Trigger constructors | `flags.GetString(flagX)` + `fmt.Errorf("getting flag: %w", err)` repeated | Extract helper `getFlagString(flags, name) (string, error)` |
+| Trigger constructors | Similar flag parsing logic | Shared `parseFlags` helper for common flags |
+| `run_stage_test.go` | `build_trigger` switch | Acceptable; trigger-specific setup |
+
+### Complexity
+
+- **run_cmd.go `runCmdExecute`:** Long; ~120 lines. Could extract flag parsing into `parseRunFlags` and `buildRunOptions`.
+- **Result.Failed():** Multiple conditions; consider extracting to helper for readability.
+
+### Naming
+
+- **Minor:** `outputer` vs `output` — `outputer` used in `test_runner.go`; `output` elsewhere. Prefer `output` consistently.
+- **Minor:** `stages_worker.go` `setEnvs`/`unsetEnvs` typo: "unable set" → "unable to set"
+
+### Suggested Improvements
+
+| Priority | Issue | Rationale |
+|----------|-------|-----------|
+| **Medium** | Extract flag-getting helper | Removes 30+ repeated `fmt.Errorf("getting flag: %w", err)` call sites |
+| **Low** | Fix "unable set" / "unable unset" typos | User-facing messages |
+| **Low** | Extract `runCmdExecute` flag parsing | Improve readability |
+| **Deferred** | Refactor `Result.Failed()` | Low priority |
+
+---
+
+## 10. Missing Features / Gaps vs Plan
+
+### V3_PLAN Checklist
+
+| Item | Status |
+|------|--------|
+| Module path v3 | Done |
+| Remove deprecated APIs | Done |
+| Rename testing → f1testing | Done |
+| Context in scenario/run | Done |
+| Builder-style API | Done |
+| T Error/Fatal/Log compatible | Done |
+| Run(ctx, args) error | Done |
+| Options structs | Done |
+| Unified configuration API | Done (Settings types, WithSettings, typed WithLogLevel/WithLogFormat) |
+| Fluent chaining | Partial (AddScenario returns *F1) |
+| Document Scenario.RunFn | Not done |
+| Document env vars | Not done |
+| Custom flags for scenarios | Deferred (#246) |
+| Programmatic runner | Descoped |
+| Documentation | In progress |
+| Install instructions | Not done |
+| Grafana dashboard | Not done |
+| `--interactive` | Not done (#280) |
+| Staged chart transaction count | Not done (#38) |
+
+### Suggested Improvements
+
+| Priority | Issue | Rationale |
+|----------|-------|-----------|
+| **High** | Complete documentation items | Blocking for v3 release |
+| **Medium** | Document Scenario.RunFn population | V3_PLAN §8 |
+| **Medium** | Document env vars for config | #246 |
+| **Low** | Staged chart transaction count | #38 |
+| **Deferred** | Custom scenario flags | #246 |
+| **Deferred** | `--interactive` | #280 |
+
+---
+
+## Summary: Prioritized Action Items
+
+### Do in v3 (before release)
+
+1. **Fix bug:** `f1.go:173` — use `errs` instead of `err` to preserve joined errors.
+2. **Documentation:** Installation and run flags (`--wait-for-completion-timeout`); Prometheus env vars are already documented (see section 6).
+3. **Typos:** "unable set" → "unable to set", "unable unset" → "unable to unset" in `stages_worker.go`.
+
+### Consider for v3
+
+4. ~~**Tests:** Add unit tests for `scenarios`.~~ Done
+5. **Tests:** Add unit tests for `trigger/rate`, `trigger/constant`.
+6. **Documentation:** Document Scenario.RunFn population, output mechanisms.
+7. **Refactor:** Extract flag-getting helper to reduce duplication.
+8. ~~**CLI:** Add `Short`/`Long` for `scenarios ls`.~~ Done
+
+### Defer to v3.1 or later
+
+9. **Custom scenario flags** (#246).
+10. **`--interactive` verbose output** (#280).
+11. **Staged chart transaction count** (#38).
+12. **Grafana dashboard example** (#149).
+13. **T Helper/Skip/TempDir** (V3_PLAN §5a).
+14. **Programmatic runner** (descoped).
+
+---
+
+## Appendix: File Structure
+
+```
+pkg/f1/
+  f1.go, settings.go, options.go, root_cmd.go, doc.go, profiling.go, completions.go
+  f1testing/
+    api.go, t.go, doc.go
+  scenarios/
+    scenario_builder.go, scenarios_cmd.go
+
+internal/
+  run/           # run orchestration
+  workers/       # pools, active scenario
+  trigger/       # constant, staged, ramp, gaussian, users, file
+  options/       # RunOptions
+  log/           # slog config, attrs
+  ui/            # output, printer
+  metrics/       # Prometheus
+  envsettings/   # env-based config
+  raterun/       # progress runner
+  progress/      # stats
+  xcontext/      # Detach
+  xtime/         # NanoTime
+  triggerflags/  # flag constants
+
+benchcmd/        # benchmark main
+```