Skip to content

feat!: Container infrastructure refactor (ADR-T-009)#857

Open
peer-cat wants to merge 21 commits intotorrust:developfrom
peer-cat:feat/rework_container_system
Open

feat!: Container infrastructure refactor (ADR-T-009)#857
peer-cat wants to merge 21 commits intotorrust:developfrom
peer-cat:feat/rework_container_system

Conversation

@peer-cat
Copy link
Copy Markdown
Contributor

Summary

This PR lands the full ADR-T-009 container-infrastructure rework on top of develop, plus a small auth correctness fix. It touches a hundred-ish files (~6.3k lines added, ~1.1k removed) across nine logical phases, all already broken into reviewable commits.

Why

The pre-existing container story bundled developer ergonomics, production posture, helper scripts, and configuration parsing into a single dev-only Compose stack and a monolithic bin toolbox. That made the production surface hard to reason about, hard to audit, and hard to ship. ADR-T-009 (now Implemented) records the design; this PR is the implementation.

Highlights

  • Workspace split. Carved seven focused crates out of the root binary:

    • torrust-index-config — pure parsing surface (the old src/config/v2/*).
    • torrust-index-config-probe — JSON-on-stdout probe consumed by the entry script.
    • torrust-index-health-check, torrust-index-auth-keypair — the former src/bin/* helpers, now real packages with their own tests.
    • torrust-index-entry-script, torrust-index-cli-common — shared helpers + integration tests for the shell entrypoint.
    • Helper crates follow a 0500 root:root, JSON-on-stdout, dep-closure-disciplined posture (ADR-T-009 §D5).
  • Compose split (§D1). compose.yaml is now production-shaped (no mailcatcher, no tty, bare ${VAR} for credentials); compose.override.yaml carries the dev-sandbox extras and is auto-loaded by make up-dev. make up-prod validates required env vars and runs the baseline only.

  • Lean release runtime + curated busybox (§D4). Release image gets only the applets it needs; /busybox/ lives on the debug image.

  • Entry script rewrite (§D3). entry_script_sh now drives off the JSON config probe, sources entry_script_lib_sh, and ships a canonical ENTRY_ENV_VARS manifest as the single source of truth for every env var it consults.

  • Schema-mandatory credentials (§D2). tracker.token and database.connect_url are required at the schema level; shipped TOMLs no longer carry placeholder secrets, and the Default impls for Settings / Tracker / Database are gone.

  • Vendored su-exec audit trail (§D8). AUDIT.md records provenance, rationale vs gosu/setpriv/su, and an append-only SHA-256-anchored audit log. CI enforces the SHA.

  • CI guards (§D9). New lints job in container.yaml: (1) compose.yaml stays free of SMTP wiring, (2) vendored su-exec.c matches the audit log, (3) every entry-script env var is documented in containers.md.

  • Auth fix. 90d6986 returns an error for missing users and tidies token responses (independent of the container work; folded in here because it sat next to the auth-keypair extraction).

Breaking changes (operator-facing)

These are called out explicitly in CHANGELOG.md:

  1. tracker.token and database.connect_url are mandatory — no more placeholder defaults.
  2. AUTH__*_PEM and AUTH__*_PATH are now mutually exclusive.
  3. TORRUST_INDEX_DATABASE_DRIVER is demoted to a first-boot TOML selector.
  4. USER_ID guard tightened from >= 1000 to "numeric, non-zero" (§D7).
  5. API_PORT / IMPORTER_API_PORT moved from build-time ARG to runtime ENV (§D6).

Docs

  • 009-container-infrastructure-refactor.md: standalone, self-contained ADR (the per-phase implementation plan was folded in and deleted in 37d86a9). All in-tree cross-references now point at the canonical §D1§D9 decision anchors.
  • containers.md: updated for the test-stage build gate, IMPORTER_API_PORT / TZ, and the env-var manifest contract.
  • README.md and lib.rs: docker run / podman run examples updated for the now-mandatory overrides; Compose-sandbox section added.

Verification

  • cargo check, cargo clippy, cargo test --workspace --all-targets --all-features — clean.
  • --no-default-features build — clean.
  • Docs + doc-tests build — clean.
  • End-to-end bring-up via make up-dev and make up-prod against podman-compose 1.5.0 / podman 5.8.2. The five latent bring-up bugs surfaced during this exercise (busybox addgroup, jq shared libs, etc.) were folded into Phase 8 rather than deferred — see 76f0fcd for the gory details.

Review suggestion

The commits are ordered to be reviewable in sequence; the per-phase commit messages carry the design rationale that doesn't fit in code comments. If you'd rather skim end-state, start with the consolidated ADR (009-container-infrastructure-refactor.md) and the CHANGELOG.md entry.

- Use destructuring in login/renewal responses to avoid cloning
- Return AuthError::UserNotFound instead of silently falling back to
  Guest when the user lookup fails in the permission extractor
- Validate login role against Role::from_str instead of hardcoded
  string comparison in test assertions
Security
--------
- Bind dev-only ports (MySQL, tracker, mailcatcher) to 127.0.0.1.
- Gate entry-script `set -x` behind `DEBUG=1` to stop credential leakage.
- Annotate compose credentials as DEV-ONLY with a TODO for secrets migration.

Correctness
-----------
- Fix MySQL healthcheck: use `$$MYSQL_ROOT_PASSWORD` instead of the
  non-existent `/run/secrets/db-password` Docker secret.
- Fix entry-script `USER_ID` guard: `&&` → `||` (was always-false when unset).
- Expose importer API port (3002) in Containerfile and map it in compose.
- Remove trailing whitespace on release HEALTHCHECK line.

Maintainability
---------------
- Upgrade base images from bookworm to trixie (cc-debian12 → cc-debian13).
- Pin `cargo-binstall` bootstrap to tag v1.18.1 (was `main` branch).
- Pin MySQL image to 8.0.45; replace deprecated
  `--default-authentication-plugin` with `--authentication-policy`.
- Drop redundant `--tests --benches --examples` from cargo-chef/nextest
  invocations (already covered by `--all-targets`).
- Rename `.dockerignore` → `.containerignore` for Podman compatibility.

Operational
-----------
- Add `restart: unless-stopped` to index and tracker compose services.
- Document debug-image healthcheck omission, busybox subset, and
  entry-script `DEBUG=1` flag in `docs/containers.md`.

Cleanup
-------
- Delete stale `contrib/dev-tools/container/build.sh` and `run.sh`.
- Remove stale `TORRUST_TRACKER_USER_UID` export from E2E scripts.
- Fix `USER_UID` → `USER_ID` typo in docs.
@codecov
Copy link
Copy Markdown

codecov Bot commented Apr 22, 2026

Codecov Report

❌ Patch coverage is 68.73614% with 141 lines in your changes missing coverage. Please review.
✅ Project coverage is 54.79%. Comparing base (80cbeec) to head (c54d2de).
⚠️ Report is 7 commits behind head on develop.

Files with missing lines Patch % Lines
...config-probe/src/bin/torrust-index-config-probe.rs 0.00% 32 Missing ⚠️
packages/index-cli-common/src/lib.rs 0.00% 21 Missing ⚠️
...auth-keypair/src/bin/torrust-index-auth-keypair.rs 0.00% 19 Missing ⚠️
packages/index-health-check/src/lib.rs 82.07% 11 Missing and 8 partials ⚠️
...health-check/src/bin/torrust-index-health-check.rs 0.00% 16 Missing ⚠️
packages/index-entry-script/src/lib.rs 63.41% 15 Missing ⚠️
packages/index-config-probe/src/lib.rs 91.07% 5 Missing ⚠️
packages/index-config/src/lib.rs 95.87% 2 Missing and 2 partials ⚠️
packages/index-auth-keypair/src/lib.rs 80.00% 0 Missing and 3 partials ⚠️
src/web/api/server/mod.rs 50.00% 2 Missing and 1 partial ⚠️
... and 2 more
Additional details and impacted files
@@             Coverage Diff             @@
##           develop     #857      +/-   ##
===========================================
+ Coverage    53.49%   54.79%   +1.29%     
===========================================
  Files          116      124       +8     
  Lines         6240     6439     +199     
  Branches      6240     6439     +199     
===========================================
+ Hits          3338     3528     +190     
- Misses        2816     2825       +9     
  Partials        86       86              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

peer-cat added 11 commits April 22, 2026 15:40
Supersede the tactical "Container Infrastructure Hardening"
ADR (Phases 1 & 2 already landed in a2ea512) with a
structural refactor ADR and a companion implementation plan.

The new ADR records nine decisions (D1–D9) derived from
nine principles (P1–P9), each traceable to one or more
diagnostic items (R1–R10) catalogued in the appendix:

  D1  Split compose into baseline + override
  D2  Strip credentials from defaults; mandatory connect_url
  D3  Single source of truth for auth-key paths
  D4  Two parallel runtime bases; root-only utilities
  D5  Helper binaries as separate workspace crates
  D6  Drop build-time ARG for runtime concerns
  D7  Refuse-if-root entry-script guard
  D8  Vendored su-exec gains an internal audit record
  D9  Build hygiene and test-stage coupling

The implementation plan details nine phases with dependency
ordering, Containerfile snippets, shell fragments, and
executable acceptance scripts for every criterion.

AGENTS.md gains a POSIX-paths rule and a note that the
helper crates (index-health-check, index-auth-keypair,
index-config, index-config-probe) share the T- prefix.
ADR-T-009 Phase 1.

Containerfile:
- Remove `ARG API_PORT` / `ARG IMPORTER_API_PORT`. The ports
  are pure runtime concerns; exposing them as build args let
  callers bake non-default ports into EXPOSE metadata while
  the healthcheck and listener still resolved from ENV. Keep
  `ENV API_PORT=3001` / `ENV IMPORTER_API_PORT=3002` with
  literal values so the runtime contract is unchanged.

.containerignore:
- Exclude `/adr/` and `/docs/` from the build context. These
  directories are documentation-only and are never referenced
  by any build stage.
…-009 §2)

ADR-T-009 Phase 2.

New crates
----------
- `torrust-index-cli-common` (packages/index-cli-common/):
  shared P9 scaffolding — `refuse_if_stdout_is_tty` (exit 2),
  `init_json_tracing`, `emit<T>`, and `BaseArgs` (--debug).

- `torrust-index-health-check` (packages/index-health-check/):
  replaces `src/bin/health_check.rs`. Rewritten with
  `std::net::TcpStream` — no reqwest, no tokio, no TLS stack.
  JSON stdout on success per P9. Tests cover 200 OK, non-2xx,
  connection refused, read timeout, and malformed status line.

- `torrust-index-auth-keypair` (packages/index-auth-keypair/):
  replaces `src/bin/generate_auth_keypair.rs`. Output changes
  from raw PEM blocks to a JSON object with `private_key_pem`
  and `public_key_pem` fields. TTY refusal unified to exit 2
  (was exit 1). Tests cover JSON round-trip, PEM parsing, and
  key uniqueness across calls.

Entry script
------------
- Keygen invocation renamed: `torrust-generate-auth-keypair`
  → `torrust-index-auth-keypair`.
- PEM extraction migrated from `sed` to `jq -r`.
- Public key mode tightened from 0440 to 0400 (owner read-only,
  matching the private key).

Containerfile
-------------
- Binary names updated in `cp -l` lines for both test stages.
- `jq_donor` stage added (pristine rust:slim-trixie base) so
  the runtime image gains `/usr/bin/jq` for entry-script JSON
  consumption. Root-only (0500).
- HEALTHCHECK updated to `torrust-index-health-check`.

Removed
-------
- `src/bin/health_check.rs` (replaced by index-health-check).
- `src/bin/generate_auth_keypair.rs` (replaced by index-auth-keypair).
- `[[bin]] torrust-generate-auth-keypair` from root Cargo.toml.
ADR-T-009 Phase 3.

New crate
---------
- `torrust-index-config` (packages/index-config/): leaf crate
  carrying the entire schema + loader. Public surface includes
  `Settings`, the `v2::*` modules, `validator`, `Info`, `Error`,
  `load_settings`, the `CONFIG_OVERRIDE_*` / `ENV_VAR_CONFIG_TOML*`
  constants, and the permission value types (`Role`, `Action`,
  `Effect`, `PermissionOverride`, `RoleParseError`).

  Non-stdlib deps are limited to `serde`, `serde_json`, `serde_with`,
  `figment`, `toml`, `url`, `camino`, `derive_more`, `thiserror`,
  `tracing`, and `lettre` (default-features = false; only `builder`
  + `serde`). No `tokio`, `reqwest`, `sqlx`, `hyper`, `rustls`,
  `native-tls`, or `openssl` in its dep closure — verified by the
  `cargo tree` acceptance check from the implementation plan.

  Tests cover the loader, metadata defaults, permission overrides,
  TOML/JSON round-trips, redaction, schema quirks, and every
  shipped `share/default/config/index.*.toml` sample.

Root crate
----------
- `src/config/mod.rs` is now a thin shim: `pub use
  torrust_index_config::*;` plus the runtime `Configuration`
  wrapper holding `RwLock<Settings>`. Existing
  `use crate::config::*;` call sites continue to compile.
- `src/services/authorization.rs` re-exports `Role`, `Action`,
  `Effect`, `PermissionOverride`, `RoleParseError` from
  `torrust_index_config::permissions` for backwards compatibility.
  The `Permissions` trait and runtime `PermissionMatrix` stay
  in the root crate.

BREAKING — TLS spelling fix
---------------------------
The `[net.tsl]` typo is corrected to `[net.tls]` in operator
TOMLs, and the settings JSON API response field changes from
`"tsl"` to `"tls"`. Done as a clean break (no compatibility
alias) alongside the crate extraction. The shipped
`index.development.sqlite3.toml` sample and the `src/app.rs` /
`src/web/api/server/` call sites are updated accordingly.

Docs
----
- ADR-T-009 implementation plan gains a Phase Status table
  (Phases 1–3 marked Done) and §3 is updated to reflect the
  actual dep set (`serde_json` + `lettre` are in; rationale
  explained).
- ADR-T-009 §C dependency table updated for the probe crate
  (Phase 6) to match the now-confirmed `index-config` closure.
- CHANGELOG entries added for the new crate, the config move,
  the permission re-export, and the TLS rename.
- AGENTS.md "Replacing a File" recipe rewritten to use
  `mv tmp file` (atomic) instead of `rm` + recreate.
…9 Phase 4)

ADR-T-009 Phase 4.

Containerfile
-------------
- Replace the single `runtime` stage with two distinct bases:
  `runtime_release` (lean `gcr.io/distroless/cc-debian13`) and
  `runtime_debug` (the `:debug` variant). Both consume a shared
  `runtime_assets` scratch layer that carries a seeded `/etc`
  (`passwd`, `group`, empty `profile`), `su-exec`, and the entry
  script — keeping the asset surface base-agnostic.
- Release base ships a single `/bin/busybox` binary at
  `0700 root:root` plus a curated symlink set (`sh`, `adduser`,
  `install`, `mkdir`, `dirname`, `chown`, `chmod`, `tr`,
  `mktemp`, `cat`, `printf`, `rm`, `echo`, `grep`). The
  unprivileged `torrust` user gets `EACCES` on the entire
  busybox tree after privilege drop. No `/busybox/` directory
  exists in the release image, so absolute-path invocations
  cannot bypass the curated subset.
- Debug base keeps the upstream `/busybox/` tree intact and
  puts it on `PATH`, preserving full applet access for
  interactive debugging as the application user.
- Add three preflight stages aggregated through a
  `preflight_gate` so BuildKit cannot prune them: `busybox_donor`
  (asserts the donor still ships a real busybox ELF and that
  `install` supports `-D`), `etc_seed` (generates the minimal
  `/etc` layout), and `adduser_preflight` (asserts `adduser -D`
  works against passwd+group without `/etc/shadow`). UID `59999`
  is used in the adduser check because busybox `adduser`
  rejects UIDs ≥ 60000.
- Phase 4 per-binary modes: `torrust-index` stays
  world-executable (`0755`); `torrust-index-auth-keypair` and
  `torrust-index-health-check` tighten to `0500 root:root`
  (same posture as busybox, su-exec, jq). The healthcheck
  binary runs from `HEALTHCHECK` as root, so 0500 suffices.
- Pin `PATH` on both runtime bases so a future base-image
  reshuffle cannot silently break the entry script's bare-name
  lookups.
- Quote the URL arguments in the `HEALTHCHECK` command so
  variable expansion is robust.
- Inline the per-image `ENV`/`EXPOSE`/`VOLUME` declarations on
  the final `debug` and `release` stages (previously inherited
  from the shared `runtime` stage that no longer exists).

Entry script (`share/container/entry_script_sh`)
-----------------------------------------------
- D7: replace the `USER_ID >= 1000` rule with "numeric and not
  zero". The previous rule rejected legitimate configurations
  (rootless Podman with subuid remapping, low-UID CI runners,
  BSD-derived hosts) without stating intent. The actual
  property we want is "do not run as root".
- Switch to busybox `adduser` short-option form
  (`-D -s /bin/sh -u`) so the same invocation works on both
  runtime bases. The lean distroless `cc-debian13` ships
  `/etc/passwd` and `/etc/group` but not `/etc/shadow`; `-D`
  honours that. Busybox `adduser -s` does not consult
  `/etc/shells`, which the lean base also lacks.
- Also `mkdir -p /var/log/torrust/index/` so the log directory
  exists before the `chown -R` / `chmod -R` pass.

Docs
----
- `docs/containers.md`: rewrite the "Available Shell Commands"
  section to describe the two distinct shell footprints
  (curated release subset vs. full debug applet tree) and
  document the `docker exec -u root … sh` break-glass
  procedure for the release image. Document the new `USER_ID`
  contract (non-negative, non-zero) on both the prose
  description and the env-var reference table.
- `adr/009-implementation-plan.md`: mark Phase 4 Done; align
  the adduser preflight UID with the Containerfile (59999)
  and explain the busybox 0..60000 constraint.
…chema level (ADR-T-009 §D2)

Shipped defaults must not carry credentials or environment-coupled
values. The schema now treats `tracker.token` and `database.connect_url`
as mandatory: there is no `impl Default` for `Settings`, `Tracker`, or
`Database`, and a missing field fails deserialisation with a precise
serde `missing field` error.

- Drop `#[serde(default)]` and the `Default` impls from `Settings`,
  `Tracker`, and `Database`; document the new contract on each type.
- Strip `token`, `connect_url`, `[mail.smtp]`, and auth key paths from
  every `share/default/config/index.*.toml` (and the two tracker
  samples). Operators supply them at runtime via env-var overrides or
  a side-loaded TOML.
- Add `torrust_index_config::test_helpers::{PLACEHOLDER_TOML,
  placeholder_settings}` as the single shared fixture for tests that
  used to lean on `Settings::default()`. All crate, integration, and
  root-level tests now pivot through it.
- Rework `tests/shipped_samples.rs`: the shipped TOMLs are now asserted
  to be syntactically valid, credential-free, and to be *rejected* by
  the loader until the operator supplies the mandatory secrets.
- Remove `impl Default for Configuration` in the root crate; replace
  with a test-only `Configuration::for_tests()` that loads the shared
  placeholder TOML. Update `tests/jwt.rs`, `tests/services/settings.rs`,
  `tests/web/require_permission.rs`, and `tests/config/mod.rs` to use it.
- Update the bootstrap tests (`src/tests/bootstrap/config.rs`,
  `tests/e2e/config.rs`) to mirror the operator workflow by injecting
  `TORRUST_INDEX_CONFIG_OVERRIDE_TRACKER__TOKEN` and
  `..._DATABASE__CONNECT_URL` through `figment::Jail`.
- Refresh `adr/009-implementation-plan.md` and `docs/containers.md`
  to describe the new defaults-vs-secrets split.
Phase 6 of ADR-T-009. Bridges the entry script (POSIX shell, can't
parse TOML or env-var overrides) and the application's actual
configuration loader by emitting the container-relevant subset of
the resolved `Settings` as a single JSON object on stdout. The
shell pipes it into `jq`; the future Rust entry binary will
deserialise the same JSON via `serde_json`.

New crate
---------
- `torrust-index-config-probe` (packages/index-config-probe/):
  binary + small `pub` library. Depends only on
  `torrust-index-cli-common`, `torrust-index-config`, `clap`,
  `serde`/`serde_json`, `url`, `percent-encoding`, and `tracing`.

  Wire format (`schema = 1`):
  - `database.driver`: `Driver` enum (`sqlite` | `mysql`),
    enforcing the dispatch table at the type level rather than
    via stringly-typed matches. Mirrors the schemes recognised
    by the application's own `databases::database::get_driver`;
    `mariadb` is no longer claimed as supported.
  - `database.path`: sqlite file path (absolute, relative, or
    `:memory:`) percent-decoded into UTF-8; `null` for non-sqlite.
  - `auth.{private,public}_key.{pem_set,path_set,source,path}`:
    raw presence (collapsing `Some("")` and `None` into `false`
    at the container boundary) plus the PEM-overrides-PATH winner.

  Exit codes (per implementation plan §6.1):
  - `0` success; `1` unhandled panic (mapped from Rust's default
    101 by a `std::panic::set_hook` installed in `main`) or
    stdout write failure; `2` stdout-is-a-TTY refusal or clap
    argv-parse failure; `3` config-load failure; `4` empty
    `tracker.token`; `5` unrecognised database scheme.

Supporting changes
------------------
- `torrust-index-config`: add `Info::from_env`, the JSON-safe
  sibling of `Info::new` that reads the same env vars but skips
  the diagnostic `println!`s that would corrupt the probe's
  stdout-only contract. Refactor `Info::new` to delegate to it.
- `torrust-index-config`: add `pub const DEFAULT_CONFIG_TOML_PATH`
  so the application bootstrap and helper binaries share one
  source of truth for the default config-TOML location.
  `src/bootstrap/config.rs` now re-exports it under the existing
  `DEFAULT_PATH_CONFIG` name.
- `torrust-index-config::v2::tracker::ApiToken`: add
  `is_empty()` so the probe's empty-token gate does not need to
  reach into the type's byte representation.

Tests
-----
- Unit tests cover every row of the sqlite-spelling table, the
  auth-key matrix (PEM-only, PATH-only, none, both), missing
  `connect_url`, empty `tracker.token`, unsupported `mariadb` and
  `postgres` schemes, JSON round-tripping, and the lowercase wire
  format of both enums (`AuthKeySource`, `Driver`).
- `tests/binary.rs` exercises the compiled binary end-to-end:
  happy-path JSON shape, exit 3 on missing `connect_url`, exit 4
  on empty token, exit 5 on `postgres`, and pins clap's exit-2
  behaviour on an unknown flag.

Docs
----
- `adr/009-implementation-plan.md`: mark Phase 6 Done; expand §6.1
  to describe the `Driver` enum, the empty-string-equals-absent
  collapsing rule, the panic-hook → exit-1 mapping, the UTF-8
  assumption on sqlite paths, and the `Info::from_env` /
  `DEFAULT_CONFIG_TOML_PATH` single-source-of-truth wiring.
  Update the dispatch-table row for `mariadb://` to `exit 5`.
- `adr/009-container-infrastructure-refactor.md`: minor §C dep-row
  fix for the probe crate.
…hase 7)

Phase 7 of ADR-T-009. Rewrites the container entry script
to consume the Phase-6 config probe as the single source of
truth for runtime configuration, extracts its pure helpers
into a sourced POSIX shell library, and adds a host-side
test crate that drives those helpers through `sh` so CI
catches regressions without spinning up a container.

Entry script
------------
- `share/container/entry_script_sh`: rewritten under
  `set -eu` (POSIX errexit + nounset), turning a class of
  silent-misconfiguration bugs into loud, actionable startup
  failures. Replaces the ad-hoc `cmp_lc` / `to_lc` shell
  plumbing with `case` dispatches and pulls the helper
  functions in via `. /usr/local/lib/torrust/entry_script_lib_sh`.
- Boot order now matches ADR-T-009 §7: validate `USER_ID`,
  install the default TOML selected by
  `TORRUST_INDEX_DATABASE_DRIVER` (now a *first-boot
  selector only* — runtime database decisions come from the
  probe), invoke `/usr/bin/torrust-index-config-probe`, gate
  on the schema version, then dispatch on the probe's
  `database.driver` / auth-key fields.
- Auth-key handling: enforces ADR-T-009 §7.1's three
  invariants (PEM/PATH mutual exclusion within a key, pair
  completeness across both keys, cross-pair source
  consistency) against the probe's `pem_set` / `path_set` /
  `source` fields. When neither key is configured anywhere
  (probe `source=none`), the script materialises the
  container defaults under `/etc/torrust/index/auth/` *and*
  exports the matching `..._AUTH__*_PATH` overrides so the
  application sees the same paths — making the entry script
  the single source of truth for those defaults (no
  duplicated constant to drift apart).
- SQLite seeding: dispatches off the probe's
  `database.driver`. The seed template is now the canonical
  `index.sqlite3.db` filename, and seeding only fires when
  the resolved path lives under one of the managed volumes
  (`/etc/torrust/index/`, `/var/lib/torrust/index/`,
  `/var/log/torrust/index/`); paths outside those roots
  must be pre-created by the operator. MySQL/MariaDB is a
  no-op since the application connects directly.

Sourced shell library
---------------------
- `share/container/entry_script_lib_sh` (new, mode
  `0444 root:root`): houses the pure helpers `inst`,
  `key_configured`, `validate_auth_keys`, and `seed_sqlite`.
  No top-level side effects — only function definitions —
  so it is safe to source both inside the container and
  inside the host-side test harness. World-readable,
  root-owned, not executable on its own (sourced, not
  exec'd).
- Both runtime stages in `Containerfile` (`runtime_assets`
  feeding the release base, and `runtime_debug` directly)
  copy the library to `/usr/local/lib/torrust/`.

Container image
---------------
- `Containerfile`: copies the new
  `torrust-index-config-probe` binary into both debug and
  release images at `/app/bin/`, with the same root-only
  posture (`0500 root:root`) as `auth-keypair` / `jq` /
  `su-exec`. The probe runs only in the entry script's
  pre-`su-exec` phase; the unprivileged `torrust` user has
  no access to it after privilege drop.

Test crate
----------
- `packages/index-entry-script/` (new): `[lib]` workspace
  member that ships **no runtime code** — its `tests/`
  invoke `sh` as a subprocess against
  `entry_script_lib_sh` and assert exit codes / stderr
  contents. Runs as part of `cargo test --workspace`.
  - `tests/validate_auth_keys.rs`: every branch of §7.1's
    three invariants (mutual exclusion within a key, pair
    completeness, cross-pair source consistency).
  - `tests/seed_sqlite.rs`: every §7.2 outcome that does
    not require root or writing to a managed volume
    (`:memory:` skip, relative-path warn,
    absolute-non-empty untouched, outside-volumes error,
    empty-path probe-bug error). The "missing-under-volume
    seeded" outcome and the end-to-end §7.1 case-3 export
    against the real generator both require root and the
    container's `torrust` user; they belong in the
    container e2e suite (Phase 8 / 9).
  - The `lib_path()` / `run_sh()` / `run_sh_with_args()`
    helpers are `#[doc(hidden)] pub` so the integration
    tests can reach them while keeping the public surface
    empty.

Docs
----
- `adr/009-implementation-plan.md`: mark Phase 7 Done; add
  §7.5 documenting the sourced-library extraction and the
  host-side test crate as a strict improvement on the
  inline-function form sketched in §7.1 / §7.2 (the §7.1
  "Verification" note explicitly asked for an integration
  test covering the auth-key invariants). Update the
  sqlite seed template path to the canonical
  `index.sqlite3.db` filename.
- `docs/containers.md`: new "Entry Script Contract"
  section walking through the seven-step boot sequence,
  the runtime `jq` dependency, the sourced shell library,
  and the required-overrides-between-phases note pointing
  at ADR-T-009 §D2's mandatory `tracker.token` /
  `database.connect_url`. Updates the "Basic Run" examples
  to supply both required env vars (the previous
  no-argument invocation no longer starts under the §D2
  schema). Updates the auto-populated SQLite path and the
  `TORRUST_INDEX_DATABASE_DRIVER` description to reflect
  its narrowed first-boot-selector scope.
Phase 8 of ADR-T-009. Splits the single development-only
`compose.yaml` into a production-shaped baseline plus a
dev-sandbox override, wires both behind a top-level
`Makefile`, and folds in five latent bugs surfaced by the
end-to-end `make up-dev` / `make up-prod` bring-up against
`podman-compose 1.5.0` / `podman 5.8.2`.

Compose split
-------------
- `compose.yaml`: production-shaped baseline. Drops the
  `mailcatcher` sidecar and the `tty: true` allocations.
  Credentials and the environment-coupled mail SMTP
  server are referenced as bare `${VAR}` (no default, no
  `:?required`); validation is deferred to the wrapper
  and the in-container config probe (defence in depth,
  per ADR-T-009 §8.1). Operator selectors with a sensible
  cross-environment default (`TORRUST_INDEX_DATABASE`,
  `TORRUST_INDEX_DATABASE_DRIVER`, …) keep their
  `${VAR:-default}` form. Switches `depends_on` on `index`
  and `tracker` to long-form so an override can additively
  re-attach `mailcatcher` rather than silently replace
  the dependency list.
- `compose.override.yaml` (new): dev sandbox extras —
  `mailcatcher` sidecar (re-attached to `index` via
  long-form `depends_on`), `tty: true` on `index` and
  `tracker`, and permissive `${VAR:-default}` credential
  defaults so a plain `docker compose up` works with no
  operator intervention. Auto-loaded by Compose v2 and by
  `make up-dev`; excluded by `make up-prod` via explicit
  `--file compose.yaml`.
- `Makefile` (new): two targets, `up-dev` (plain
  `docker compose up`) and `up-prod` (validates required
  env vars via POSIX `sh -uc ': "${VAR:?required}"'`,
  then `docker compose --file compose.yaml up -d --wait`).
  `COMPOSE_FILE` is overridable for acceptance tests.
  The MySQL credential check is gated behind a
  `grep mysql:` of the compose file so a baseline without
  a local mysql sidecar does not require `MYSQL_ROOT_PASSWORD`.

Bring-up fixes (folded in rather than deferred)
-----------------------------------------------
1. **`addgroup` missing from the curated busybox surface.**
   Distroless `cc-debian13` ships `/etc/passwd` and
   `/etc/group`, but busybox `adduser -D -u UID NAME` only
   writes `/etc/passwd`; `getgrnam("torrust")` then fails
   and the entry script's `install -g torrust …` step dies
   with `unknown group torrust`. Add `addgroup` to the
   curated symlink loop in the release runtime base (and
   to the preflight gate's symlink loop, kept in sync per
   ADR-T-009 §4.4). Entry script now does
   `addgroup -g "$USER_ID" torrust` followed by
   `adduser -D -s /bin/sh -u "$USER_ID" -G torrust torrust`,
   both guarded with idempotent `grep` checks of
   `/etc/{group,passwd}` so a container restart is a no-op
   under `set -e`.
2. **`jq` shipped without its shared libs.** The
   `jq_donor` stage installed jq via `apt-get` but the
   runtime stages copied only `/usr/bin/jq`; the binary
   then aborted with `libjq.so.1: cannot open shared
   object file` inside the lean `cc-debian13` runtime.
   Copy `libjq.so.1` and `libonig.so.5` from the donor
   alongside the binary into both runtime stages. Add an
   `ldd`-based allow-list assertion to the `jq_donor`
   stage (`libc.so.6 libjq.so.1 libm.so.6 libonig.so.5
   ld-linux-x86-64.so.2 linux-vdso.so.1`) so a future
   donor-base upgrade that drags in a new transitive dep
   fails the build instead of silently producing a broken
   image. The parser strips path prefixes from `ldd`'s
   first column so absolute paths and bare sonames are
   treated identically.
3. **Empty-string env vars treated as inline config TOML.**
   `Info::from_env` called `env::var(…).ok()` directly,
   which returns `Some("")` for an exported-but-unset
   variable. The previous compose baseline's
   `TORRUST_INDEX_CONFIG_TOML=${TORRUST_INDEX_CONFIG_TOML}`
   propagated an empty string into the container, which
   the loader treated as a zero-byte TOML and rejected
   with `Missing mandatory option: logging.threshold`.
   `Info::from_env` now filters empty strings out of both
   `TORRUST_INDEX_CONFIG_TOML` and
   `TORRUST_INDEX_CONFIG_TOML_PATH`. `compose.yaml` uses
   Compose's bare-name pass-through form
   (`- TORRUST_INDEX_CONFIG_TOML`, no `=`) for those two
   optional inline-TOML envs so an unset host var is
   omitted entirely rather than forwarded as empty.
4. **Test binaries baking compile-time paths.**
   `packages/index-config-probe/tests/binary.rs` and
   `packages/index-entry-script/tests/seed_sqlite.rs`
   resolved their subjects via `env!("CARGO_BIN_EXE_…")`
   and `CARGO_MANIFEST_DIR`; both bake the build-time
   path into the test binary and break under
   cargo-nextest's archive → `--extract-to` →
   `--target-dir-remap` flow used by the container `test`
   stage. `binary.rs` is rewritten to drive the contracts
   through the library API (same model as
   `packages/index-config/tests/shipped_samples.rs`):
   exit-3 / 4 / 5 are pinned by exercising
   `load_settings` / `probe`, and exit-2 is pinned by
   handing clap a mirror of the binary's `Args` and
   asserting `Err::exit_code() == 2`. The entry-script
   test crate embeds `entry_script_lib_sh` at compile
   time via `include_str!` and materialises it to a
   `tempfile` per `run_sh` call; `tempfile` accordingly
   moves from `[dev-dependencies]` to `[dependencies]`
   since `run_sh` / `run_sh_with_args` are
   `#[doc(hidden)] pub` library items.
5. **Unqualified Docker Hub image names.**
   `mysql:8.0.45`, `dockage/mailcatcher:0.8.2`, and
   `torrust/tracker:develop` all relied on a
   short-name-resolution prompt that podman cannot honour
   without a TTY (and that operators should not be relying
   on regardless). Fully qualify all three as
   `docker.io/...` in `compose.yaml` and
   `compose.override.yaml`. Portable across docker and
   podman; immune to per-host
   `unqualified-search-registries` configuration.
6. **`USER_ID` missing from `_validate-prod-env`.**
   `compose.yaml` references `${USER_ID}` with no default
   (it is an operator-supplied numeric selector, neither
   a credential nor environment-coupled). Without
   wrapper-side validation, an unset `USER_ID` propagated
   to the entry script's `adduser -u ""` call and failed
   deep inside container start. Add `USER_ID` to the
   `_validate-prod-env` required list.

Test fixtures
-------------
- The three `contrib/dev-tools/container/e2e/.../e2e-env-up.sh`
  scripts now export `TORRUST_INDEX_CONFIG_OVERRIDE_DATABASE__CONNECT_URL`
  alongside the existing `..._TRACKER__TOKEN` override, so
  the e2e suite satisfies the §D2 mandatory-fields contract
  the same way `make up-prod` does.

Docs
----
- `adr/009-implementation-plan.md`: mark Phase 8 Done; add
  §8.6 documenting the six bring-up fixes and §9.1.3
  scheduling the `mailcatcher`-on-`compose.yaml` CI lint
  prescribed by §8.1; align the §4.4 / §9 prose and the
  curated-applets table with the new `addgroup` symlink
  and the §8.1 `USER_ID` validation.
- `docs/containers.md`: new "Compose Split" section
  walking through the baseline / override split, the
  `make up-dev` / `make up-prod` invocation paths, the
  required env vars, and the defence-in-depth contract
  with the in-container config probe. Update the curated
  busybox applet list to include `addgroup` and refresh
  the auth-key bootstrap pointer to the new override
  file.
Closes out ADR-T-009 by landing the documentation and CI-audit
work that the earlier phases deferred. No runtime behaviour
changes; the diff is operator-facing docs plus three new
container-infra lint jobs.

Highlights:

- CHANGELOG: consolidate the Phase 1–9 story under a single
  ADR-T-009 entry covering the lean release / debug split,
  the three extracted helper crates (health-check,
  auth-keypair, config-probe), the Compose split, and the
  vendored su-exec audit. Also document the three breaking
  changes operators need to know about (mandatory
  database.connect_url and tracker.token, mutually-exclusive
  AUTH__*_PEM / AUTH__*_PATH pairs, and the demotion of
  TORRUST_INDEX_DATABASE_DRIVER to a first-boot TOML selector).
- README: replace the bare `docker run` / `podman run`
  one-liners with the now-required override pair, and add
  a Compose-sandbox section pointing at `make up-dev` /
  `make up-prod`. Cross-link ADR-T-009 from the docs index.
- docs/containers.md: document the test-stage build gate
  (no --skip-tests escape hatch, by design), the new
  IMPORTER_API_PORT and TZ env vars, and the canonical
  entry-script env-var manifest contract.
- share/container/entry_script_sh: add the
  `ENTRY_ENV_VARS` / `END_ENTRY_ENV_VARS` manifest block
  as the single source of truth for every env var the
  script consults (including dynamically constructed
  AUTH__*_{PEM,PATH} names a naive grep would miss).
- contrib/dev-tools/su-exec/AUDIT.md: new file recording
  provenance, choice rationale (vs gosu / setpriv / su),
  re-audit triggers (file-change + CVE, deliberately not
  calendar-based), and an append-only audit log anchored
  by SHA-256.
- .github/workflows/container.yaml: new `lints` job that
  the existing `test` job now depends on, with three
  guards — (1) compose.yaml stays free of mailcatcher /
  SMTP wiring (comments stripped before grepping so the
  explanatory header doesn't trip the audit),
  (2) su-exec.c SHA-256 matches the most recent AUDIT.md
  entry, and (3) every env var in the entry-script
  manifest is documented in docs/containers.md.
- ADR + implementation plan: flip status from Proposed /
  Phase 9 "Not started" to Implemented.
- AGENTS.md: extend the helper-crates list with
  index-cli-common and index-entry-script.
The implementation work for ADR-T-009 is done; the separate
phase-by-phase implementation plan that drove it has now
served its purpose. This change folds every decision the plan
recorded back into the ADR proper, deletes the plan file, and
migrates every cross-reference in the codebase to the ADR's
canonical `§D1`–`§D9` decision-section anchors. No runtime
behaviour changes — this is a docs-and-references pass on top
of the already-merged Phase 1–9 work.

Single source of truth
----------------------
- `adr/009-container-infrastructure-refactor.md`: rewritten as
  the standalone reference for the refactor. Status flips from
  "Implemented (Phases 1–9 complete)" to plain "Implemented";
  the prose drops the phase scaffolding and is reorganised
  around nine numbered decisions:
  - §D1 — Compose split (production-shaped baseline + auto-loaded
    dev override, wrapped by `make up-dev` / `make up-prod`).
  - §D2 — Strip credentials from shipped defaults; `tracker.token`
    and `database.connect_url` mandatory at the schema level.
  - §D3 — Entry script rewrite around the JSON config probe and
    the sourced POSIX shell library.
  - §D4 — Lean release runtime base + curated busybox applet set;
    `/busybox/` only on the debug image.
  - §D5 — Helper-binary dep-closure discipline (`index-cli-common`,
    JSON-on-stdout contract, root-only `0500 root:root` posture).
  - §D6 — `API_PORT` / `IMPORTER_API_PORT` demoted from build-time
    `ARG` to runtime `ENV`.
  - §D7 — `USER_ID` guard tightened to "numeric, non-zero" (was
    `>= 1000`).
  - §D8 — Vendored `su-exec.c` provenance + SHA-256-anchored
    audit log + CI guard.
  - §D9 — Build-context hygiene (`adr/`, `docs/` excluded via
    `.containerignore`).
- `adr/009-implementation-plan.md`: deleted (~3000 lines). Every
  durable decision lives in the ADR; the historical phase
  ordering, file lists, and merge-conflict notes belong to the
  git history of the now-merged work.

Cross-reference migration
-------------------------
- `Containerfile`, `share/container/entry_script_sh`,
  `contrib/dev-tools/su-exec/AUDIT.md`,
  `.github/workflows/container.yaml`,
  `packages/index-config-probe/src/{lib.rs,bin/torrust-index-config-probe.rs}`,
  `docs/containers.md`: every `§N` / "implementation plan §N.M"
  reference rewritten as `ADR-T-009 §DN`. Comments that pointed
  at "Phase 9 §9" or "implementation plan §6.1" now point at the
  ADR's permanent anchors so the references survive any future
  reorganisation of the ADR's internal layout.

CHANGELOG
---------
- Expand the previously placeholder ADR-T-009 bullet into a
  full Added / Changed / Removed audit covering the seven new
  workspace artefacts (`index-cli-common`, `index-config-probe`,
  `index-entry-script`, the sourced shell library,
  `Makefile`, `compose.override.yaml`,
  `contrib/dev-tools/su-exec/AUDIT.md`), the `jq_donor`
  Containerfile stage, `Info::from_env`, the
  `DEFAULT_CONFIG_TOML_PATH` re-export, and the
  `ENTRY_ENV_VARS` manifest contract; the `release` /
  `debug` runtime base split, the production-shaped
  `compose.yaml`, the `API_PORT` / `IMPORTER_API_PORT`
  demotion, the rewritten health-check and auth-keypair
  helpers, and the `USER_ID` guard change (flagged
  **BREAKING**); and the credential / `[mail.smtp]` /
  `[auth]`-paths removal from the shipped TOMLs together with
  the `Default` impls (`Settings`, `Tracker`, `Database`)
  and the two retired `src/bin/*` helpers.

README and crate docs
---------------------
- `README.md`: note that the documented "for deployment, you
  __should__ override the tracker token" guidance is now
  schema-mandatory (per §D2), not advisory.
- `src/lib.rs`: update the "Run with docker" example to supply
  the two now-mandatory `..._OVERRIDE_TRACKER__TOKEN` /
  `..._OVERRIDE_DATABASE__CONNECT_URL` env vars (and drop the
  `sqlite3 … VACUUM` pre-seed step, since the entry script's
  §D3 seed-on-empty path handles that). Cross-link the
  container guide and ADR-T-009; document the Compose split
  and the `make up-dev` / `make up-prod` wrappers. Refresh the
  Configuration section to mark `tracker.token` /
  `database.connect_url` as mandatory, point at the standalone
  `torrust-index-config` crate, and fix the env-var name typo
  (`TORRUST_INDEX_CONFIG_PATH` → `TORRUST_INDEX_CONFIG_TOML_PATH`).
@peer-cat peer-cat force-pushed the feat/rework_container_system branch from 37d86a9 to 8b1717a Compare April 22, 2026 15:12
The `runtime_release` stage targets `gcr.io/distroless/cc-debian13`,
which is usrmerged — `/bin` is a symlink to `/usr/bin`, not a real
directory. The previous `COPY --from=runtime_assets / /` blanket copy
failed under BuildKit with `cannot copy to non-directory` because the
donor stage ships `/bin/` as a real directory and BuildKit refuses to
overwrite the destination symlink with a directory.

Replace the recursive copy with explicit per-path copies and
materialise the curated busybox applet symlinks under `/usr/bin/`
(the canonical usrmerged location) instead of `/bin/`. `/bin/<applet>`
still resolves correctly via the base's `/bin → /usr/bin` symlink, so
`PATH` lookups and any hard-coded `/bin/sh` shebangs continue to work
unchanged. The `RUN` invocation is updated to call
`/usr/bin/busybox` directly so it does not depend on the symlinks it
is about to create.

No change to the curated applet set or to ADR-T-009 §D4.
Neither `camino` nor `serde_with` is referenced from the root crate
any longer — both were pulled in by code that has since migrated to
helper crates (which declare their own dependencies). Removing the
stale entries from the root `Cargo.toml` shrinks the dep closure and
lets `Cargo.lock` drop the corresponding transitive entries.

No functional change; `cargo check --workspace --all-targets
--all-features` is unaffected.
`configuration_should_have_a_test_constructor` was racing against
sibling tests that mutate `TORRUST_INDEX_CONFIG_OVERRIDE_*` env
vars. Without serialisation, those overrides leaked across the
parallel run and `Configuration::for_tests()` would occasionally
read an overridden tracker token instead of the `MyAccessToken`
value baked into `PLACEHOLDER_TOML`.

Wrap the body in `figment::Jail::expect_with`, which acquires
figment's global mutex and snapshots the process environment for
the duration of the closure. Switch the test from `#[tokio::test]`
to a plain `#[test]` and read the settings via `blocking_read`,
so the synchronous Jail closure does not call into an async
runtime.
The host-side `cargo test` invocations in the MySQL and SQLite e2e
runner scripts load the same config TOMLs the containers use, which
since ADR-T-009 §D2 intentionally omit `tracker.token` and
`database.connect_url`. Without overrides, `Configuration::load`
rejects them and the test process fails before reaching any e2e
assertion.

Export `TORRUST_INDEX_CONFIG_OVERRIDE_TRACKER__TOKEN` and
`TORRUST_INDEX_CONFIG_OVERRIDE_DATABASE__CONNECT_URL` alongside the
existing `TORRUST_INDEX_E2E_*` vars in all three invocations (MySQL,
SQLite public, SQLite private), reusing the connect URL the runner
already knows about and the `MyAccessToken` placeholder used
elsewhere in the test suite. Add a short comment at each call site
pointing operators at ADR-T-009 §D2 so the rationale survives the
next time someone wonders why the host wrapper sets vars the
container does not.
Commit 1971942 wrapped `configuration_should_have_a_test_constructor`
in a `figment::Jail` to stop sibling tests from leaking
`TORRUST_INDEX_CONFIG_OVERRIDE_*` into it, but the same leak path
is open the other way round: the e2e runner scripts (see
ADR-T-009 §D2) export `TORRUST_INDEX_CONFIG_OVERRIDE_DATABASE__CONNECT_URL`
and `TORRUST_INDEX_CONFIG_OVERRIDE_TRACKER__TOKEN` into the ambient
shell before invoking `cargo test`. Any unit test that loads the
default configuration and asserts on its contents then sees those
ambient overrides merged on top of the fixture and the equality
checks diverge.

Extract the scrub into a `clear_inherited_config_env` helper and
call it at the top of every `figment::Jail::expect_with` closure in
this module — the existing `for_tests()` test plus the three TOML
loader tests. The helper removes `TORRUST_INDEX_CONFIG_TOML`,
`TORRUST_INDEX_CONFIG_TOML_PATH`, and any
`TORRUST_INDEX_CONFIG_OVERRIDE_*` keys; it is only safe inside a
Jail closure because figment's global mutex serialises tests and
snapshots/restores the process environment around them, which the
SAFETY note on the `unsafe { remove_var }` block spells out for
the Edition 2024 reader.

No production code changes; the tests now pass deterministically
whether or not the surrounding shell has the e2e overrides set.
…ntract

`it_should_allow_admins_to_get_all_the_settings` compares the live
settings response with `env.server_settings_masking_secrets()`, which
the host loads from the same TOML the container starts from. The
container's entry script (ADR-T-009 §7) however augments that config
before the daemon comes up — when no PEM and no path is supplied for
`auth.{private,public}_key`, it defaults the key paths to
`/etc/torrust/index/auth/{private,public}.pem`, and the runtime DB
URL points at the in-container bind-mount target
(`/var/lib/torrust/index/database/...`) rather than the host path the
runner uses (`./storage/index/lib/database/...`).

The result is two fields that legitimately diverge between expected
and actual even on a green run, so the equality assertion fired on
environment plumbing rather than on real contract drift. Clear
`auth.private_key_path`, `auth.public_key_path`, and
`database.connect_url` on both sides before comparing, with a comment
explaining why each one is masked and pointing at ADR-T-009 §7. The
rest of the response — including the masked secrets the test was
originally written to cover — is still asserted byte-for-byte.
@peer-cat peer-cat marked this pull request as ready for review April 23, 2026 06:40
@peer-cat peer-cat requested a review from a team as a code owner April 23, 2026 06:40
@da2ce7 da2ce7 requested a review from Copilot April 23, 2026 06:44
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Implements ADR-T-009’s container-infrastructure refactor by extracting config parsing and container helper tooling into dedicated workspace crates, reshaping Compose into a production baseline + dev override, and updating tests/docs to reflect schema-mandatory credentials and the new entry-script/probe workflow. Also includes an auth correctness tweak in the permission extractor.

Changes:

  • Extracted configuration parsing into torrust-index-config (+ added config-probe, health-check, auth-keypair, cli-common, and entry-script test harness crates).
  • Reshaped container runtime/Compose workflow (baseline compose.yaml, compose.override.yaml, make up-dev/make up-prod, new CI lint guards).
  • Updated application/tests/docs for mandatory tracker.token + database.connect_url, TLS spelling fix (tsltls), and helper-binary JSON stdout contracts.

Reviewed changes

Copilot reviewed 91 out of 101 changed files in this pull request and generated 9 comments.

Show a summary per file
File Description
tests/environments/isolated.rs Switch test settings fixture to placeholder settings
tests/e2e/web/api/v1/contexts/settings/contract.rs Normalise env-specific fields for settings comparison
tests/e2e/config.rs Make bootstrap config test hermetic under figment::Jail
src/web/api/server/v1/extractors/require_permission.rs Auth/permission extractor error handling tweak
src/web/api/server/v1/contexts/settings/mod.rs Fix docs typo tsltls
src/web/api/server/mod.rs Rename TLS type + re-export DynError from config crate
src/web/api/mod.rs Thread TLS option through API start
src/tests/web/require_permission.rs Use Configuration::for_tests()
src/tests/services/settings.rs Use Configuration::for_tests()
src/tests/jwt.rs Use Configuration::for_tests()
src/tests/config/mod.rs Stabilise config tests vs inherited env overrides
src/tests/bootstrap/config.rs Inject mandatory secrets via env overrides in tests
src/services/authorization.rs Re-export permission value types from config crate
src/lib.rs Update container/config docs for mandatory secrets + new guides
src/jwt.rs Update helper-binary naming/contract docs
src/config/mod.rs Re-export config parsing surface + add Configuration::for_tests()
src/bootstrap/config.rs Re-export default config path from config crate
src/bin/health_check.rs Remove legacy health-check binary (moved to package)
src/bin/generate_auth_keypair.rs Remove legacy keypair generator (moved to package)
src/app.rs Rename net tsltls usage
share/default/config/tracker.public.e2e.container.sqlite3.toml Remove shipped token secret
share/default/config/tracker.private.e2e.container.sqlite3.toml Remove shipped token secret
share/default/config/index.public.e2e.container.toml Remove creds/paths; keep runtime-driven fields
share/default/config/index.public.e2e.container.sqlite3.toml Remove redundant sample file
share/default/config/index.private.e2e.container.sqlite3.toml Remove creds/paths from sample
share/default/config/index.development.sqlite3.toml Switch to [net.tls] + remove placeholder secrets
share/default/config/index.container.toml Remove creds/paths from sample
share/default/config/index.container.sqlite3.toml Remove redundant sample file
share/container/entry_script_lib_sh Add POSIX shell helper library for entry script
packages/index-health-check/tests/health_check.rs Health-check integration tests
packages/index-health-check/src/tests/mod.rs Health-check unit tests
packages/index-health-check/src/lib.rs Minimal TCP-based health check implementation
packages/index-health-check/src/bin/torrust-index-health-check.rs Health-check JSON/TTY contract binary
packages/index-health-check/Cargo.toml New crate manifest
packages/index-entry-script/tests/validate_auth_keys.rs Host-side tests for shell validate_auth_keys
packages/index-entry-script/tests/seed_sqlite.rs Host-side tests for shell seed_sqlite
packages/index-entry-script/src/lib.rs Test harness to execute sourced shell helpers
packages/index-entry-script/Cargo.toml New crate manifest
packages/index-config/tests/shipped_samples.rs Ensure shipped samples are TOML-valid but secret-free
packages/index-config/tests/round_trip.rs Round-trip tests via TOML/JSON public API
packages/index-config/tests/permission_overrides.rs Permission override parsing tests
packages/index-config/src/validator.rs New validator trait + error type
packages/index-config/src/v2/website.rs Website schema module extraction
packages/index-config/src/v2/tracker_statistics_importer.rs Importer schema module extraction
packages/index-config/src/v2/tracker.rs Make tracker token schema-mandatory + add ApiToken::is_empty()
packages/index-config/src/v2/registration.rs Registration/email schema module extraction
packages/index-config/src/v2/permissions.rs Wire permission override types within config crate
packages/index-config/src/v2/net.rs Rename network TLS config (tsltls)
packages/index-config/src/v2/mod.rs Make tracker/database sections schema-mandatory
packages/index-config/src/v2/mail.rs Mail/SMTP schema module extraction
packages/index-config/src/v2/logging.rs Logging schema module extraction
packages/index-config/src/v2/image_cache.rs Image-cache schema module extraction
packages/index-config/src/v2/database.rs Make database connect_url schema-mandatory
packages/index-config/src/v2/auth.rs Auth/JWT schema module extraction + key resolution helpers
packages/index-config/src/v2/api.rs API schema module extraction
packages/index-config/src/tests/redaction.rs Secret redaction tests
packages/index-config/src/tests/quirks.rs Edge-case config tests
packages/index-config/src/tests/permissions.rs Permission types round-trip tests
packages/index-config/src/tests/mod.rs Config crate test suite organisation
packages/index-config/src/tests/metadata.rs Metadata/version tests
packages/index-config/src/tests/loader.rs Loader error/mandatory-field tests
packages/index-config/src/test_helpers.rs Shared placeholder TOML/settings fixtures
packages/index-config/src/permissions.rs Permission value types moved into config crate
packages/index-config/src/lib.rs Config loader + shared constants/types
packages/index-config/Cargo.toml New crate manifest
packages/index-config-probe/tests/binary.rs Probe contract tests (no subprocess)
packages/index-config-probe/src/tests/mod.rs Probe unit tests
packages/index-config-probe/src/lib.rs Probe library (DB/auth subset)
packages/index-config-probe/src/bin/torrust-index-config-probe.rs Probe binary with exit-code contract
packages/index-config-probe/Cargo.toml New crate manifest
packages/index-cli-common/src/lib.rs Shared CLI JSON/TTY/tracing helpers
packages/index-cli-common/Cargo.toml New crate manifest
packages/index-auth-keypair/tests/keypair_generation.rs Keypair integration tests
packages/index-auth-keypair/src/tests/mod.rs Keypair unit tests
packages/index-auth-keypair/src/lib.rs Keypair generation library
packages/index-auth-keypair/src/bin/torrust-index-auth-keypair.rs Keypair binary with JSON stdout
packages/index-auth-keypair/Cargo.toml New crate manifest
contrib/dev-tools/su-exec/AUDIT.md Vendored su-exec provenance + SHA-anchored audit log
contrib/dev-tools/container/run.sh Remove stale container run script
contrib/dev-tools/container/e2e/sqlite/run-e2e-tests.sh Update e2e runner env overrides for mandatory secrets
contrib/dev-tools/container/e2e/sqlite/mode/public/e2e-env-up.sh Update e2e env-up config + inject connect_url override
contrib/dev-tools/container/e2e/sqlite/mode/public/e2e-env-down.sh Update e2e env-down config path
contrib/dev-tools/container/e2e/sqlite/mode/private/e2e-env-up.sh Inject connect_url override for private-mode e2e
contrib/dev-tools/container/e2e/mysql/run-e2e-tests.sh Update mysql e2e runner config + mandatory overrides
contrib/dev-tools/container/e2e/mysql/e2e-env-up.sh Update mysql env-up config + inject connect_url override
contrib/dev-tools/container/e2e/mysql/e2e-env-down.sh Update mysql env-down config path
contrib/dev-tools/container/build.sh Remove stale container build script
compose.yaml Production-shaped baseline compose (with env/port hardening)
compose.override.yaml Dev sandbox override (mailcatcher/tty/default creds)
README.md Update container run docs + compose workflow
Makefile Add up-dev / up-prod entrypoints + env validation
Cargo.toml Expand workspace members + add config crate dependency
Cargo.lock Lockfile updates for new workspace crates
CHANGELOG.md Document ADR-T-009 additions/breaking changes
AGENTS.md Add contributor guidance updates
.github/workflows/container.yaml Add container-infra lint guards + job dependency
.containerignore Exclude docs/adr from container build context

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread packages/index-config/src/lib.rs Outdated
Comment thread packages/index-health-check/src/bin/torrust-index-health-check.rs Outdated
Comment thread packages/index-auth-keypair/src/bin/torrust-index-auth-keypair.rs Outdated
Comment thread packages/index-entry-script/src/lib.rs Outdated
Comment thread AGENTS.md Outdated
Comment thread packages/index-config/src/lib.rs Outdated
Comment thread compose.yaml Outdated
Comment thread packages/index-health-check/src/lib.rs Outdated
Comment thread src/web/api/server/v1/extractors/require_permission.rs Outdated
A grab-bag of small but security- and operability-relevant fixes
uncovered while reworking the container system.

* `compose.yaml`: drop the inline `MyAccessToken` default and the
  inline SQLite `connect_url` default from the production compose
  file. Both are credentials/connection strings and now use bare
  `${VAR}` form per ADR-T-009 §8.1; the dev sandbox keeps its
  defaults in `compose.override.yaml`.

* `index-config`: stop printing config-loading messages to stdout.
  The TOML body may contain DB connect URLs, API tokens, or SMTP
  passwords — log only the env-var name through `tracing` (stderr)
  so we don't pollute the JSON-only stdout contract used by helper
  binaries (P9). Only the env-var *name* is ever emitted, never
  its value.

* `index-config`: fix the `[net.tls]` defaulting quirk where an
  empty table synthesised `Some("")` paths and later failed at TLS
  load time with a misleading "no such file" error. Missing fields
  now default to `None`, matching the empty-string behaviour. New
  regression test in `quirks.rs`.

* `index-health-check`: resolve `host:port` ourselves and use
  `TcpStream::connect_timeout` so the probe fails fast instead of
  inheriting the OS's multi-tens-of-seconds SYN-retransmit
  schedule — important for orchestrator health checks. Stop
  swallowing `set_{read,write}_timeout` errors silently. Return
  `ExitCode::FAILURE` on stdout-write failure (e.g. broken pipe)
  instead of panicking.

* `index-auth-keypair`: same broken-pipe fix — honour the helper's
  documented exit-code contract on stdout write failure rather
  than panicking out of `unwrap()`.

* `index-entry-script`: correct the `run_sh_with_args` doc-comment.
  Arguments are forwarded as `argv` entries to `sh -c` (with a
  `"sh"` placeholder for `$0`), not spliced into the script text;
  no quoting is performed or required.

* `require_permission` extractor: stop collapsing every database
  error into a 404 `UserNotFound`. Only the genuine
  `Error::UserNotFound` becomes a 404; DB/IO failures now surface
  as `DatabaseError` (500) and are logged, so operator-visible
  outages aren't masked as missing users.

* `AGENTS.md`: rewrite the misleading "POSIX paths must be
  null-terminated" guidance. Paths are opaque byte sequences;
  prefer `OsStr`/`PathBuf` (or `Utf8Path` when UTF-8 really is a
  precondition); NUL termination is only relevant at the
  libc/FFI boundary.
… to IPv4

The container HEALTHCHECK probes `http://localhost:<port>/health_check`,
but glibc resolves `localhost` to `::1` first and `127.0.0.1` second.
The index API binds to `0.0.0.0` (IPv4 only) and the tracker statistics
importer to `127.0.0.1`, so the probe was getting `Connection refused`
on `::1` and never trying the listening IPv4 socket — the container
reported as unhealthy despite both services being up.

Replace the single-address `TcpStream::connect_timeout` with a small
RFC 8305 ("Happy Eyeballs v2") implementation:

  * Resolve every address, partition by family, then interleave
    v6/v4/v6/v4 so an IPv6 address is always attempted first.
  * Spawn one connect thread per address, staggered by 250 ms
    (the spec's default "Connection Attempt Delay"), each bounded
    by the remaining overall timeout budget.
  * First successful `TcpStream` wins; losers are abandoned and
    reaped by the OS when the CLI exits.

Single-address inputs keep the original fast path (no threads, no
scheduling). IPv6 literals like `http://[::1]:port/` continue to work
end-to-end through the URL parser and `to_socket_addrs`.

Crate tests cover the regression directly:

  * `localhost_falls_back_to_ipv4` — IPv4-only listener succeeds via
    a `localhost` URL that resolves to both families.
  * `ipv6_literal_url_is_supported` — bracketed-literal URL
    round-trips through parser + resolver.
  * `prefers_ipv6_when_both_listen` — dual-stack race confirms the
    IPv6 attempt fires first and wins.

Each test self-skips with a logged reason if the host's resolver or
kernel can't satisfy the precondition (e.g. IPv6-disabled CI runner,
`bindv6only=0` claiming the v4 wildcard), so the suite stays green in
restricted environments without giving a false pass.
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Implements ADR-T-009’s container-infrastructure refactor by extracting configuration + helper tooling into dedicated workspace crates, reshaping the container/compose story for production vs dev, and updating tests/docs to match the new “mandatory credentials at schema level” configuration contract (plus a small auth correctness fix around “user not found” vs DB errors).

Changes:

  • Extracts config parsing into torrust-index-config (and adds helper crates like config-probe, health-check, auth-keypair, entry-script test harness).
  • Updates configuration schema/fixtures so tracker.token and database.connect_url are mandatory (no shipped defaults), and adjusts tests/docs/compose accordingly.
  • Introduces production-shaped compose.yaml + dev-only compose.override.yaml, plus CI lints and a Makefile wrapper for validated prod bring-up.

Reviewed changes

Copilot reviewed 91 out of 101 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
tests/environments/isolated.rs Switches test fixture from Settings::default() to shared placeholder settings fixture.
tests/e2e/web/api/v1/contexts/settings/contract.rs Normalizes environment-dependent settings fields before asserting equality.
tests/e2e/config.rs Makes bootstrap config test hermetic under figment::Jail and injects mandatory overrides.
src/web/api/server/v1/extractors/require_permission.rs Improves error-category preservation/logging when loading user for permission checks fails.
src/web/api/server/v1/contexts/settings/mod.rs Fixes docs typo (tsltls) in settings JSON example.
src/web/api/server/mod.rs Renames TSL→TLS, re-exports DynError from extracted config crate, updates comments.
src/web/api/mod.rs Propagates TSL→TLS rename through API start wiring.
src/tests/web/require_permission.rs Uses Configuration::for_tests() in tests after removal of default settings.
src/tests/services/settings.rs Uses Configuration::for_tests() for settings service tests.
src/tests/jwt.rs Uses Configuration::for_tests() for JWT tests.
src/tests/config/mod.rs Adds env-scrubbing helper + updates tests to use placeholder-derived defaults under figment::Jail.
src/tests/bootstrap/config.rs Injects mandatory tracker/database env overrides for bootstrap config test.
src/services/authorization.rs Moves permission value types to config crate and re-exports them here for compatibility.
src/lib.rs Updates public docs for new mandatory config fields + compose split guidance.
src/jwt.rs Updates docs to reference new torrust-index-auth-keypair helper binary and JSON output.
src/config/mod.rs Re-exports config crate parsing surface and adds Configuration runtime wrapper + for_tests().
src/bootstrap/config.rs Re-exports default config TOML path from config crate for single source of truth.
src/bin/health_check.rs Removes old health check binary (replaced by workspace crate).
src/bin/generate_auth_keypair.rs Removes old keypair generator binary (replaced by workspace crate).
src/app.rs Wires net.tls (renamed from net.tsl) through application startup.
share/default/config/tracker.public.e2e.container.sqlite3.toml Removes shipped tracker token from e2e tracker sample (now operator-supplied).
share/default/config/tracker.private.e2e.container.sqlite3.toml Removes shipped tracker token from e2e tracker sample (now operator-supplied).
share/default/config/index.public.e2e.container.toml Removes credentials/auth-path defaults from shipped sample; keeps structural settings.
share/default/config/index.public.e2e.container.sqlite3.toml Removes old sqlite3 container sample file.
share/default/config/index.private.e2e.container.sqlite3.toml Removes credentials/auth-path defaults from shipped sample.
share/default/config/index.development.sqlite3.toml Updates sample to [net.tls] and removes credential defaults.
share/default/config/index.container.toml Removes credentials/auth-path defaults from shipped container sample.
share/default/config/index.container.sqlite3.toml Removes old sqlite3 container sample file.
share/container/entry_script_lib_sh Adds POSIX-sh helper library for entry script invariants + sqlite seeding.
packages/index-health-check/tests/health_check.rs Adds integration tests for health-check behavior and JSON output fields.
packages/index-health-check/src/tests/mod.rs Adds unit tests covering HTTP parsing, timeouts, IPv4/IPv6 behavior.
packages/index-health-check/src/lib.rs Introduces minimal stdlib-only health check with Happy Eyeballs connect.
packages/index-health-check/src/bin/torrust-index-health-check.rs Adds helper binary wiring (TTY refusal, JSON tracing, JSON stdout).
packages/index-health-check/Cargo.toml Defines new health-check crate and dependencies.
packages/index-entry-script/tests/validate_auth_keys.rs Tests shell helper validate_auth_keys contract host-side via sh.
packages/index-entry-script/tests/seed_sqlite.rs Tests shell helper seed_sqlite contract host-side via sh.
packages/index-entry-script/src/lib.rs Provides harness to embed and source shell lib for host-side tests.
packages/index-entry-script/Cargo.toml Defines entry-script test harness crate.
packages/index-config/tests/shipped_samples.rs Ensures shipped index.*.toml samples are valid TOML but omit credentials and fail schema until overridden.
packages/index-config/tests/round_trip.rs Adds round-trip tests through TOML/JSON APIs using public surface.
packages/index-config/tests/permission_overrides.rs Validates [[permissions.overrides]] parsing, ordering, and error cases.
packages/index-config/src/validator.rs Adds validator trait + config validation error type (with message).
packages/index-config/src/v2/website.rs Adds website configuration types and defaults in extracted config crate.
packages/index-config/src/v2/tracker_statistics_importer.rs Adds tracker statistics importer settings with defaults.
packages/index-config/src/v2/tracker.rs Makes token mandatory, removes defaults, and adds ApiToken::is_empty() helper.
packages/index-config/src/v2/registration.rs Adds registration/email config types with defaults.
packages/index-config/src/v2/permissions.rs Adjusts permission override import to config crate location.
packages/index-config/src/v2/net.rs Renames tsltls and uses extracted Tls type.
packages/index-config/src/v2/mod.rs Makes tracker/database sections mandatory (no schema-level default); removes Settings::default().
packages/index-config/src/v2/mail.rs Adds mail/SMTP config types and defaults.
packages/index-config/src/v2/logging.rs Adds logging config + Threshold enum.
packages/index-config/src/v2/image_cache.rs Adds image cache config with defaults.
packages/index-config/src/v2/database.rs Makes connect_url mandatory and removes defaults.
packages/index-config/src/v2/auth.rs Adds auth config with PEM/path resolution helpers and defaults.
packages/index-config/src/v2/api.rs Adds API paging config with defaults.
packages/index-config/src/tests/redaction.rs Adds tests for secret redaction behavior in settings.
packages/index-config/src/tests/quirks.rs Adds edge-case tests (TLS empty strings, IPv6, unicode token, validator behavior).
packages/index-config/src/tests/permissions.rs Adds tests for permission value types round-tripping and invariants.
packages/index-config/src/tests/mod.rs Adds test suite structure and shared test helper exports.
packages/index-config/src/tests/metadata.rs Adds tests for metadata/version/app/purpose serialization and defaults.
packages/index-config/src/tests/loader.rs Adds tests covering loader success + failure modes and error mapping.
packages/index-config/src/test_helpers.rs Adds canonical placeholder TOML + placeholder settings fixture for cross-crate tests.
packages/index-config/src/permissions.rs Moves permission value types into config crate for schema reuse + re-export.
packages/index-config/src/lib.rs Introduces extracted config crate API (types, loader, constants, TLS type, Info/Error).
packages/index-config/Cargo.toml Defines new config crate and dependency set.
packages/index-config-probe/tests/binary.rs Adds contract tests for probe exit codes and JSON emission without spawning the binary.
packages/index-config-probe/src/tests/mod.rs Adds unit tests for probe behavior (sqlite path extraction, auth key probing, scheme gates).
packages/index-config-probe/src/lib.rs Adds config-probe library emitting container-relevant JSON from loaded settings.
packages/index-config-probe/src/bin/torrust-index-config-probe.rs Adds config-probe helper binary with exit-code contract and panic hook mapping.
packages/index-config-probe/Cargo.toml Defines new config-probe crate and dependency set.
packages/index-cli-common/src/lib.rs Adds shared helper-binary scaffolding (TTY refusal, JSON tracing, JSON emit, --debug).
packages/index-cli-common/Cargo.toml Defines CLI-common crate.
packages/index-auth-keypair/tests/keypair_generation.rs Adds integration tests for JSON output + PEM validity and uniqueness.
packages/index-auth-keypair/src/tests/mod.rs Adds unit tests for keypair generation outputs.
packages/index-auth-keypair/src/lib.rs Adds RSA-2048 keypair generator library returning JSON-serializable output.
packages/index-auth-keypair/src/bin/torrust-index-auth-keypair.rs Adds helper binary for keypair generation using shared CLI scaffolding.
packages/index-auth-keypair/Cargo.toml Defines auth-keypair crate and dependencies.
contrib/dev-tools/su-exec/AUDIT.md Adds provenance + append-only SHA-256 audit log for vendored su-exec.c.
contrib/dev-tools/container/run.sh Removes stale container run helper script.
contrib/dev-tools/container/e2e/sqlite/run-e2e-tests.sh Updates e2e runner to supply mandatory overrides and updated TOML paths.
contrib/dev-tools/container/e2e/sqlite/mode/public/e2e-env-up.sh Updates compose env-up script for new TOML + mandatory DB connect URL override.
contrib/dev-tools/container/e2e/sqlite/mode/public/e2e-env-down.sh Updates compose env-down script for renamed TOML.
contrib/dev-tools/container/e2e/sqlite/mode/private/e2e-env-up.sh Adds mandatory DB connect URL override for private mode.
contrib/dev-tools/container/e2e/mysql/run-e2e-tests.sh Updates mysql e2e runner to supply mandatory overrides and updated TOML path.
contrib/dev-tools/container/e2e/mysql/e2e-env-up.sh Updates compose env-up script for new TOML + mandatory DB connect URL override.
contrib/dev-tools/container/e2e/mysql/e2e-env-down.sh Updates compose env-down script for renamed TOML.
contrib/dev-tools/container/build.sh Removes stale container build helper script.
compose.yaml Refactors compose baseline for production posture (no dev sidecars/default creds, localhost-bound ports).
compose.override.yaml Adds dev-only overrides (mailcatcher, tty, permissive credential defaults).
README.md Updates container run instructions and documents compose split workflow.
Makefile Adds up-dev / up-prod targets with env validation for production bring-up.
Cargo.toml Expands workspace members and adds dependency on extracted config crate.
Cargo.lock Updates lockfile for new workspace crates and dependency graph.
CHANGELOG.md Documents ADR-T-009 changes, breaking config requirements, and extracted helper crates.
AGENTS.md Adds commit-message guidance, POSIX-path guidance, and updates atomic-write instructions.
.github/workflows/container.yaml Adds container-infra lint job (compose baseline audit, su-exec hash guard, env-var docs guard).
.containerignore Excludes adr/ and docs/ from container build context.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +96 to +100
// The leading `set -eu` mirrors the discipline the entry
// script itself runs under. `. "$lib"` sources the
// helpers; the user-supplied snippet runs after.
let script = format!(". \"{lib}\"\n{snippet}");

Comment thread src/config/mod.rs
Comment on lines +75 to +87
#[cfg(test)]
impl Configuration {
/// Build a `Configuration` from the shared placeholder TOML for
/// use in tests. Replaces the previous `Configuration::default()`
/// — `Settings` no longer carries an `impl Default` (ADR-T-009 §D2).
///
/// The TOML literal lives in
/// [`torrust_index_config::test_helpers::PLACEHOLDER_TOML`] so
/// every crate-boundary test fixture stays in sync.
#[must_use]
pub(crate) fn for_tests() -> Self {
Self::load(&Info::from_toml(torrust_index_config::test_helpers::PLACEHOLDER_TOML)).expect("placeholder TOML must load")
}
Comment thread AGENTS.md

## Commit Messages

When writing a commit message, be sure to review the last few commit message to compare the style.
@da2ce7 da2ce7 added the Needs Rebase Base Branch has Incompatibilities label Apr 23, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Needs Rebase Base Branch has Incompatibilities

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants