Commit a26ee1d

security: JWT issuer validation, JWKS rate limiting, audience check (#472) (#519)
* security: JWT issuer validation, JWKS rate limiting, audience check (#472)

  Implement three security hardening items from issue #472:

  SEC-003: Validate JWT iss claim before JWKS routing - must be HTTPS URL with non-empty host, max 512 chars. Rejects http/file/javascript schemes.

  SEC-004: Per-issuer JWKS fetch rate limiting with 10s cooldown to prevent DoS amplification via tokens with valid issuers but unknown kid values.

  SEC-005: When IdentityProvider clientID is configured, validate JWT aud claim against it to prevent cross-service token confusion.

  Items already addressed in prior PRs (confirmed, no changes needed):
  - SEC-006: X-Request-ID sanitization (isValidRequestID)
  - SEC-008: Build info endpoint (GetPublicBuildInfo)
  - SEC-015: hardenedIDPHints defaults to true
  - SEC-007: SAR webhook auth documented
  - SEC-017: Token storage documented

  Closes #472

* fix: make audience validation configurable via expectedAudience field

* fix: address Copilot review findings on JWT security hardening
  - Add singleflight deduplication for concurrent JWKS fetches (SEC-004)
  - Move jwksFetchLimiter.Store to after successful JWKS fetch
  - Clean up rate limiter entries on LRU cache eviction
  - Use bounded metric labels (resolved IDP name) instead of raw issuer
  - Tighten isValidIssuer to reject query/fragment/userinfo components
  - Apply audience validation in tryExtractUserIdentity
  - Differentiate CHANGELOG SEC-005 entries (SEC-005a/SEC-005b)
  - Add multi-IDP audience validation tests
  - Expand rate limit test coverage (cooldown expiry)

* fix: address review comments on security hardening PR
  - Canonicalize JWT issuer (trim trailing slashes) before cache/limiter key lookup to prevent bypass via slash variants
  - Rename Prometheus label from 'issuer' to 'identity_provider' on JWT validation metrics to match actual emitted values (IDP names) and prevent unbounded cardinality from attacker-controlled issuers
  - Replace NUL bytes in long issuer test with repeated 'a' characters for portability and clarity
  - Update metrics.md to reflect label rename

* security: validate JWKS URI from OIDC discovery endpoint

  Defense-in-depth: validate the jwks_uri returned by OIDC discovery before using it, preventing SSRF if a configured IDP is compromised and returns a malicious URI (e.g., file://, http://, internal hosts).

* fix: record metrics with resolved IDP name, canonicalize issuer in tryExtractUserIdentity, refresh expectedAudience on cache hit

* fix: TTL-based audience refresh on JWKS cache hit, consistent 'unknown' label for early failures, expanded JWT error classification

* fix: address review comments on SEC-003/SEC-004/SEC-005
  - Fix data race: read entry.expectedAudience and audienceRefreshedAt under jwksMutex before releasing the lock
  - Normalize selectedIDP to 'unknown' for all JWT validation metrics (failure, success, duration) for consistent bounded cardinality
  - Rename isValidIssuer to isValidHTTPSURL since it validates both JWT issuer and OIDC discovery jwks_uri URLs
  - Update JWKS cache hit/miss metric labels from 'issuer' to 'identity_provider' to match actual values (IDP names) and align with other JWT metrics
  - Add clarifying comment on rate limiter scope (keyfunc background refreshes are bounded by RefreshInterval)

* fix: address review comments - split JWKS/issuer URL validators, add request counters to early returns, rename test, update CHANGELOG

* docs: add metrics label rename to upgrade guide

* fix: cache IDP name to avoid double lookup, TrimSuffix for single slash, lowercase rate-limit error with retry hint

* fix: count JWKS-failure validation attempts, trim all trailing slashes, correct metrics docs

* fix: snapshot JWKS cache fields under lock, capture elapsed time once in rate-limiter

* fix: validate ExpectedAudience against whitespace, fix wasCacheHit race, align metric test labels
  - Add Pattern=^\S+$ validation to ExpectedAudience field to reject whitespace-only values, consistent with ClientID validation
  - Return cacheHit from getJWKSForIssuer() instead of pre-checking cache state, eliminating race between check and actual fetch
  - Rename test variable from 'issuer' to 'idpName' in TestJWTMetrics to match the 'identity_provider' label semantics

* fix: limit OIDC discovery response body size, use structured logging
  - Add io.LimitReader (1 MiB) to OIDC discovery response decoding to prevent OOM from malicious OIDC providers returning oversized payload
  - Replace Warnf with Warnw for structured logging consistency

* fix: address review feedback for SEC hardening
  - Normalize trailing slashes in LoadIdentityProviderByIssuer primary match path (consistent with auth layer canonicalization)
  - Enforce JWKS URI host matches authority host to prevent SSRF via compromised OIDC discovery endpoints
  - Wrap audience refresh in singleflight to prevent thundering herd when many requests arrive after refresh interval elapses
  - Add unit tests for jwksHostMatchesAuthority, trailing slash normalization, and authority fallback normalization
  - Update CHANGELOG with new security entries

* fix: address second round of review feedback
  - Rename isValidHTTPSURL to isValidIssuerURL for consistency with SEC-003 terminology
  - Update rate limiter comment to accurately describe initial JWKS client creation scope
  - Document issuer URL constraint on query strings, fragments, and userinfo in security docs
  - Correct SEC-004 docs: cooldown applies to initial load only; keyfunc manages refresh internally
  - Short-circuit empty issuer in optional auth path to prevent unnecessary loader calls and log noise

* fix: retry session creation on escalation-not-found during informer sync

  E2E tests create escalations via direct K8s client then immediately call the REST API to create sessions. The API uses an informer cache to look up escalations, which may not have synced yet, causing transient 403 'no escalation found for requested group' errors.

  Add retry logic in CreateSession and CreateDebugSession to handle this transient condition with a 2s backoff between attempts (up to 3 retries). Only the specific 'no escalation found' 403 is retried — other 403 errors (identity mismatch, unauthorized group) are still immediate failures.

* fix(e2e): add MailHog port-forward to Single-Cluster E2E setup step

  The notification e2e tests fail with 'connection refused' on port 8025 because the MailHog port-forward started by kind-setup-single.sh dies before the tests run. Add an explicit MailHog port-forward restart in the 'Setup port-forwards for E2E tests' step alongside Kafka and audit webhook receiver.

* ci: increase multi-cluster E2E timeout to 45m

  The multi-cluster E2E test suite runs 280 tests including two OIDC offline-token tests that each take ~400s. Total runtime reaches ~30m, which is exactly the current timeout limit. The retry logic added for informer cache flakes adds a few seconds of backoff that can push the suite past the 30m deadline. Increase from 30m to 45m to provide adequate headroom. The job-level timeout-minutes remains at 60.

* fix: address review comments — port validation, AddToScheme errors, retry context, CHANGELOG
  - jwksHostMatchesAuthority: validate hostname AND port (default 443 for HTTPS) to prevent SSRF via same-host-different-port redirects
  - identity_provider_loader_test.go: require.NoError for all AddToScheme calls
  - e2e/helpers/api.go: add sleepOrCancel for context-aware retries, preventing stale retries after test timeout/cancellation
  - e2e/helpers/api.go: isEscalationNotFound now checks status=403
  - e2e/helpers/api.go: isTemplateNotFound now requires 'template' substring
  - auth_test.go: rename 'internal IP rejected' → 'different host rejected (internal IP)', add port-validation test cases
  - CHANGELOG.md: merge duplicate ### Changed sections, clarify SEC-004 scope (initial JWKS client creation only, not keyfunc/v3 internal refreshes), update JWKS URI validation to mention origin (hostname+port)

* fix: address gosec findings — decompression limits, HTTP context, ReadHeaderTimeout
  - G110: Add io.LimitReader (500MB) for tar/zip extraction to prevent decompression bombs
  - G107: Replace http.Get with http.NewRequestWithContext for all HTTP requests in bgctl update
  - G112: Add ReadHeaderTimeout to OIDC callback HTTP server to prevent slowloris attacks
  - G204: Annotate intentional browser-open subprocess calls
  - G302: Annotate intentional 0755 permissions for executable binaries
  - G117: Annotate intentional ClientSecret inclusion in internal API transport

* fix: merge duplicate CHANGELOG Fixed sections, add HTTP server timeouts
  - Merge duplicate '### Fixed' sections in Unreleased CHANGELOG into one
  - Add ReadTimeout, WriteTimeout, IdleTimeout to OIDC callback HTTP server to fully mitigate slowloris-style attacks (complements ReadHeaderTimeout)

* fix: detect oversized binaries during archive extraction

  Replace direct io.Copy+io.LimitReader calls with limitedCopy helper that returns an error when the source data exceeds the size limit, preventing silently truncated binaries from being installed. Addresses copilot review comments on extractTarGz and extractZipEntry.

* fix review findings for auth and bgctl update
* harden bgctl update network and extraction paths
* sanitize identity provider JSON and align cluster-scoped tests
* back off repeated audience refresh failures
* fix latest copilot review findings
* validate IDP authority URLs before discovery
* harden config JSON sanitization and release tag URL handling
* use stable single-idp JWT metric label
* split metadata and download timeouts for bgctl update
1 parent 6d31d40 commit a26ee1d

21 files changed

+1430
-157
lines changed
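
The SEC-003 rules in the commit message (HTTPS only, non-empty host, at most 512 characters, no query, fragment, or userinfo, trailing slashes trimmed before key lookup) can be sketched with the standard library alone. This is an illustration, not the code from this commit: `validIssuerURL` and `canonicalIssuer` are hypothetical names standing in for the project's `isValidIssuerURL` and its canonicalization step, whose bodies are not shown on this page.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// maxIssuerLen mirrors the 512-character limit named in the commit message.
const maxIssuerLen = 512

// canonicalIssuer trims all trailing slashes so that slash variants of the
// same issuer resolve to a single cache/limiter key. (Hypothetical helper.)
func canonicalIssuer(iss string) string {
	return strings.TrimRight(iss, "/")
}

// validIssuerURL reports whether iss looks like an acceptable issuer under
// the SEC-003 rules. (Hypothetical stand-in for isValidIssuerURL.)
func validIssuerURL(iss string) bool {
	if iss == "" || len(iss) > maxIssuerLen {
		return false
	}
	// Query strings and fragments are rejected outright.
	if strings.ContainsAny(iss, "?#") {
		return false
	}
	u, err := url.Parse(iss)
	if err != nil {
		return false
	}
	// Only HTTPS with a non-empty host; this excludes http, file, javascript.
	if u.Scheme != "https" || u.Hostname() == "" {
		return false
	}
	// Userinfo (user:pass@host) is rejected.
	return u.User == nil
}

func main() {
	for _, iss := range []string{
		"https://idp.example.com/realms/k8s/",
		"http://idp.example.com/realms/k8s",
		"https://idp.example.com/realms/k8s?x=1",
	} {
		c := canonicalIssuer(iss)
		fmt.Printf("%s valid=%v\n", c, validIssuerURL(c))
	}
}
```

Canonicalizing before validation means an input made only of slashes collapses to the empty string and is rejected, which matches the changelog note about issuer inputs that normalize to empty values.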

CHANGELOG.md

Lines changed: 21 additions & 19 deletions
@@ -11,29 +11,18 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 - **Frontend: upgrade Vite 7 → 8 and @vitejs/plugin-legacy 7 → 8** (PR #562): Major version bumps — Vite 8 replaces Rollup with Rolldown and removes esbuild in favor of Oxc. `@vitejs/plugin-vue` patch bumped to 6.0.5. No breaking changes to the frontend build configuration.
 
-### Fixed
-
-- **E2E: retry on informer cache lag during session creation**: `CreateSession` and `CreateDebugSession` API helpers now retry on transient 403 ("no escalation found", "user not authorized for requested group") and 400 ("template not found") errors caused by informer cache propagation delay after creating escalation or template resources
-- **E2E: fix CreateSessionAndWaitForPending race condition**: Always poll the API to confirm the controller has reconciled the session status, rather than trusting the create response which may not reflect the persisted status subresource
-- **CI: increase Single-Cluster E2E job timeout from 45 to 60 minutes**: API E2E tests alone take ~30 minutes; the 45-minute job timeout was consistently hit before CLI E2E tests could complete
-- **CI: increase Multi-Cluster E2E test timeout from 30m to 45m**: The 280-test suite occasionally exceeded the 30-minute Go test timeout
-- **E2E: port-forward keepalive for CI stability**: All long-lived E2E port-forwards used during test execution (Keycloak, API, MailHog, Metrics, Kafka, audit webhook receiver) now use `while true` restart loops to auto-recover from idle timeouts, pod restarts, and network drops. Fixes flaky Single-Cluster E2E, OIDC E2E, and UI E2E tests caused by Keycloak port-forward dying mid-run
-- **E2E: added missing MailHog port-forward in CI workflow**: The CI workflow's "Setup port-forwards for E2E tests" step killed all port-forwards from `kind-setup-single.sh` but did not restart the MailHog port-forward, causing all notification e2e tests to fail with `connection refused` on port 8025
-- **E2E: increased OIDC token retry window**: Bumped token request retries from 5 to 8 attempts with capped 10s backoff (~60s total window) to tolerate port-forward reconnection delays
-- **E2E: increased GetToken retry window to 12 attempts**: Extended from 8 to 12 attempts (~120s total window) to tolerate Keycloak pod recovery during extended outages in CI
-- **E2E: added retry with exponential backoff to offline token requests**: `ObtainOfflineRefreshTokenWithRetry` now uses exponential backoff (matching `GetToken`) and callers increased from 3 to 8 attempts to survive port-forward restarts
-- **E2E: added retry to RequireKeycloakReachable pre-check**: The Keycloak reachability pre-check now retries 5 times with backoff instead of failing immediately on first port-forward drop
-- **E2E: added retry to ObtainClientCredentialsToken**: Client credentials token requests now retry 8 times with exponential backoff to tolerate Keycloak transient unavailability
-- **CI: increase API E2E and OIDC E2E Go test timeouts from 30m to 45m**: Keycloak token retry overhead during port-forward drops can consume 90+ seconds per incident, causing test suite timeouts at 30 minutes
-
-### Changed
-
 - **Webhook SAR metrics: removed high-cardinality `group` label** ([#527](https://github.com/telekom/k8s-breakglass/issues/527)): Removed unbounded `group` label from `breakglass_webhook_session_sar_{allowed,denied,errors}_total` metrics to prevent time-series explosion in Prometheus
+- **JWT and JWKS metrics label renamed from `issuer` to `identity_provider`** ([#472](https://github.com/telekom/k8s-breakglass/issues/472)): Prometheus metrics `breakglass_jwt_validation_*` and `breakglass_jwks_cache_{hits,misses}_total` now use the `identity_provider` label (resolved IDP name) instead of `issuer` (raw URL) to prevent unbounded cardinality from attacker-controlled issuer claims. Dashboards/alerts referencing the old `issuer` label on these metrics must be updated.
 
 ### Security
 
-- **JWT audience validation preparation (SEC-005)** ([#459](https://github.com/telekom/k8s-breakglass/issues/459), [#472](https://github.com/telekom/k8s-breakglass/issues/472)): Added `clientID` plumbing from IDP config to JWT authenticator for future audience validation; audience validation is intentionally disabled by default as it depends on audience protocol mappers that are not configured in all environments — a dedicated CRD field is needed before enabling
-- **JWT expiration required (SEC-005)** ([#459](https://github.com/telekom/k8s-breakglass/issues/459)): JWT parser now rejects tokens without an `exp` claim via `jwt.WithExpirationRequired()`
+- **JWT issuer format validation (SEC-003)** ([#472](https://github.com/telekom/k8s-breakglass/issues/472)): Validate JWT `iss` claim format before JWKS routing — must be an HTTPS URL with a non-empty host and ≤512 characters; rejects `http://`, `file://`, `javascript:`, and other non-HTTPS schemes to prevent SSRF-like JWKS fetches
+- **Per-issuer JWKS fetch rate limiting (SEC-004)** ([#472](https://github.com/telekom/k8s-breakglass/issues/472)): Enforce 10-second minimum cooldown between initial JWKS client creation for the same issuer, preventing DoS amplification through tokens with crafted issuer claims. Once a JWKS client is cached, subsequent refreshes (including those triggered by unknown `kid` values) are managed by keyfunc/v3's built-in deduplication; network-level and IDP-side rate limits should be configured as additional defense-in-depth
+- **OIDC discovery JWKS URI origin validation (SEC-003)** ([#472](https://github.com/telekom/k8s-breakglass/issues/472)): Discovered `jwks_uri` origin (hostname and port) must match the configured authority to prevent SSRF if a compromised IDP discovery endpoint returns a malicious JWKS URI pointing to an internal, unrelated, or different-port host
+- **Audience refresh singleflight deduplication** ([#472](https://github.com/telekom/k8s-breakglass/issues/472)): Periodic audience refresh from IdentityProvider now uses singleflight to prevent thundering herd when many requests arrive simultaneously after the refresh interval elapses
+- **Issuer trailing slash normalization** ([#472](https://github.com/telekom/k8s-breakglass/issues/472)): `LoadIdentityProviderByIssuer` now normalizes trailing slashes in both the incoming issuer and `IdentityProvider.spec.issuer` for the primary match, consistent with the auth layer's canonicalization
+- **JWT audience claim validation (SEC-005a)** ([#472](https://github.com/telekom/k8s-breakglass/issues/472)): Conditional JWT `aud` claim validation when `IdentityProvider.spec.oidc.expectedAudience` is configured. Prevents cross-service token confusion from other OIDC clients at the same provider. Requires a matching audience protocol mapper in the IDP. When unconfigured (default), audience validation is skipped for backwards compatibility
+- **JWT expiration required (SEC-005b)** ([#459](https://github.com/telekom/k8s-breakglass/issues/459)): JWT parser now rejects tokens without an `exp` claim via `jwt.WithExpirationRequired()`
 - **TLS minimum version (SEC-003)** ([#459](https://github.com/telekom/k8s-breakglass/issues/459)): Set `tls.VersionTLS12` as minimum on the API server and all OIDC proxy / JWKS HTTP clients
 - **X-Request-ID sanitization (SEC-004)** ([#459](https://github.com/telekom/k8s-breakglass/issues/459)): Validate `X-Request-ID` header (alphanumeric + `-_.:`; max 128 chars) and replace invalid values with a generated UUID to prevent log injection
 - **Build info endpoint hardened** ([#472](https://github.com/telekom/k8s-breakglass/issues/472), SEC-008): `/api/debug/buildinfo` now exposes only `version` and `buildDate`; infrastructure details (Go version, platform, commit hash) are omitted to prevent reconnaissance
@@ -43,6 +32,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 ### Fixed
 
+- **Auth and bgctl hardening follow-ups** ([#472](https://github.com/telekom/k8s-breakglass/issues/472)): Multi-IDP auth now returns accurate client messages for unknown-issuer vs transient/rate-limited JWKS failures, keeps audience-refresh cache updates eviction-safe, rejects issuer inputs that normalize to empty values, validates IDP authority URLs before OIDC discovery fallback, emits a stable `identity_provider=single-idp` label in single-IDP mode, and throttles repeated expected-audience refresh attempts after refresh errors to avoid per-request retry storms. `bgctl update` now uses command-scoped request contexts, separates metadata-request and binary-download HTTP timeouts, escapes release tags before building GitHub API paths, removes partially extracted binaries on extraction errors, and propagates non-EOF probe-read errors during size-limited archive extraction. `MarshalIdentityProviderToJSON` now emits a sanitized payload that omits runtime secrets (for example client secrets, service-account tokens, and raw provider config)
 - **Docs: wrong file reference** ([#530](https://github.com/telekom/k8s-breakglass/issues/530)): Fixed `.github/copilot-instructions.md` referencing non-existent `pkg/policy/evaluator.go``pkg/policy/deny.go`
 - **Docs: unimplemented exit codes** ([#542](https://github.com/telekom/k8s-breakglass/issues/542)): Removed documented exit codes 2-8 from `docs/cli.md` that were never implemented in `bgctl`
 - **Frontend: stable v-for key in BreakglassSessionReview** ([#535](https://github.com/telekom/k8s-breakglass/issues/535)): Use `metadata.name` as the primary `v-for` key with `bg.name` as fallback, removing the array-index-based fallback that caused incorrect component reuse during filtering
@@ -62,6 +52,18 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - **JWKS auth: reuse HTTP client** ([#529](https://github.com/telekom/k8s-breakglass/issues/529)): Replaced per-request HTTP client creation in OIDC discovery with a shared `defaultHTTPClient` on `AuthHandler` for connection pooling and reduced allocations
 - **Debug session metrics docs labels** ([#537](https://github.com/telekom/k8s-breakglass/issues/537)): Fixed incorrect label definitions for 10 of 13 debug session metrics in `docs/metrics.md` to match `pkg/metrics/metrics.go`
 - **ParseDuration bounds checking** ([#525](https://github.com/telekom/k8s-breakglass/issues/525)): Added maximum 365-day limit to `ParseDuration` to prevent integer overflow from extremely large day values (e.g., `999999999d`)
+- **E2E: retry on informer cache lag during session creation**: `CreateSession` and `CreateDebugSession` API helpers now retry on transient 403 ("no escalation found", "user not authorized for requested group") and 400 ("template not found") errors caused by informer cache propagation delay after creating escalation or template resources
+- **E2E: fix CreateSessionAndWaitForPending race condition**: Always poll the API to confirm the controller has reconciled the session status, rather than trusting the create response which may not reflect the persisted status subresource
+- **CI: increase Single-Cluster E2E job timeout from 45 to 60 minutes**: API E2E tests alone take ~30 minutes; the 45-minute job timeout was consistently hit before CLI E2E tests could complete
+- **CI: increase Multi-Cluster E2E test timeout from 30m to 45m**: The 280-test suite occasionally exceeded the 30-minute Go test timeout
+- **E2E: port-forward keepalive for CI stability**: All long-lived E2E port-forwards now use `while true` restart loops to auto-recover from idle timeouts, pod restarts, and network drops
+- **E2E: added missing MailHog port-forward in CI workflow**: The CI workflow's port-forward setup step killed all port-forwards from `kind-setup-single.sh` but did not restart the MailHog port-forward, causing notification e2e tests to fail with `connection refused` on port 8025
+- **E2E: increased OIDC token retry window**: Bumped token request retries from 5 to 8 attempts with capped 10s backoff (~60s total window) to tolerate port-forward reconnection delays
+- **E2E: increased GetToken retry window to 12 attempts**: Extended from 8 to 12 attempts (~120s total window) to tolerate Keycloak pod recovery during extended outages in CI
+- **E2E: added retry with exponential backoff to offline token requests**: `ObtainOfflineRefreshTokenWithRetry` now uses exponential backoff and callers increased from 3 to 8 attempts to survive port-forward restarts
+- **E2E: added retry to RequireKeycloakReachable pre-check**: The Keycloak reachability pre-check now retries 5 times with backoff instead of failing immediately on first port-forward drop
+- **E2E: added retry to ObtainClientCredentialsToken**: Client credentials token requests now retry 8 times with exponential backoff to tolerate Keycloak transient unavailability
+- **CI: increase API E2E and OIDC E2E Go test timeouts from 30m to 45m**: Keycloak token retry overhead during port-forward drops can consume 90+ seconds per incident, causing test suite timeouts at 30 minutes
 
 ### Removed
 
api/v1alpha1/applyconfiguration/api/v1alpha1/oidcconfig.go

Lines changed: 16 additions & 0 deletions
Some generated files are not rendered by default.

api/v1alpha1/identity_provider_types.go

Lines changed: 12 additions & 0 deletions
@@ -76,6 +76,18 @@ type OIDCConfig struct {
 	// +kubebuilder:validation:Pattern=`^\S+$`
 	ClientID string `json:"clientID"`
 
+	// ExpectedAudience is the expected JWT audience (aud) claim value.
+	// When set, the API server validates that incoming JWTs contain this value
+	// in their aud claim. This prevents cross-service token confusion from other
+	// OIDC clients at the same identity provider.
+	// Requires a matching audience protocol mapper in the identity provider
+	// that adds this value to the aud claim in issued tokens.
+	// If empty, audience validation is skipped.
+	// +optional
+	// +kubebuilder:validation:MaxLength=253
+	// +kubebuilder:validation:Pattern=`^\S+$`
+	ExpectedAudience string `json:"expectedAudience,omitempty"`
+
 	// InsecureSkipVerify allows skipping TLS verification (NOT for production!)
 	// +optional
 	InsecureSkipVerify bool `json:"insecureSkipVerify,omitempty"`
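
The `ExpectedAudience` comment above implies a simple rule: skip the check when the field is empty, otherwise require the value to appear in the token's `aud` claim. A minimal sketch of that rule follows. It is not the authenticator code from this commit; the `audience` type and `audienceAllowed` are hypothetical names, and the token is assumed to be already signature-verified. The custom unmarshaller is there because RFC 7519 allows `aud` to be either a single string or an array of strings.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// audience holds a JWT "aud" claim, which may be encoded as one string or
// as an array of strings.
type audience []string

// UnmarshalJSON accepts both encodings of the claim.
func (a *audience) UnmarshalJSON(b []byte) error {
	var one string
	if err := json.Unmarshal(b, &one); err == nil {
		*a = audience{one}
		return nil
	}
	var many []string
	if err := json.Unmarshal(b, &many); err != nil {
		return err
	}
	*a = many
	return nil
}

// audienceAllowed applies the rule described for ExpectedAudience:
// an empty expectation skips validation, otherwise aud must contain it.
func audienceAllowed(expected string, aud audience) bool {
	if expected == "" {
		return true
	}
	for _, v := range aud {
		if v == expected {
			return true
		}
	}
	return false
}

func main() {
	var claims struct {
		Aud audience `json:"aud"`
	}
	payload := []byte(`{"aud":["account","breakglass-ui"]}`)
	if err := json.Unmarshal(payload, &claims); err != nil {
		panic(err)
	}
	fmt.Println(audienceAllowed("breakglass-ui", claims.Aud)) // expected value present
	fmt.Println(audienceAllowed("other-client", claims.Aud))  // expected value absent
	fmt.Println(audienceAllowed("", claims.Aud))              // validation skipped
}
```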

config/crd/bases/breakglass.t-caas.telekom.com_identityproviders.yaml

Lines changed: 12 additions & 0 deletions
@@ -177,6 +177,18 @@ spec:
   minLength: 1
   pattern: ^\S+$
   type: string
+expectedAudience:
+  description: |-
+    ExpectedAudience is the expected JWT audience (aud) claim value.
+    When set, the API server validates that incoming JWTs contain this value
+    in their aud claim. This prevents cross-service token confusion from other
+    OIDC clients at the same identity provider.
+    Requires a matching audience protocol mapper in the identity provider
+    that adds this value to the aud claim in issued tokens.
+    If empty, audience validation is skipped.
+  maxLength: 253
+  pattern: ^\S+$
+  type: string
 insecureSkipVerify:
   description: InsecureSkipVerify allows skipping TLS verification
     (NOT for production!)
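
The schema constraints on `expectedAudience` (`pattern: ^\S+$`, `maxLength: 253`) can be tried out locally. The sketch below is only an approximation of what the API server enforces: `validExpectedAudience` is a hypothetical helper, and counting length in Unicode code points is an assumption about how `maxLength` is evaluated. It shows that the pattern rejects any value containing whitespace, not only whitespace-only values.

```go
package main

import (
	"fmt"
	"regexp"
	"unicode/utf8"
)

// noWhitespace is the same pattern as in the CRD schema: one or more
// non-whitespace characters and nothing else.
var noWhitespace = regexp.MustCompile(`^\S+$`)

// maxAudienceLen mirrors maxLength: 253 from the schema.
const maxAudienceLen = 253

// validExpectedAudience approximates the schema checks for a value that is
// actually present. The empty string does not match ^\S+$.
func validExpectedAudience(v string) bool {
	return utf8.RuneCountInString(v) <= maxAudienceLen && noWhitespace.MatchString(v)
}

func main() {
	for _, v := range []string{"breakglass-ui", "   ", "breakglass ui", ""} {
		fmt.Printf("%q -> %v\n", v, validExpectedAudience(v))
	}
}
```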

config/samples/breakglass_v1alpha1_clusterconfig_oidc_from_idp.yaml

Lines changed: 3 additions & 0 deletions
@@ -15,6 +15,9 @@ spec:
   oidc:
     authority: https://keycloak.example.com/realms/kubernetes
     clientID: breakglass-ui
+    # Optional: Enable JWT audience validation. Requires a matching audience
+    # protocol mapper in Keycloak that adds this value to the "aud" claim.
+    # expectedAudience: breakglass-ui
 
   # Keycloak configuration - provides service account for cluster auth
   keycloak:

docs/identity-provider.md

Lines changed: 1 addition & 0 deletions
@@ -44,6 +44,7 @@ spec:
 |-------|------|----------|-------------|
 | `authority` | string | ✅ Yes | OIDC provider authority endpoint (e.g., `https://auth.example.com`). The frontend redirects users to this endpoint for authentication. |
 | `clientID` | string | ✅ Yes | OIDC client ID for the Breakglass UI (frontend). Configured in your OIDC provider. |
+| `expectedAudience` | string | ❌ No | Expected JWT `aud` claim value. When set, tokens must contain this audience. Requires a matching audience protocol mapper in your IDP. If empty, audience validation is skipped. |
 | `jwksEndpoint` | string | ❌ No | JWKS endpoint for key sets. Defaults to `{authority}/.well-known/openid-configuration` |
 | `insecureSkipVerify` | boolean | ❌ No | Skip TLS verification (NOT for production). Default: `false` |
 | `certificateAuthority` | string | ❌ No | PEM-encoded CA certificate for TLS validation |
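
The commit message states that a `jwks_uri` returned by OIDC discovery must share its origin (hostname and port, with 443 as the HTTPS default) with the configured `authority`. A minimal sketch of such a comparison follows. `sameHTTPSOrigin` and `effectivePort` are hypothetical names, not the project's `jwksHostMatchesAuthority`, and requiring HTTPS on both sides is an assumption made for this example.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// effectivePort returns the explicit port, or 443 (the HTTPS default) when
// the URL does not carry one.
func effectivePort(u *url.URL) string {
	if p := u.Port(); p != "" {
		return p
	}
	return "443"
}

// sameHTTPSOrigin reports whether jwksURI points at the same hostname and
// port as authority. (Hypothetical stand-in for jwksHostMatchesAuthority.)
func sameHTTPSOrigin(authority, jwksURI string) bool {
	a, err := url.Parse(authority)
	if err != nil {
		return false
	}
	j, err := url.Parse(jwksURI)
	if err != nil {
		return false
	}
	if a.Scheme != "https" || j.Scheme != "https" {
		return false
	}
	// Hostnames are compared case-insensitively; an empty host never matches.
	if a.Hostname() == "" || !strings.EqualFold(a.Hostname(), j.Hostname()) {
		return false
	}
	return effectivePort(a) == effectivePort(j)
}

func main() {
	authority := "https://keycloak.example.com/realms/kubernetes"
	for _, candidate := range []string{
		"https://keycloak.example.com/realms/kubernetes/protocol/openid-connect/certs",
		"https://keycloak.example.com:8443/certs",
		"https://10.0.0.1/certs",
		"file:///etc/passwd",
	} {
		fmt.Printf("%s match=%v\n", candidate, sameHTTPSOrigin(authority, candidate))
	}
}
```

Comparing the port as well as the hostname closes the same-host, different-port case that the review comments in the commit message call out.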

docs/metrics.md

Lines changed: 6 additions & 6 deletions
@@ -542,12 +542,12 @@ Track JWT token validation and JWKS key fetching performance.
 
 | Metric | Type | Labels | Description |
 |--------|------|--------|-------------|
-| `breakglass_jwt_validation_requests_total` | Counter | `issuer`, `mode` | Total JWT validation attempts |
-| `breakglass_jwt_validation_success_total` | Counter | `issuer` | Successful JWT validations |
-| `breakglass_jwt_validation_failure_total` | Counter | `issuer`, `reason` | Failed JWT validations |
-| `breakglass_jwt_validation_duration_seconds` | Histogram | `issuer` | JWT validation latency |
-| `breakglass_jwks_cache_hits_total` | Counter | `issuer` | JWKS key cache hits |
-| `breakglass_jwks_cache_misses_total` | Counter | `issuer` | JWKS key cache misses |
+| `breakglass_jwt_validation_requests_total` | Counter | `identity_provider`, `mode` | Total JWT validation attempts |
+| `breakglass_jwt_validation_success_total` | Counter | `identity_provider` | Successful JWT validations |
+| `breakglass_jwt_validation_failure_total` | Counter | `identity_provider`, `reason` | Failed JWT validations |
+| `breakglass_jwt_validation_duration_seconds` | Histogram | `identity_provider` | JWT validation latency |
+| `breakglass_jwks_cache_hits_total` | Counter | `identity_provider` | JWKS key cache hits |
+| `breakglass_jwks_cache_misses_total` | Counter | `identity_provider` | JWKS key cache misses |
 | `breakglass_jwks_fetch_requests_total` | Counter | `issuer`, `status` | JWKS endpoint fetch attempts |
 | `breakglass_jwks_fetch_duration_seconds` | Histogram | `issuer` | JWKS endpoint fetch latency |
 | `breakglass_jwks_cache_size` | Gauge | `issuer` | Number of cached JWKS key sets |
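
The label rename exists to keep metric cardinality bounded: the label value comes from the set of configured IDP names, with `unknown` as the fallback, never from the raw `iss` claim. A minimal sketch of that mapping follows. `idpLabel` and the lookup map are hypothetical; the project resolves names through its IdentityProvider loader, which is not shown on this page.

```go
package main

import (
	"fmt"
	"strings"
)

// unknownIDP is the fallback label value named in the commit message.
const unknownIDP = "unknown"

// idpLabel maps a raw issuer claim to a bounded label value: the configured
// IDP name when the canonical issuer is known, "unknown" otherwise.
func idpLabel(issuerToName map[string]string, rawIssuer string) string {
	key := strings.TrimRight(rawIssuer, "/")
	if name, ok := issuerToName[key]; ok {
		return name
	}
	return unknownIDP
}

func main() {
	known := map[string]string{
		"https://keycloak.example.com/realms/kubernetes": "keycloak-main",
	}
	fmt.Println(idpLabel(known, "https://keycloak.example.com/realms/kubernetes/"))
	fmt.Println(idpLabel(known, "https://attacker.example.net/realms/random-1234"))
}
```

However many distinct issuer strings an attacker sends, the label can take at most one value per configured IDP plus `unknown`.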
