fix(kratos): improve init container OIDC validation and error messages #23

Merged
colek42 merged 4 commits into main from fix/kratos-init-container-oidc-validation
Nov 10, 2025

colek42 commented Nov 10, 2025

Summary

Fixes a customer issue where missing OIDC credentials resulted in nil values, causing Kratos to fail at startup with unhelpful error messages.

Problem

Customer reported this error:

selfservice.methods.oidc.config.providers.0.client_id: <nil>
selfservice.methods.oidc.config.providers.0.client_secret: <nil>

Root cause: The init container was replacing ${OIDC_GITHUB_CLIENT_ID} placeholders with empty strings when environment variables weren't set, producing nil values.

Solution

Improved the Kratos init container to:

  1. Detect if OIDC is enabled - Only validates credentials when OIDC placeholders exist in config
  2. Fail fast with clear errors - Exits immediately when OIDC enabled but credentials missing
  3. Show ALL required secrets - Lists complete set of required Kubernetes secret keys
  4. Provide exact troubleshooting steps - Templated kubectl commands with actual secret names
  5. Use correct secret generation - openssl rand -hex 16 (32 hex chars exactly)
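
The first two points can be illustrated with a minimal sketch. This is not the actual init container script from deployment-kratos.yaml: the function names are hypothetical, and OIDC_GITHUB_CLIENT_SECRET is assumed by analogy with the OIDC_GITHUB_CLIENT_ID placeholder mentioned above.

```shell
#!/bin/sh
# Hypothetical sketch only; not the script shipped in the chart.

# Succeeds if the config file references OIDC placeholders,
# which this sketch treats as "OIDC is enabled".
oidc_enabled() {
  grep -qF '${OIDC_GITHUB_CLIENT_' "$1"
}

# Prints the name of each credential that is unset or empty, one per line.
# OIDC_GITHUB_CLIENT_SECRET is an assumed variable name.
missing_oidc_vars() {
  [ -n "${OIDC_GITHUB_CLIENT_ID:-}" ] || echo "OIDC_GITHUB_CLIENT_ID"
  [ -n "${OIDC_GITHUB_CLIENT_SECRET:-}" ] || echo "OIDC_GITHUB_CLIENT_SECRET"
}

# Returns 0 when it is safe to render the config, 1 otherwise.
validate_oidc() {
  config_file="$1"
  if ! oidc_enabled "$config_file"; then
    echo "OIDC disabled: skipping credential validation"
    return 0
  fi
  missing="$(missing_oidc_vars)"
  if [ -n "$missing" ]; then
    echo "FATAL: Missing OIDC Credentials" >&2
    for name in $missing; do
      echo "  - $name is unset or empty" >&2
    done
    return 1
  fi
  echo "OIDC enabled: credentials present"
}
```

An init container built this way would call something like `validate_oidc /path/to/config || exit 1` before substituting placeholders, so an empty variable stops the pod instead of reaching Kratos as a nil value.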

Error Message Now Shows

=========================================
FATAL: Missing OIDC Credentials
=========================================

Required keys for Kratos:
- dsn                     (Database connection string)
- secretsCookie           (Cookie encryption, min 16 chars, 32 recommended)
- secretsCipher           (Field encryption, MUST be exactly 32 chars)
- oidcGithubClientId      (GitHub OAuth Client ID) ← MISSING
- oidcGithubClientSecret  (GitHub OAuth Secret) ← MISSING

Example: Create the complete secret manually:
kubectl create secret generic kratos \
  --from-literal=dsn='postgres://user:pass@host:5432/kratos' \
  --from-literal=secretsCookie=$(openssl rand -hex 16) \
  --from-literal=secretsCipher=$(openssl rand -hex 16) \
  --from-literal=oidcGithubClientId='YOUR_GITHUB_CLIENT_ID' \
  --from-literal=oidcGithubClientSecret='YOUR_GITHUB_SECRET'

Testing

Validated using TDD/TCR methodology with black-box tests:

  • ✅ OIDC enabled + missing credentials → Fails with helpful error
  • ✅ OIDC enabled + empty credentials → Fails with helpful error
  • ✅ OIDC enabled + partial credentials → Fails with helpful error
  • ✅ OIDC enabled + valid credentials → Succeeds with validation
  • ✅ OIDC disabled + no credentials → Succeeds, skips validation

Changes

  • charts/kratos/templates/deployment-kratos.yaml: Improved init container logic
  • charts/kratos/Chart.yaml: Bump version 1.6.31 → 1.6.34
  • charts/judge/Chart.yaml: Bump version 1.8.40 → 1.8.43

Secret Length Requirements

Verified against Kratos source code:

  • secretsCipher: MUST be exactly 32 chars (validated at config.go:831-849)
  • secretsCookie: Min 16 chars, 32 recommended (per config.schema.json)
  • Command openssl rand -hex 16 produces exactly 32 hex characters
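
The length rules above can be checked locally before creating the secret. The helper names below are hypothetical, and the sketch counts characters with the shell's `${#var}` expansion, which matches the byte count only for ASCII values such as hex strings.

```shell
#!/bin/sh
# Hypothetical helpers mirroring the length rules listed above.

# Succeeds only if the value is exactly 32 characters (secretsCipher rule).
valid_cipher_secret() {
  [ "${#1}" -eq 32 ]
}

# Succeeds if the value is at least 16 characters (secretsCookie minimum).
valid_cookie_secret() {
  [ "${#1}" -ge 16 ]
}

# Example use with a generated value (requires openssl on the PATH):
#   secret="$(openssl rand -hex 16)"   # 16 random bytes, hex-encoded
#   valid_cipher_secret "$secret" && echo "cipher secret length OK"
```

Since `openssl rand -hex 16` encodes 16 random bytes as two hex digits each, the same command satisfies both checks.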

Commits

  1. feb360a - Main fix with conditional OIDC validation
  2. e8e86db - Show ALL required secret keys in error
  3. 92070af - Use correct secret generation (openssl rand -hex 16)

Co-Authored-By: Claude <noreply@anthropic.com>

cole-rgb and others added 4 commits November 10, 2025 01:35
…port

- Detect if OIDC is enabled by checking for placeholders in config
- Only validate OIDC credentials when OIDC is actually enabled
- Fail fast with helpful error messages when OIDC enabled but credentials missing
- Show troubleshooting steps with exact kubectl commands
- Validate substitution worked (no remaining placeholders)
- Check for empty client_id/client_secret that cause nil errors
- Mask sensitive values in logs (show first 4 chars only)
- Keep secrets as optional: true (OIDC is opt-in feature)
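
As an illustration of the masking and post-substitution checks listed above, here is a minimal sketch. The function names are hypothetical and the real template may implement these steps differently.

```shell
#!/bin/sh
# Hypothetical sketch of two of the checks described in this commit.

# Prints the first 4 characters of a value followed by a fixed mask.
# Note: values of 4 characters or fewer are shown in full by this sketch.
mask_secret() {
  printf '%s****\n' "$(printf '%s' "$1" | cut -c1-4)"
}

# Succeeds if no ${OIDC_...} placeholder survived substitution in the file.
no_placeholders_left() {
  ! grep -qF '${OIDC_' "$1"
}
```

Logging `mask_secret "$OIDC_GITHUB_CLIENT_ID"` lets an operator confirm which credential was picked up without writing the full value to the pod logs.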

Fixes customer issue where missing oidcGithubClientId/oidcGithubClientSecret
in Kubernetes secret resulted in nil values and Kratos startup failures.

The init container now:
- Skips validation when OIDC is disabled
- Provides clear guidance when OIDC is enabled but secrets are missing:
  1. Which secret name to look for (templated with release name)
  2. Required secret keys (oidcGithubClientId, oidcGithubClientSecret)
  3. How to check ExternalSecret status
  4. How to create the secret manually
  5. How to disable OIDC if not needed
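
The guidance above could be produced by a helper along these lines. This is a sketch, not the chart's actual output: the function name is hypothetical, the namespace argument is an assumption, and it assumes the ExternalSecret shares the secret's name and that the External Secrets CRD is installed.

```shell
#!/bin/sh
# Hypothetical helper: prints troubleshooting commands for a given secret.
# $1 = secret name (for example the release-templated name), $2 = namespace.
print_troubleshooting() {
  secret="$1"
  namespace="$2"
  cat <<EOF
1. Check which keys the secret contains (values are not printed):
   kubectl describe secret $secret -n $namespace
2. If the secret is managed by an ExternalSecret, check its sync status:
   kubectl get externalsecret $secret -n $namespace
3. Required keys when OIDC is enabled:
   oidcGithubClientId, oidcGithubClientSecret
EOF
}
```

Calling `print_troubleshooting kratos judge` would print the commands with the secret name and namespace already filled in, so they can be pasted directly into a terminal.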

Tested with:
- OIDC enabled + missing credentials (fails with helpful errors) ✅
- OIDC enabled + empty credentials (fails with helpful errors) ✅
- OIDC enabled + partial credentials (fails with helpful errors) ✅
- OIDC enabled + valid credentials (succeeds with validation) ✅
- OIDC disabled + no credentials (succeeds, skips checks) ✅

Changes:
- charts/kratos/templates/deployment-kratos.yaml: Updated init container logic
- charts/kratos/Chart.yaml: Bump version 1.6.31 -> 1.6.32
- charts/judge/Chart.yaml: Bump version 1.8.40 -> 1.8.41, update kratos dependency

Co-Authored-By: Claude <noreply@anthropic.com>
When OIDC credentials are missing, the error message now shows the complete
list of required Kratos secret keys, not just the OIDC ones.

Required secrets shown to customers:
- dsn (Database connection string) - REQUIRED
- secretsCookie (Cookie encryption) - REQUIRED
- secretsCipher (Field encryption) - REQUIRED
- oidcGithubClientId (GitHub OAuth) - Required only if OIDC enabled
- oidcGithubClientSecret (GitHub OAuth) - Required only if OIDC enabled

Optional secrets NOT shown (to avoid confusion):
- secretsDefault - Auto-generated by Kratos if missing (verified via source)
- smtpConnectionURI - Only needed if SMTP configured

Also updated the example kubectl command to show how to create the complete
secret with all required keys including random generation for cookie/cipher.

Changes:
- charts/kratos/templates/deployment-kratos.yaml: Enhanced error messaging
- charts/kratos/Chart.yaml: Bump version 1.6.32 -> 1.6.33
- charts/judge/Chart.yaml: Bump version 1.8.41 -> 1.8.42, update dependency

Co-Authored-By: Claude <noreply@anthropic.com>
…and -hex 16)

Changed secret generation command from complex tr/urandom to simple openssl:
- OLD: LC_ALL=C tr -dc 'A-Za-z0-9' < /dev/urandom | head -c 32
- NEW: openssl rand -hex 16

This produces exactly 32 hexadecimal characters, which:
✅ Meets secretsCipher requirement (MUST be exactly 32 chars)
✅ Meets secretsCookie recommendation (min 16, recommended 32)
✅ Matches existing README.md documentation
✅ Simpler and more portable than tr/urandom approach

Updated error message details:
- secretsCookie: min 16 chars, 32 recommended (per Kratos schema)
- secretsCipher: MUST be exactly 32 chars (validated by Kratos)
- Both use same command for consistency

Verified against Kratos source code:
- secretsCipher validation: config.go lines 831-849
- Filters out secrets that aren't exactly 32 bytes
- secretsCookie minimum: config.schema.json (16 char minimum)

Changes:
- charts/kratos/templates/deployment-kratos.yaml: Updated secret generation
- charts/kratos/Chart.yaml: Bump version 1.6.33 -> 1.6.34
- charts/judge/Chart.yaml: Bump version 1.8.42 -> 1.8.43, update dependency

Co-Authored-By: Claude <noreply@anthropic.com>
Resolved conflicts by merging main changes with our OIDC validation fix.

Version bumps to account for both sets of changes:
- charts/kratos/Chart.yaml: 1.6.31 (main) + our changes → 1.6.35
- charts/judge/Chart.yaml: 1.8.40 (main) + our changes → 1.8.44

Changes from main (merged):
- archivista 1.6.19 → 1.6.20
- judge-api 1.6.19 → 1.6.20
- kratos template improvements (_helpers.tpl, cleanup-cron-job, etc.)
- judge manual secrets configuration

Our changes (preserved):
- Improved Kratos init container OIDC validation
- Better error messages with all required secrets listed
- Correct secret generation (openssl rand -hex 16)

Rebuilt dependencies with kratos 1.6.35.
@github-actions

Helm Dependencies Check Failed

Stale Helm dependencies detected! This means .tgz files are older than source files.

How to fix:

  1. Run make deps in your local repo
  2. Commit the updated .tgz files
  3. Push your changes

Why this matters: Stale dependencies cause ArgoCD to deploy outdated configs, leading to issues like missing Vault annotations.

See the Makefile for more details.

colek42 merged commit 4029e2c into main Nov 10, 2025
9 of 15 checks passed
colek42 deleted the fix/kratos-init-container-oidc-validation branch November 10, 2025 07:46