feat/implement environment-aware defaults and fail-closed secrets (US-1, US-2 #2595) #2847

Open
EleniKechrioti wants to merge 3 commits into IBM:main from EleniKechrioti:feat/fail-closed-validation
Conversation

EleniKechrioti commented on Feb 11, 2026

Related Issue

Partially addresses #2595. Implements US-1 (Environment-Aware Defaults) and US-2 (Fail-Closed on Unconfigured Secrets).


Summary

This PR implements the Fail-Closed security mechanism for US-2. It ensures the Gateway terminates execution if critical security secrets are missing, unconfigured, or weak when running in production mode.

Key Changes:

Environment-Aware Enforcement (US-1)

  • Dynamic Defaults: Implemented default_factory for require_strong_secrets, automatically enabling enforcement (true) in production and disabling it (false) in development.
  • Audit Logging: Added explicit security audit logs when enforcement is manually overridden in production environments.
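The environment-aware default described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: it uses the standard-library `dataclasses` module in place of the Pydantic settings model, and the helper names, the `security.audit` logger name, and the fallback to `development` when `ENVIRONMENT` is unset are all assumptions.

```python
import logging
import os
from dataclasses import dataclass, field

# Hypothetical logger name; the PR does not state which logger it uses.
audit_logger = logging.getLogger("security.audit")


def _environment() -> str:
    # Assumption: an unset ENVIRONMENT is treated as development.
    return os.environ.get("ENVIRONMENT", "development").strip().lower()


def _default_require_strong_secrets() -> bool:
    # Enforcement defaults to on only when running in production.
    return _environment() == "production"


@dataclass
class Settings:
    environment: str = field(default_factory=_environment)
    require_strong_secrets: bool = field(
        default_factory=_default_require_strong_secrets
    )

    def __post_init__(self) -> None:
        # Emit an audit record when enforcement was explicitly
        # switched off while running in production.
        if self.environment == "production" and not self.require_strong_secrets:
            audit_logger.warning(
                "SECURITY AUDIT: require_strong_secrets disabled in production"
            )
```

With `ENVIRONMENT=production`, `Settings()` enables enforcement by default, while `Settings(require_strong_secrets=False)` still honors the explicit override and logs it.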

Fail-Closed Mechanism (US-2)

  • Sentinel & Weak Secret Detection: Added logic to block execution if JWT_SECRET_KEY or AUTH_ENCRYPTION_SECRET are missing, weak, or set to default values.
  • Enforcement Logic: Enhanced get_settings() to trigger a SystemExit in production mode when critical security risks are detected.
  • Remediation Guidance: Implemented clear error messaging with specific error codes, pointing operators to the init_secrets script and official IBM security docs.
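A simplified sketch of the fail-closed flow may help reviewers follow the description above. The sentinel values `''` and `'UNCONFIGURED'` come from the commit message; the weak-value list, the minimum length, the error-code format, and the `check_secret` and `enforce` helpers are hypothetical and only illustrate the shape of the check.

```python
import sys
from typing import Dict, List, Optional

SENTINEL_VALUES = frozenset({"", "UNCONFIGURED"})  # from the commit message
WEAK_VALUES = frozenset({"changeme", "secret", "password"})  # hypothetical list
MIN_SECRET_LENGTH = 32  # hypothetical threshold


def check_secret(name: str, value: Optional[str]) -> Optional[str]:
    """Return a hypothetical error code for a bad secret, or None if it passes."""
    if value is None or value.strip() in SENTINEL_VALUES:
        return f"{name}_UNCONFIGURED"
    if value.lower() in WEAK_VALUES or len(value) < MIN_SECRET_LENGTH:
        return f"{name}_WEAK"
    return None


def get_security_status(environment: str, secrets: Dict[str, Optional[str]]) -> dict:
    """Collect issues for every secret; only production turns them into FAIL."""
    issues: List[str] = []
    for name, value in secrets.items():
        code = check_secret(name, value)
        if code is not None:
            issues.append(code)
    failed = bool(issues) and environment == "production"
    return {"status": "FAIL" if failed else "SUCCESS", "issues": issues}


def enforce(environment: str, secrets: Dict[str, Optional[str]]) -> dict:
    """Fail closed: terminate with exit code 1 when the status is FAIL."""
    status = get_security_status(environment, secrets)
    if status["status"] == "FAIL":
        for code in status["issues"]:
            print(f"CRITICAL [{code}]: regenerate this secret", file=sys.stderr)
        raise SystemExit(1)
    return status
```

In this sketch, development mode still records the issues in the returned structure, so a caller could log them as warnings instead of exiting.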

Documentation & Testing

  • Docs Update: Updated .env.example, CHANGELOG.md, and configuration.md to document the new environment-aware behavior and breaking changes.
  • Automated Testing: Added 6 comprehensive unit tests in tests/test_config_security.py covering all environment-specific scenarios.
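For readers unfamiliar with testing a cached settings loader that may exit, the pattern looks roughly like this. It is a sketch only: `get_settings` here is a hypothetical stand-in rather than the gateway's function, and it uses the standard-library `unittest` so it runs without extra dependencies, whereas the PR's tests are written for pytest.

```python
import os
import unittest
from functools import lru_cache
from unittest.mock import patch


@lru_cache(maxsize=1)
def get_settings() -> dict:
    # Hypothetical stand-in for a cached settings loader.
    env = os.environ.get("ENVIRONMENT", "development")
    secret = os.environ.get("JWT_SECRET_KEY", "")
    if env == "production" and secret in ("", "UNCONFIGURED"):
        raise SystemExit(1)
    return {"environment": env, "jwt_secret_key": secret}


class ConfigSecurityTests(unittest.TestCase):
    def setUp(self) -> None:
        # Clear the cache so each test re-reads the patched environment.
        get_settings.cache_clear()

    def test_production_with_empty_secret_exits(self) -> None:
        env = {"ENVIRONMENT": "production", "JWT_SECRET_KEY": ""}
        with patch.dict(os.environ, env):
            with self.assertRaises(SystemExit) as ctx:
                get_settings()
        self.assertEqual(ctx.exception.code, 1)

    def test_development_with_empty_secret_proceeds(self) -> None:
        env = {"ENVIRONMENT": "development", "JWT_SECRET_KEY": ""}
        with patch.dict(os.environ, env):
            self.assertEqual(get_settings()["environment"], "development")


def run_suite() -> unittest.TestResult:
    # Run the tests programmatically instead of via unittest.main(),
    # which would itself call sys.exit().
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ConfigSecurityTests)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Clearing the cache in `setUp` matters because a cached result from one environment would otherwise leak into the next test.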

Type of Change

  • Bug fix
  • Feature / Enhancement
  • Documentation
  • Refactor
  • Chore (deps, CI, tooling)

Verification

| Check | Command | Status |
|---|---|---|
| Lint suite | make lint | Pass |
| Unit tests | make test | Pass |
| Coverage ≥ 80% | make coverage | Pass |

Note: Verified specifically with pytest tests/test_config_security.py.


Checklist

  • Code formatted (make black isort pre-commit)
  • Tests added/updated for changes
  • Documentation updated (References added in logs)
  • No secrets or credentials committed

Notes

Manual verification performed:

  1. ENVIRONMENT=production with empty secrets correctly triggers SystemExit(1).
  2. ENVIRONMENT=development correctly proceeds with warnings.
  3. Remediation logs point to init_secrets script and official IBM documentation.

- Added detection for sentinel values ('', 'UNCONFIGURED')
- Defined weak/default secret constants for production validation
- Enhanced get_security_status() with specific error codes and risks
- Implemented log_critical_issues() with remediation and documentation links
- Enforced application termination (SystemExit) in production environments
- Added unit tests to verify security gates and development-mode bypass

Signed-off-by: Eleni Kechrioti <elenikehrioti@gmail.com>
crivetimihai self-assigned this on Feb 11, 2026
- Implement default_factory for REQUIRE_STRONG_SECRETS based on ENVIRONMENT
- Add audit logging for production security overrides
- Update .env.example, configuration reference, and CHANGELOG
- Add unit tests for production/development scenarios

Closes US-1

Signed-off-by: Eleni Kechrioti <elenikehrioti@gmail.com>
EleniKechrioti changed the title from "feat/implement fail-closed enforcement for critical secrets (US2#2595)" to "feat/implement environment-aware defaults and fail-closed secrets (US-1, US-2 #2595)" on Feb 11, 2026
@crivetimihai
Member

Thanks @EleniKechrioti. Fail-closed on weak secrets in production is an important security improvement.

Code review observations:

  1. sys.exit(1) in get_settings(): This is called during module-level initialization, which makes tests that import config fragile. The tests handle this with pytest.raises(SystemExit) and cache_clear(), which works. But consider raising a custom exception (e.g., SecurityConfigurationError) that main.py catches and converts to sys.exit(1) — this separates validation from termination.
  2. Sentinel/weak value lists as ClassVar: Good use of ClassVar so these don't pollute the Pydantic model. However, WEAK_VALUES should probably also include "my-secret", "test-secret", etc. Consider using a minimum entropy check instead of/in addition to a blocklist.
  3. Double environment check: get_security_status() checks self.environment == "production" inside the loop, and get_settings() also checks cfg.environment == "production" before calling sys.exit(1). The _build_security_response already returns status="FAIL" — the outer check in get_settings() is redundant. Consider making get_security_status() return a consistent structure regardless of outcome.
  4. Test test_proceed_development_mode: Passes auth_encryption_secret="UNCONFIGURED" in dev mode and asserts status == "SUCCESS". This means sentinel values are silently accepted in dev with no warning. Consider at least logging a warning.
  5. CHANGELOG: Good addition documenting the breaking change.
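Observations 1, 2, and 4 could be combined along the following lines. This is an illustrative sketch, not code from the PR: `SecurityConfigurationError` is the name suggested in the review, while `shannon_entropy_bits`, `validate_secret`, the `main` signature, the logger name, and the 80-bit threshold are assumptions. Character-frequency entropy is only a rough heuristic for secret strength.

```python
import logging
import math
from collections import Counter
from typing import Dict

logger = logging.getLogger("config.security")  # hypothetical logger name

MIN_ENTROPY_BITS = 80.0  # hypothetical threshold
SENTINEL_VALUES = ("", "UNCONFIGURED")


class SecurityConfigurationError(Exception):
    """Raised by validation; the entry point decides whether to exit."""


def shannon_entropy_bits(value: str) -> float:
    """Rough estimate: per-character Shannon entropy times string length."""
    if not value:
        return 0.0
    n = len(value)
    counts = Counter(value)
    per_char = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return per_char * n


def validate_secret(name: str, value: str, environment: str) -> None:
    weak = value in SENTINEL_VALUES or shannon_entropy_bits(value) < MIN_ENTROPY_BITS
    if not weak:
        return
    if environment == "production":
        raise SecurityConfigurationError(f"{name} is missing or weak")
    # Outside production the value is tolerated, but not silently.
    logger.warning("%s is missing or weak; allowed outside production", name)


def main(environment: str, secrets: Dict[str, str]) -> int:
    """Entry point: convert a validation failure into exit code 1."""
    try:
        for name, value in secrets.items():
            validate_secret(name, value, environment)
    except SecurityConfigurationError as exc:
        logger.critical("%s", exc)
        return 1
    return 0
```

Because validation raises an ordinary exception, tests can assert on `SecurityConfigurationError` directly, and only the entry point (for example via `sys.exit(main(...))`) terminates the process.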
