Redact sensitive env vars in exectools command logging #2694
cmd_gather logs the full set_env dict, which leaks secrets like GIT_PASSWORD (GitHub App installation tokens) into Jenkins build logs. Add _redact_env_for_logging() to replace values of env vars whose names contain PASSWORD, TOKEN, SECRET, KEY, or CREDENTIAL with ***REDACTED***.

Made-with: Cursor
Entire-Checkpoint: d8832137af58
rh-pre-commit.version: 2.3.2
rh-pre-commit.check-secrets: ENABLED
Walkthrough
This change introduces environment variable redaction for logging purposes. A new internal function detects environment variable keys matching sensitive patterns (passwords, tokens, secrets, credentials) and replaces their logged values with `***REDACTED***`.
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
🧹 Nitpick comments (2)
artcommon/artcommonlib/exectools.py (1)
39-48: Consider boundary-based key matching to reduce over-redaction noise

Current substring matching (`"KEY" in key`) can redact unrelated env vars (e.g., names containing `MONKEY`), which may reduce debugging value. A boundary-aware regex keeps the security intent while limiting accidental matches.

Optional refinement:

    +import re
    @@
    -_SENSITIVE_ENV_KEY_PATTERNS = frozenset({'PASSWORD', 'TOKEN', 'SECRET', 'KEY', 'CREDENTIAL'})
    +_SENSITIVE_ENV_KEY_RE = re.compile(
    +    r'(^|_)(PASSWORD|TOKEN|SECRET|CREDENTIAL|KEY)(_|$)', re.IGNORECASE
    +)
    @@
     def _redact_env_for_logging(env_dict: Dict[str, str]) -> Dict[str, str]:
         """Return a copy of env_dict with values redacted for keys that look secret."""
    -    upper_key_cache = {k: k.upper() for k in env_dict}
         return {
    -        k: '***REDACTED***' if any(p in upper_key_cache[k] for p in _SENSITIVE_ENV_KEY_PATTERNS) else v
    +        k: '***REDACTED***' if _SENSITIVE_ENV_KEY_RE.search(k) else v
             for k, v in env_dict.items()
         }

artcommon/tests/test_exectools.py (1)
304-306: Use neutral test token literals to avoid secret-scanner noise

`secret_val` and a realistic token-like value can trigger false positives in some security scanners. Consider using a neutral placeholder to keep scans cleaner.

Optional test-only tweak:

    -        secret_val = "ghs_SuperSecretToken123"
    -        env = {"GIT_PASSWORD": secret_val, "GIT_ASKPASS": "/tmp/askpass.sh"}
    +        test_token = "dummy-token-value"
    +        env = {"GIT_PASSWORD": test_token, "GIT_ASKPASS": "/tmp/askpass.sh"}
    @@
    -        self.assertNotIn(secret_val, log_text)
    +        self.assertNotIn(test_token, log_text)
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@artcommon/artcommonlib/exectools.py`:
- Around line 39-48: The current _redact_env_for_logging uses simple substring
checks against _SENSITIVE_ENV_KEY_PATTERNS which over-redacts keys like MONKEY;
change the matching to boundary-aware checks by building regexes for each
pattern (e.g., word or underscore boundaries) and testing the upper-cased key
with those regexes instead of "any(p in upper_key_cache[k])"; update the logic
inside _redact_env_for_logging (which uses upper_key_cache and iterates
env_dict.items()) to compile or reuse these boundary-aware patterns and only
redact when a pattern match is found.
In `@artcommon/tests/test_exectools.py`:
- Around line 304-306: The test uses a realistic token string in the secret_val
variable which can trigger secret scanners; replace secret_val (and any direct
realistic token literals) with a neutral placeholder like "TEST_GIT_TOKEN" or
"neutral-token" and keep the env mapping to "GIT_PASSWORD": secret_val (the code
around secret_val and env and the test that patches subprocess.Popen should be
updated). Ensure you update any similar test literals in this file (e.g., other
uses of secret_val or realistic token-like strings) to the neutral placeholder
to avoid scanner false positives.
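The boundary-aware matching suggested for `_redact_env_for_logging` can be tried in isolation. The sketch below reuses the regex from the review suggestion; the function name `redact_boundary_aware` and the sample keys are hypothetical, chosen only to show which names match and which do not.

```python
import re
from typing import Dict

# Regex from the review suggestion: a sensitive word must be delimited by
# the start/end of the name or by underscores.
_SENSITIVE_ENV_KEY_RE = re.compile(
    r'(^|_)(PASSWORD|TOKEN|SECRET|CREDENTIAL|KEY)(_|$)', re.IGNORECASE
)


def redact_boundary_aware(env_dict: Dict[str, str]) -> Dict[str, str]:
    """Hypothetical variant of the helper that redacts only on delimited matches."""
    return {
        k: '***REDACTED***' if _SENSITIVE_ENV_KEY_RE.search(k) else v
        for k, v in env_dict.items()
    }


# Hypothetical sample keys for illustration.
sample_env = {
    "GIT_PASSWORD": "dummy-token-value",  # delimited by "_" and end of name: redacted
    "api_key": "dummy-key-value",         # lower case still matches via IGNORECASE
    "MONKEY_PATCH": "enabled",            # "KEY" is inside "MONKEY": kept
    "APIKEY": "dummy-key-value",          # no delimiter before "KEY": kept
}
result = redact_boundary_aware(sample_env)
for name, value in result.items():
    print(f"{name}={value}")
```

The last sample key shows the trade-off: a name such as `APIKEY` passes through unredacted with boundary matching, whereas the substring check would have caught it. Whether that risk is acceptable depends on how env vars are named in practice.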
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 5f009f8c-fe2b-4b34-abd3-757dd307e33a
📒 Files selected for processing (2)
artcommon/artcommonlib/exectools.py
artcommon/tests/test_exectools.py
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: pruan-rht
Merged commit 4a87e96 into openshift-eng:main
Summary
- `cmd_gather` in `exectools.py` logs the entire `set_env` dict, which leaks secrets like `GIT_PASSWORD` (GitHub App installation tokens `ghs_*`) into Jenkins build logs
- Add a `_redact_env_for_logging()` helper that replaces values of env vars whose names contain PASSWORD, TOKEN, SECRET, KEY, or CREDENTIAL with `***REDACTED***`
- `gen-assembly` build #1092 logs showed the full `GIT_PASSWORD` value in plaintext

Test plan
- Tests for `_redact_env_for_logging` covering sensitive keys, non-sensitive keys, case insensitivity, and empty dict
- Running `cmd_gather` with a secret env var verifies the secret does not appear in log output
- Tests in `test_exectools.py` pass
- `ruff check` and `ruff format` pass

Made with Cursor
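The log-output check in the test plan can be sketched without the real `cmd_gather`. In the example below, `log_command` is a hypothetical stand-in for the logging call inside `cmd_gather` (its message format is an assumption), and the redaction helper follows the substring approach from the PR description.

```python
import io
import logging
from typing import Dict

_SENSITIVE_ENV_KEY_PATTERNS = frozenset({'PASSWORD', 'TOKEN', 'SECRET', 'KEY', 'CREDENTIAL'})


def _redact_env_for_logging(env_dict: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of env_dict with values redacted for keys that look secret."""
    return {
        k: '***REDACTED***' if any(p in k.upper() for p in _SENSITIVE_ENV_KEY_PATTERNS) else v
        for k, v in env_dict.items()
    }


def log_command(logger: logging.Logger, cmd: str, set_env: Dict[str, str]) -> None:
    """Hypothetical stand-in for cmd_gather's logging; the message format is assumed."""
    logger.info("Executing: %s [env=%s]", cmd, _redact_env_for_logging(set_env))


# Capture log output in memory so it can be inspected.
stream = io.StringIO()
logger = logging.getLogger("exectools_redaction_demo")
logger.setLevel(logging.INFO)
logger.propagate = False
logger.addHandler(logging.StreamHandler(stream))

test_token = "dummy-token-value"  # neutral placeholder, as the review recommends
log_command(logger, "git fetch", {"GIT_PASSWORD": test_token, "GIT_ASKPASS": "/tmp/askpass.sh"})
log_text = stream.getvalue()

assert test_token not in log_text        # the secret value never reaches the log
assert "***REDACTED***" in log_text      # the placeholder is logged instead
assert "/tmp/askpass.sh" in log_text     # non-sensitive values stay visible
```

Asserting on the non-sensitive value as well guards against a helper that redacts everything, which would pass the first check but remove useful debugging detail.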