fix(_logs): redact sensitive headers interpolated into log message strings#3200
Ghraven wants to merge 1 commit into openai:main
… in log messages
Fixes openai#1196
The existing `SensitiveHeadersFilter` only redacts headers when they are passed as a structured `dict` in `record.args`. However, when httpx logs at `DEBUG` level it interpolates headers directly into the message string (e.g. `"headers={'authorization': 'Bearer sk-...'}"`), bypassing the dict path entirely and leaving the API key visible in plain text.
Fix: add a second pass that inspects the fully formatted log message for `header: value` patterns matching `SENSITIVE_HEADERS` and replaces the value with `<redacted>`. When a substitution is made, `record.msg` is replaced with the sanitised string and `record.args` is reset so Python's logging machinery does not re-interpolate the original value.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 3266c284a2
pattern = rf"(?i)({re.escape(header)}['"]?\s*:\s*['"]?)([^'"\s,}}]+)"
redacted = re.sub(pattern, r"\1<redacted>", msg)
Redact full sensitive value, including spaces
The new regex only replaces characters up to the first whitespace (`[^'"\s,}}]+`), so an `Authorization` value like `Bearer sk-...` becomes `<redacted> sk-...` and still leaks the token. This is the common HTTP auth format, so debug logs can continue exposing credentials even after this fix.
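One possible way to address this, sketched under the assumption that header values appear quoted (as in a `dict` repr): capture the opening quote and consume everything up to the matching closing quote, so the scheme and the token are replaced together. The helper name `redact_header_value` is hypothetical and is not part of the PR.

```python
import re


def redact_header_value(msg: str, header: str) -> str:
    """Hypothetical helper: redact the whole quoted value of one header.

    Assumes the value is wrapped in single or double quotes, as it is
    when a dict of headers is interpolated into a log message.
    """
    pattern = re.compile(
        # group 1: header name (optionally quoted) plus the colon
        # group 2: the opening quote of the value
        # group 3: the value itself, up to the matching closing quote (\2)
        rf"(?i)(['\"]?{re.escape(header)}['\"]?\s*:\s*)(['\"])(.*?)\2"
    )
    return pattern.sub(r"\1\2<redacted>\2", msg)


line = "headers={'authorization': 'Bearer sk-abc123', 'accept': 'application/json'}"
print(redact_header_value(line, "authorization"))
# headers={'authorization': '<redacted>', 'accept': 'application/json'}
```

Unquoted values are not handled by this sketch; covering them would need a separate rule for where the value ends.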
if redacted != msg:
    record.msg = redacted
    record.args = ()
    break
Redact every sensitive header occurrence in a message
The loop stops at the first matched header (`break`), so if a single log line contains multiple sensitive headers (e.g., both `authorization` and `api-key`), only one is scrubbed and the other remains visible. This creates partial redaction and leaves secrets in logs for multi-header requests.
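A minimal sketch of one way to avoid the early exit: build a single alternation over all sensitive header names and let `re.sub` replace every match in one call. The contents of `SENSITIVE_HEADERS` and the function name `redact_all_headers` are assumptions for illustration, and values are assumed to be quoted.

```python
import re

# Assumed contents for this sketch; the real set lives in the library.
SENSITIVE_HEADERS = {"api-key", "authorization"}

# One pattern covering every sensitive header name, so no loop or break is needed.
_NAMES = "|".join(re.escape(name) for name in sorted(SENSITIVE_HEADERS))
_HEADER_RE = re.compile(rf"(?i)(['\"]?(?:{_NAMES})['\"]?\s*:\s*)(['\"])(.*?)\2")


def redact_all_headers(msg: str) -> str:
    """Hypothetical helper: redact every sensitive header found in msg."""
    # re.sub replaces all non-overlapping matches, not just the first one.
    return _HEADER_RE.sub(r"\1\2<redacted>\2", msg)


line = "headers={'authorization': 'Bearer sk-abc', 'api-key': 'secret value'}"
print(redact_all_headers(line))
# headers={'authorization': '<redacted>', 'api-key': '<redacted>'}
```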
Fixes #1196
Problem
`SensitiveHeadersFilter` only redacts headers when they appear as a structured `dict` in `record.args`. When httpx logs at `DEBUG` level, it interpolates headers directly into the message string (e.g. `"headers={'authorization': 'Bearer sk-...'}"`), bypassing the dict path entirely and leaving the API key visible in plain-text debug logs.
Fix
Add a second pass that inspects the fully formatted log message for `header: value` patterns matching `SENSITIVE_HEADERS` and replaces the value with `<redacted>`. When a substitution is made, `record.msg` is replaced with the sanitised string and `record.args` is reset so Python's logging machinery does not re-interpolate the original.
Both paths now apply: dict-based structured logs (existing) and string-formatted messages (new).
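The two paths described above could be combined roughly as follows. This is a sketch, not the PR's implementation: the class name `TwoPassHeadersFilter`, the contents of `SENSITIVE_HEADERS`, and the assumption that header values are quoted are all illustrative.

```python
import logging
import re

SENSITIVE_HEADERS = {"api-key", "authorization"}  # assumed for this sketch

_NAMES = "|".join(re.escape(name) for name in sorted(SENSITIVE_HEADERS))
_HEADER_RE = re.compile(rf"(?i)(['\"]?(?:{_NAMES})['\"]?\s*:\s*)(['\"])(.*?)\2")


class TwoPassHeadersFilter(logging.Filter):  # hypothetical name
    def filter(self, record: logging.LogRecord) -> bool:
        # Pass 1: structured path, headers passed as a dict in record.args.
        args = record.args
        if isinstance(args, dict) and isinstance(args.get("headers"), dict):
            headers = {
                name: "<redacted>" if str(name).lower() in SENSITIVE_HEADERS else value
                for name, value in args["headers"].items()
            }
            record.args = {**args, "headers": headers}  # copy, do not mutate the caller's dict

        # Pass 2: string path, headers already interpolated into the message.
        msg = record.getMessage()
        redacted = _HEADER_RE.sub(r"\1\2<redacted>\2", msg)
        if redacted != msg:
            record.msg = redacted
            record.args = ()  # empty args: logging will not interpolate again
        return True


record = logging.LogRecord(
    "httpx", logging.DEBUG, "example.py", 1,
    "headers={'authorization': 'Bearer sk-abc'}", (), None,
)
TwoPassHeadersFilter().filter(record)
print(record.getMessage())
# headers={'authorization': '<redacted>'}
```

Attaching such a filter to a handler with `handler.addFilter(...)`, rather than to a single logger, lets it see records from every logger that handler serves.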