Skip to content

fix: Sanitize environment variable injection in script execution#153

Open
chefsale wants to merge 5 commits intomasterfrom
bounty-fix/sanitize-environment-variable-injection--ml7pwci8
Open

fix: Sanitize environment variable injection in script execution#153
chefsale wants to merge 5 commits intomasterfrom
bounty-fix/sanitize-environment-variable-injection--ml7pwci8

Conversation

@chefsale
Copy link
Member

@chefsale chefsale commented Feb 4, 2026

Sanitize environment variable names and values in script execution to prevent injection attacks.

Dynamic workflow values were previously injected as environment variables with only basic character replacement, posing a risk for environment variable injection. This PR introduces dedicated sanitization functions for both variable names and values, enforcing valid characters, length limits, and removing potentially malicious content like null bytes. New unit tests verify the effectiveness of these sanitization rules.


Open in Cursor Open in Web


Note

Medium Risk
Touches script execution by changing how workflow dynamic_values become environment variables; could break scripts that relied on unsanitized names/overwriting sensitive vars, but changes are localized and covered by unit tests.

Overview
Hardens ScriptStep local script execution by sanitizing dynamic_values before exporting them as environment variables, including strict name normalization/validation, a protected blocklist (e.g., PATH, LD_PRELOAD, proxy vars), and value cleaning/truncation to strip null/control characters.

Adds comprehensive unit tests covering name/value sanitization behavior, rejection reasons, sensitive-var blocking, and realistic workflow-variable patterns.

Written by Cursor Bugbot for commit 9ffc7a9. This will update automatically on new commits. Configure here.

@cursor
Copy link

cursor bot commented Feb 4, 2026

Cursor Agent can help with this pull request. Just @cursor in comments and I'll start working on changes in this branch.
Learn more about Cursor Agents

@chefsale chefsale changed the title Environment variable sanitization fix: Sanitize environment variable injection in script execution Feb 4, 2026
Copy link

@meroreviewer meroreviewer bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 AI Code Reviewer

Reviewed by 3 agents | Quality score: 100% | Review time: 1049.3s

🟡 4 warnings, 💡 2 suggestions. See inline comments.


🤖 Generated by AI Code Reviewer | Review ID: review-977fad02

@cursor cursor bot force-pushed the bounty-fix/sanitize-environment-variable-injection--ml7pwci8 branch from b2b5786 to 17bde6f Compare February 13, 2026 17:24
Copy link

@meroreviewer meroreviewer bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 AI Code Reviewer

Reviewed by 3 agents | Quality score: 100% | Review time: 184.1s

🟡 1 warnings, 💡 3 suggestions. See inline comments.


🤖 Generated by AI Code Reviewer | Review ID: review-9c9f297f

if not ENV_VAR_NAME_PATTERN.match(sanitized):
return None

return sanitized
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Value sanitization allows control characters that could enable log injection

Only null bytes are removed from values; carriage return (\r) and other control characters could be used for terminal/log injection attacks when values are logged or displayed.

Suggested fix:

Consider stripping additional control characters (ASCII 0-31 except \n and \t) or at minimum \r to prevent log injection.

@cursor cursor bot force-pushed the bounty-fix/sanitize-environment-variable-injection--ml7pwci8 branch from 17bde6f to 6c32f27 Compare February 13, 2026 17:31
Copy link

@meroreviewer meroreviewer bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 AI Code Reviewer

Reviewed by 3 agents | Quality score: 93% | Review time: 135.9s

🟡 2 warnings, 💡 2 suggestions, 📝 1 nitpicks. See inline comments.


🤖 Generated by AI Code Reviewer | Review ID: review-bf2a88e4

@chefsale chefsale marked this pull request as ready for review February 14, 2026 08:13
Copy link

@meroreviewer meroreviewer bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 AI Code Reviewer

Reviewed by 3 agents | Quality score: 83% | Review time: 247.3s

🟡 1 warnings, 💡 4 suggestions, 📝 1 nitpicks. See inline comments.


🤖 Generated by AI Code Reviewer | Review ID: review-2d54fa30


# Convert to uppercase and replace common special characters
sanitized = name.upper().replace("-", "_").replace(".", "_")

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Potential key collision causing silent data loss

Different input keys can sanitize to the same env var name (e.g., 'my-var' and 'my.var' both become 'MY_VAR'), causing the second value to silently overwrite the first.

Suggested fix:

Track seen sanitized names and either warn on collision, skip duplicates with a warning, or use a deterministic tie-breaker.

# An attacker might try to inject a null byte followed by a command
malicious = "legitimate_value\x00; rm -rf /"
result = _sanitize_env_var_value(malicious)
assert result == "legitimate_value; rm -rf /"
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Misleading test name implies shell commands are sanitized

The test 'test_injection_attempt_with_null_and_command' and its docstring suggest the command is sanitized, but only the null byte is removed—the ; rm -rf / remains in the value, which could give false confidence about the function's protection scope.

Suggested fix:

Rename to `test_null_byte_removed_from_value` and update docstring to clarify that only null bytes are stripped, not shell metacharacters (which is intentional design).


# Convert to uppercase and replace common special characters
sanitized = name.upper().replace("-", "_").replace(".", "_")

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Key collision allows value overwriting

Different input names like my-var and my.var both map to MY_VAR, allowing an attacker who controls multiple dynamic_values keys to overwrite intended values with malicious ones.

Suggested fix:

Consider logging a warning when a sanitized key already exists in the environment, or maintain a set of seen sanitized keys to detect and skip collisions.

from merobox.commands.bootstrap.steps.base import BaseStep
from merobox.commands.utils import console

# Constants for environment variable sanitization
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Document rationale for limit constants

The magic numbers 256 and 131072 would benefit from brief comments explaining why these values were chosen (e.g., POSIX limits, security thresholds).

Suggested fix:

Add inline comments like `# POSIX typically allows up to 32KB, 256 is conservative` or similar rationale.

Copy link

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Bugbot Autofix is OFF. To automatically fix reported issues with Cloud Agents, enable Autofix in the Cursor dashboard.

Copy link

@meroreviewer meroreviewer bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 AI Code Reviewer

Reviewed by 3 agents | Quality score: 100% | Review time: 293.8s

🟡 2 warnings, 💡 3 suggestions. See inline comments.


🤖 Generated by AI Code Reviewer | Review ID: review-82b7b45b

Copy link

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Bugbot Autofix is OFF. To automatically fix reported issues with Cloud Agents, enable Autofix in the Cursor dashboard.

Copy link

@meroreviewer meroreviewer bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 AI Code Reviewer

Reviewed by 3 agents | Quality score: 95% | Review time: 226.1s

🟡 1 warnings, 💡 4 suggestions, 📝 2 nitpicks. See inline comments.


🤖 Generated by AI Code Reviewer | Review ID: review-e5b57d6d

CONTROL_CHARS_PATTERN = re.compile(r"[\x00-\x08\x0b-\x1f]")

# Blocklist of sensitive environment variable names that should not be overwritten
# These could be used for privilege escalation or code execution if modified
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Missing BASH_ENV and other shell initialization variables from blocklist

BASH_ENV is executed by bash in non-interactive mode before running commands, making it a privilege escalation vector; also missing ENV (used by sh/dash), CDPATH, SHELLOPTS, and BASH_OPTS.

Suggested fix:

Add 'BASH_ENV', 'ENV', 'CDPATH', 'SHELLOPTS', 'BASH_OPTS', 'POSIXLY_CORRECT' to SENSITIVE_ENV_VARS.

CONTROL_CHARS_PATTERN = re.compile(r"[\x00-\x08\x0b-\x1f]")

# Blocklist of sensitive environment variable names that should not be overwritten
# These could be used for privilege escalation or code execution if modified
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Consider blocking all LD_ and DYLD_ prefixes rather than specific variables**

Only specific LD_* variables are blocked but many others exist (LD_BIND_NOW, LD_AOUT_PRELOAD, LD_WARN, etc.); new ones may be added to linkers over time.

Suggested fix:

Add a prefix check in _sanitize_env_var_name to reject any name starting with 'LD_' or 'DYLD_' instead of maintaining a specific list.

@@ -263,11 +405,20 @@ async def _execute_local(
# This allows scripts to access workflow variables via $VAR_NAME
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Silent overwrites when different keys sanitize to the same name

If two input keys like 'my-var' and 'my.var' both sanitize to 'MY_VAR', the second silently overwrites the first, which could cause subtle bugs.

Suggested fix:

Track seen sanitized names and log a warning when a collision is detected before overwriting.

@@ -0,0 +1,314 @@
"""
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Missing test for name collision scenario

No test verifies behavior when multiple input keys sanitize to the same environment variable name.

Suggested fix:

Add an integration test that passes dynamic_values like `{'my-var': 'a', 'my.var': 'b'}` and verifies expected behavior (either last-wins or detection).

CONTROL_CHARS_PATTERN = re.compile(r"[\x00-\x08\x0b-\x1f]")

# Blocklist of sensitive environment variable names that should not be overwritten
# These could be used for privilege escalation or code execution if modified
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Missing language-specific path variables from blocklist

NODE_PATH, PERL5LIB, RUBYLIB, GEM_PATH, and similar variables can be used to load attacker-controlled code in polyglot environments.

Suggested fix:

Add NODE_PATH, NODE_OPTIONS, PERL5LIB, PERL5OPT, RUBYLIB, RUBYOPT, GEM_PATH, GEM_HOME to SENSITIVE_ENV_VARS.

should properly quote variables (e.g., "$VAR" instead of $VAR) to prevent
command injection.
"""
if value is None:
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

📝 Nit: Redundant null byte removal

The explicit .replace("\x00", "") is redundant since CONTROL_CHARS_PATTERN already includes \x00 in the range \x00-\x08.

Suggested fix:

Remove line 140-141 (`str_value = str_value.replace("\x00", "")`) since the regex on line 143-144 will handle null bytes.

"LD_PROFILE",
"TMPDIR",
"TMP",
"TEMP",
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

📝 Nit: Consider using a NamedTuple for the return type

The tuple return (sanitized_name, rejection_reason) could be more self-documenting as a NamedTuple like SanitizeResult.

Suggested fix:

Define `SanitizeResult = NamedTuple('SanitizeResult', [('name', str | None), ('rejection_reason', str | None)])` for clearer API.

Copy link

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Bugbot Autofix is OFF. To automatically fix reported issues with Cloud Agents, enable Autofix in the Cursor dashboard.

"TMP",
"TEMP",
}
)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Blocklist missing BASH_ENV allows arbitrary code execution

High Severity

The SENSITIVE_ENV_VARS blocklist is missing BASH_ENV, which is a critical variable for this context. When bash is invoked non-interactively (as with subprocess.run(["/bin/sh", ...], env=env)), it reads and executes the file specified by BASH_ENV before running the script. On systems where /bin/sh is bash (RHEL, CentOS, Fedora, older macOS), an attacker who controls a dynamic value keyed bash_env could point BASH_ENV to an attacker-controlled file and achieve arbitrary code execution, completely bypassing the sanitization this PR intends to provide.

Additional Locations (1)

Fix in Cursor Fix in Web

Copy link

@meroreviewer meroreviewer bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 AI Code Reviewer

Reviewed by 3 agents | Quality score: 94% | Review time: 203.9s

🟡 2 warnings, 💡 3 suggestions, 📝 1 nitpicks. See inline comments.


🤖 Generated by AI Code Reviewer | Review ID: review-ed0600c3

CONTROL_CHARS_PATTERN = re.compile(r"[\x00-\x08\x0b-\x1f]")

# Blocklist of sensitive environment variable names that should not be overwritten
# These could be used for privilege escalation or code execution if modified
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Missing critical code execution variables from blocklist

BASH_ENV, ENV, and PROMPT_COMMAND are not blocked but can trigger arbitrary code execution when bash/sh starts or displays a prompt.

Suggested fix:

Add `BASH_ENV`, `ENV`, `PROMPT_COMMAND`, `SHELLOPTS`, `BASHOPTS`, and `PS4` to SENSITIVE_ENV_VARS.

@@ -263,11 +405,20 @@ async def _execute_local(
# This allows scripts to access workflow variables via $VAR_NAME
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Name collisions can silently overwrite earlier values

Different input keys can sanitize to the same env var name (e.g., 'my-var', 'my.var', 'my_var' all become 'MY_VAR'), causing silent data loss when the second overwrites the first.

Suggested fix:

Track seen sanitized names and either skip duplicates with a warning, or suffix with a counter to avoid collision.

CONTROL_CHARS_PATTERN = re.compile(r"[\x00-\x08\x0b-\x1f]")

# Blocklist of sensitive environment variable names that should not be overwritten
# These could be used for privilege escalation or code execution if modified
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Consider blocking language-specific module path variables

NODE_OPTIONS, NODE_PATH, PERL5LIB, RUBYLIB, and GEM_PATH can be exploited to load malicious code in their respective runtimes.

Suggested fix:

Add these variables to SENSITIVE_ENV_VARS if Node.js, Perl, or Ruby scripts may be executed.

# Convert to string
str_value = str(value)

# Remove null bytes which can cause issues
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Silent truncation of large values without logging

Values exceeding MAX_ENV_VAR_VALUE_LENGTH are silently truncated, which could cause hard-to-debug issues when legitimate large values are cut off.

Suggested fix:

Add a warning log when truncation occurs, similar to how rejected names are logged.

# Ensure it doesn't start with a digit
if sanitized and sanitized[0].isdigit():
sanitized = "_" + sanitized

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Silent truncation of long names may cause debugging confusion

Long variable names are silently truncated without any warning; users may not realize their variable was shortened, leading to unexpected mismatches.

Suggested fix:

Consider logging a warning (similar to the rejection case) when truncation occurs, so users can identify potential issues.

may be needed for legitimate use cases. Scripts consuming these values
should properly quote variables (e.g., "$VAR" instead of $VAR) to prevent
command injection.
"""
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

📝 Nit: Redundant null byte removal before regex

The explicit replace("\x00", "") is redundant since CONTROL_CHARS_PATTERN already matches null bytes in the [\x00-\x08] range.

Suggested fix:

Remove the explicit null byte replacement and rely solely on the regex pattern.

Copy link

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Bugbot Autofix is OFF. To automatically fix reported issues with Cloud Agents, enable Autofix in the Cursor dashboard.

"TMP",
"TEMP",
}
)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sensitive env var blocklist missing BASH_ENV and ENV

Medium Severity

The SENSITIVE_ENV_VARS blocklist is missing BASH_ENV and ENV, which are well-known code execution vectors. BASH_ENV causes bash to execute the specified file when invoked non-interactively, and ENV serves a similar role for POSIX shells. While the direct /bin/sh invocation may not honor BASH_ENV in POSIX mode, any sub-scripts or commands that invoke bash directly would inherit and respect it, enabling arbitrary code execution. These are critical omissions for a blocklist explicitly designed to prevent "privilege escalation or code execution."

Fix in Cursor Fix in Web

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants