Pensar - auto fix for Unsanitized LLM Output Execution in Autonomous Code Generation #21

pensarapp · 2025-04-01T23:14:54Z

Type	Identifier	Message	Severity	Link
Application	ML09	The generate_code function uses unsanitized LLM outputs that directly determine the content of the Dockerfile and application files. By not validating or imposing additional guardrails on the LLM output, an adversary could manipulate the prompts to generate malicious code. This constitutes an instance of CWE ML09: Manipulation of ML Model Outputs Affecting Integrity, where the integrity of the system output may be compromised if the model is tricked into producing harmful or insecure code that is then executed.	high	Link

The vulnerability stems from unsanitized LLM outputs being executed without appropriate validation, which could lead to malicious code execution (CWE ML09: Manipulation of ML Model Outputs Affecting Integrity).

The patch addresses this vulnerability through multiple layers of security:

Added a new validate_security() function that:
- Checks Dockerfiles against a blocklist of dangerous patterns
- Verifies the base image is the required Python 3.10 slim image
- Scans code files for dangerous patterns like eval(), exec(), os.system(), etc.
- Prevents path traversal attacks by validating filenames
Enhanced the generate_code() function by:
- Adding explicit security constraints to the system prompt
- Validating all generated code using the security validation function
- Raising an exception if security issues are detected
Improved the run_locally() function with:
- Validation of all content before execution
- Path traversal prevention with normalization and prefix checking
- Docker security constraints including:
  - Read-only filesystem
  - Dropping all capabilities
  - Network isolation
  - Resource limits (memory, CPU, process count)
  - Privilege escalation prevention
- Timeouts for both build and execution phases
Added security validation in the validate_output() function to ensure that even LLM-suggested changes in later iterations are checked for security issues.
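
The PR description does not include the body of validate_security(), so the following is only a minimal sketch of the checks listed above. The pattern lists, the exact image tag `python:3.10-slim`, the helper `is_safe_filename()`, and the function signature are all assumptions made for illustration, not the patch's actual code.

```python
import os
import re

# Hypothetical blocklists: the real patterns used by the patch are not shown.
DOCKERFILE_BLOCKLIST = [
    r"curl\s+[^|]*\|\s*(ba)?sh",  # piping a download straight into a shell
    r"--privileged",
    r"\bADD\s+https?://",         # fetching remote content at build time
]
CODE_BLOCKLIST = [
    r"\beval\s*\(",
    r"\bexec\s*\(",
    r"\bos\.system\s*\(",
]
REQUIRED_BASE_IMAGE = "python:3.10-slim"  # assumed tag for "Python 3.10 slim"


def is_safe_filename(name: str) -> bool:
    """Reject empty names, absolute paths, and paths that climb out via '..'."""
    if not name or os.path.isabs(name):
        return False
    normalized = os.path.normpath(name)
    return not (normalized == ".." or normalized.startswith(".." + os.sep))


def validate_security(dockerfile: str, files: dict[str, str]) -> list[str]:
    """Return a list of issues; an empty list means nothing was flagged."""
    issues: list[str] = []

    # 1. Dockerfile blocklist.
    for pattern in DOCKERFILE_BLOCKLIST:
        if re.search(pattern, dockerfile, re.IGNORECASE):
            issues.append(f"Dockerfile matches blocked pattern: {pattern}")

    # 2. Every FROM line must name the required base image.
    base_images = [
        line.split()[1]
        for line in dockerfile.splitlines()
        if line.strip().upper().startswith("FROM ") and len(line.split()) > 1
    ]
    if not base_images or any(img != REQUIRED_BASE_IMAGE for img in base_images):
        issues.append(f"Base image must be {REQUIRED_BASE_IMAGE}")

    # 3. Filename and code-pattern checks for each generated file.
    for filename, content in files.items():
        if not is_safe_filename(filename):
            issues.append(f"Unsafe filename: {filename}")
        for pattern in CODE_BLOCKLIST:
            if re.search(pattern, content):
                issues.append(f"{filename} matches blocked pattern: {pattern}")

    return issues
```

A caller such as generate_code() could raise an exception whenever the returned list is non-empty. Regex blocklists of this kind are easy to evade (for example via `getattr` or string concatenation), which is why the container restrictions in run_locally() matter as a second layer.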

These layered measures are intended to keep malicious code from being generated in the first place. If generated code nonetheless slips past the pattern checks, the container restrictions limit the privileges it runs with and the resources it can reach.
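
As an illustration of the Docker constraints described for run_locally(), the sketch below builds a `docker run` command with standard Docker CLI hardening flags and runs it under a timeout. The function names, the specific limit values (256 MB, half a CPU, 64 processes), and the 60-second timeout are assumptions; the PR description does not state the values the patch uses.

```python
import subprocess


def build_docker_run_command(image_tag: str) -> list[str]:
    """Assemble a locked-down `docker run` invocation (limit values are assumed)."""
    return [
        "docker", "run", "--rm",
        "--read-only",                       # read-only root filesystem
        "--cap-drop=ALL",                    # drop all Linux capabilities
        "--network=none",                    # no network access
        "--memory=256m",                     # memory limit
        "--cpus=0.5",                        # CPU limit
        "--pids-limit=64",                   # process count limit
        "--security-opt=no-new-privileges",  # block privilege escalation
        image_tag,
    ]


def run_with_timeout(command: list[str], timeout_seconds: int = 60):
    """Run a command; subprocess.TimeoutExpired is raised if it runs too long."""
    return subprocess.run(
        command,
        capture_output=True,
        text=True,
        timeout=timeout_seconds,
        check=False,
    )
```

The same run_with_timeout() helper could wrap the `docker build` step so that both the build and execution phases are bounded.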

restack-app · 2025-04-01T23:14:58Z

No applications have been configured for previews targeting branch: master. To do so go to restack console and configure your applications for previews.

Fix security issue: Unsanitized LLM Output Execution in Autonomous Code Generation (ML09)

cf237b4
