Add security context with deserialization guard #1

iyehuda · 2025-10-22T10:58:54Z

Overview

Introduces a context-aware taint tracking mechanism to protect against
arbitrary code execution during pickle deserialization.

Core mechanism

- Added security_ctx struct to PyContext with deserialization_taint_counter
- New internal API to track deserialization state:
  - _PyContext_IncrementDeserializationTaint()
  - _PyContext_DecrementDeserializationTaint()
  - _PyContext_IsDeserializationTainted()
- Taint counter is incremented when entering pickle.loads() and decremented
  on exit (both success and error paths)
- Taint state propagates to new contexts created during deserialization
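
The counter discipline described above can be modelled in pure Python. This is only an illustrative sketch, not the PR's implementation: the PR keeps the counter in the C-level PyContext struct, whereas this model uses a `contextvars.ContextVar`, and the names `increment_taint`, `decrement_taint`, `is_tainted`, and `guarded_loads` are hypothetical stand-ins for the internal C API.

```python
import contextvars
import pickle

# Hypothetical stand-in for deserialization_taint_counter. The PR stores the
# real counter in the C struct precisely so Python code cannot reset it; a
# ContextVar like this one offers no such protection.
_taint_counter = contextvars.ContextVar("deserialization_taint", default=0)


def increment_taint():
    """Model of _PyContext_IncrementDeserializationTaint()."""
    _taint_counter.set(_taint_counter.get() + 1)


def decrement_taint():
    """Model of _PyContext_DecrementDeserializationTaint()."""
    _taint_counter.set(_taint_counter.get() - 1)


def is_tainted():
    """Model of _PyContext_IsDeserializationTainted()."""
    return _taint_counter.get() > 0


def guarded_loads(data):
    """Increment on entry, decrement on exit for both success and error."""
    increment_taint()
    try:
        return pickle.loads(data)
    finally:
        decrement_taint()
```

A counter rather than a boolean lets nested deserialization calls unwind correctly: the context stays tainted until the outermost call returns.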

Design rationale

Extends PyContext (thread context) as it's the existing mechanism for context
variables and is natively supported by higher-level concurrency models like
asyncio. Storing the security state in the C struct prevents malicious user
code from overriding or bypassing the protection.
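
The propagation behaviour this rationale relies on can be observed with an ordinary context variable. The sketch below assumes nothing about the PR's C code; it only shows that asyncio copies the current context when a task is created, so state set before the task starts is visible inside it. The variable `taint` and the coroutine names are hypothetical.

```python
import asyncio
import contextvars

# Hypothetical flag standing in for the taint state held in PyContext.
taint = contextvars.ContextVar("taint", default=0)


async def child():
    # Runs in a copy of the context that was current when the task was created.
    return taint.get()


async def main():
    taint.set(1)  # pretend deserialization has started
    during = await asyncio.create_task(child())
    taint.set(0)  # pretend deserialization has finished
    after = await asyncio.create_task(child())
    return during, after


result = asyncio.run(main())
print(result)  # (1, 0)
```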

Protection via audit hooks

When deserialization is active, blocks dangerous operations including:
- System commands (os.system, subprocess.Popen)
- File modifications and thread creation
- Dynamic loading (ctypes) and network operations

Introduces a context-aware taint tracking mechanism to protect against
arbitrary code execution during pickle deserialization.

Core mechanism:
- Added `security_ctx` struct to PyContext with deserialization_taint_counter
- New internal API to track deserialization state:
  * _PyContext_IncrementDeserializationTaint()
  * _PyContext_DecrementDeserializationTaint()
  * _PyContext_IsDeserializationTainted()
- Taint counter is incremented when entering pickle.loads() and decremented
  on exit (both success and error paths)
- Taint state propagates to new contexts created during deserialization

Design rationale:
Extends PyContext (thread context) as it's the existing mechanism for context
variables and is natively supported by higher-level concurrency models like
asyncio. Storing the security state in the C struct prevents malicious user
code from overriding or bypassing the protection.

Protection via audit hooks:
- When deserialization is active, blocks dangerous operations including:
  * System commands (os.system, subprocess.Popen)
  * Code execution (exec, compile)
  * File modifications and thread creation
  * Dynamic loading (ctypes) and network operations

- Updated `test_os_system_allowed_outside_pickle` to use `os.system('true')` for better clarity.
- Improved handling of event loop policies in `test_taint_cleared_on_error` to ensure proper cleanup.
- Removed outdated tests for `exec` and `compile` blocking, as they are no longer blocked.
- Introduced a new `HARDEN_MODE` configuration to control the behavior of deserialization security checks, allowing for warnings or errors based on the mode set via the `PYTHONHARDENMODE` environment variable.
- Fix failing tests

- Introduced a Dockerfile for building a Python environment based on Debian.
- Configured essential packages and optimizations for Python installation.
- Added a .dockerignore file to exclude unnecessary files and directories from the Docker context, improving build efficiency.

iyehuda added 3 commits October 19, 2025 17:19
