Skip to content

Restrict PickleHandler deserialization to safe builtins#5946

Open
White-Mouse wants to merge 3 commits into
crewAIInc:mainfrom
White-Mouse:codex/restrict-training-pickle
Open

Restrict PickleHandler deserialization to safe builtins#5946
White-Mouse wants to merge 3 commits into
crewAIInc:mainfrom
White-Mouse:codex/restrict-training-pickle

Conversation

@White-Mouse
Copy link
Copy Markdown

@White-Mouse White-Mouse commented May 27, 2026

PickleHandler.load() currently uses unrestricted pickle.load() to deserialize *.pkl artifacts, notably training_data.pkl and trained_agents_data.pkl. Because trained guidance files are loaded by agents during normal execution, a crafted pickle placed in the working directory can execute arbitrary code during deserialization (CWE-502).

This PR keeps the existing on-disk format but hardens loads by using a RestrictedUnpickler allowlist for primitive/container builtins only. This preserves the expected CrewAI training artifact shapes while rejecting unsafe globals such as builtins.exec or os.system.

Compatibility note

This change intentionally fails closed for legacy pickle artifacts that require custom classes, Pydantic models, datetime/path objects, or other non-primitive globals. CrewAI training artifacts should contain JSON-like primitive/container data. If a user has a custom or manually edited pickle that depends on arbitrary class imports, it will now be rejected instead of imported during load.

Tests

  • Existing save/load coverage for simple dictionaries still passes.
  • Added round-trip coverage for training_data.pkl-shaped data: agent id -> iteration -> initial_output / human_feedback / improved_output.
  • Added round-trip coverage for trained_agents_data.pkl-shaped data: role -> suggestions / quality / final_summary.
  • Added a malicious pickle regression test using reduce -> builtins.exec, with an assertion that the side effect marker is not set.

Verified locally:

  • uv run pytest -q lib/crewai/tests/utilities/test_file_handler.py
  • uv run ruff check --isolated lib/crewai/tests/utilities/test_file_handler.py
  • uv run ruff format --check --isolated lib/crewai/tests/utilities/test_file_handler.py

Summary by CodeRabbit

  • Bug Fixes

    • Improved security when loading training artifacts by restricting which serialized object types are accepted, reducing risk from malicious payloads.
    • Preserved existing behavior for missing or incomplete artifact files (returns empty result).
  • Tests

    • Added round‑trip tests to verify artifact data shapes are preserved on save/load.
    • Added a security test to confirm unsafe serialized payloads are rejected.

Review Change Stack

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 27, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 37389f7a-009d-4f5f-90f9-0a0ce65a1924

📥 Commits

Reviewing files that changed from the base of the PR and between c45c9f6 and f3fcca0.

📒 Files selected for processing (1)
  • lib/crewai/tests/utilities/test_file_handler.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • lib/crewai/tests/utilities/test_file_handler.py

📝 Walkthrough

Walkthrough

Adds a restricted pickle unpickler with a module allowlist and updates PickleHandler.load() to use it; tests verify round-trip loads and that crafted malicious pickle payloads are rejected.

Changes

Pickle Deserialization Security

Layer / File(s) Summary
Restricted unpickler implementation
lib/crewai/src/crewai/utilities/file_handler.py
Module-private allowlist constant defines permitted global types; custom _RestrictedUnpickler subclass intercepts global unpickling and raises pickle.UnpicklingError for disallowed globals.
Integration into PickleHandler.load()
lib/crewai/src/crewai/utilities/file_handler.py
PickleHandler.load() now deserializes via _RestrictedUnpickler(file).load() instead of pickle.load(file), enforcing the allowlist during artifact deserialization.
Tests: round-trips and unsafe pickle rejection
lib/crewai/tests/utilities/test_file_handler.py
Adds import pickle, two round-trip tests for training data and trained agents artifact shapes, and test_load_rejects_unsafe_pickle_globals which crafts a hostile pickle and asserts PickleHandler.load() raises pickle.UnpicklingError and does not execute the injected payload.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐇 I guard the jar where pickles sleep,
A whitelist key I safely keep,
No crafty code may creep inside,
The rabbit nods—safe data abide. 🥒🔒

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 11.11% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately and concisely summarizes the main change: restricting PickleHandler deserialization to safe builtins only, which is the core security improvement in this PR.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@lib/crewai/tests/utilities/test_file_handler.py`:
- Line 84: The test currently uses a fixed environment key
"CREWAI_PICKLE_HANDLER_EXPLOITED" (marker) and unconditionally deletes it, which
can remove pre-existing environment state; change the test to generate a unique
marker (e.g., append a UUID) or otherwise derive a temp env var name, save the
original value using original = os.environ.get(marker) before mutating, perform
the exploit assertions, and then restore the environment in a finally block: if
original is None remove the key, otherwise set it back to original; apply this
same pattern to the other block covering lines 95-101 so the test does not
delete or overwrite external env state.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 007310e1-421d-47e7-a183-e3fea7d167c5

📥 Commits

Reviewing files that changed from the base of the PR and between 9eb7b1a and c45c9f6.

📒 Files selected for processing (1)
  • lib/crewai/tests/utilities/test_file_handler.py

Comment thread lib/crewai/tests/utilities/test_file_handler.py Outdated
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants