Skip to content

Fix cyclic AST recursion DoS (issue #196)#213

Merged
thomas-chauchefoin-tob merged 12 commits intomasterfrom
fix-issue-196-cyclic-ast-recursion
Jan 28, 2026
Merged

Fix cyclic AST recursion DoS (issue #196)#213
thomas-chauchefoin-tob merged 12 commits intomasterfrom
fix-issue-196-cyclic-ast-recursion

Conversation

@dguido
Copy link
Copy Markdown
Member

@dguido dguido commented Jan 22, 2026

Summary

  • Fix RecursionError when analyzing cyclic pickle structures (e.g., L = []; L.append(L))
  • Detect cycles at mutation points (Append, Appends, SetItem, SetItems, AddItems) and replace self-references with <cyclic-ref> placeholder AST nodes
  • Add has_cycles property to Pickled class for downstream awareness

Fixes #196

Test plan

  • test_cyclic_pickle_dos passes (previously marked as expected failure)
  • test_cyclic_pickle_creates_placeholder verifies placeholder is created
  • All 87 existing tests pass
  • Linter checks pass

🤖 Generated with Claude Code

dguido and others added 3 commits January 22, 2026 01:17
Adds new subcommands to expose PyTorchModelWrapper and polyglot module
functionality via the CLI while maintaining full backward compatibility.

New commands:
- fickling pytorch identify FILE      - Detect PyTorch format(s)
- fickling pytorch show FILE          - Decompile internal pickle
- fickling pytorch check-safety FILE  - Safety check internal pickle
- fickling pytorch inject FILE ...    - Inject payload into model

- fickling polyglot identify FILE     - Identify all possible formats
- fickling polyglot properties FILE   - File property analysis
- fickling polyglot create F1 F2 -o O - Create polyglot file

All commands support --json output and gracefully handle missing torch
dependency with helpful installation instructions.

Closes #101

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Replace nested subcommands (pytorch/polyglot) with flat, auto-detecting commands:
- fickling check FILE: Safety check any pickle/model (auto-detects format)
- fickling inject FILE -c CODE -o OUT: Inject payload into any format
- fickling info FILE: Show format and properties
- fickling create-polyglot F1 F2 -o OUT: Create polyglot files

Key changes:
- Add auto_load() to loader.py for automatic format detection
- Rewrite cli.py with flat command structure
- Delete cli_pytorch.py and cli_polyglot.py (absorbed into cli.py)
- Maintain full backward compatibility with legacy flags
- Update tests for new interface

The tool now figures out if input is PyTorch, TorchScript, plain pickle, etc.
without users needing to specify format explicitly.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Cyclic pickle structures (e.g., `L = []; L.append(L)`) caused
RecursionError when fickling analyzed them. This fix detects cycles
at mutation points (Append, Appends, SetItem, SetItems, AddItems)
and replaces self-references with `<cyclic-ref>` placeholder AST nodes.

Changes:
- Add `_has_cycle` flag to Interpreter to track cycle detection
- Modify mutation operations to check if value being added `is` the container
- Add `has_cycles` property to Pickled class for downstream awareness
- Update tests to verify the fix works correctly

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@dguido dguido requested a review from ESultanik as a code owner January 22, 2026 08:18
dguido added a commit that referenced this pull request Jan 22, 2026
The cyclic AST recursion issue (#196) is being fixed separately in PR #213.
Mark this test as expected failure so PR #211 CI can pass.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
This was erroneously added and later removed in PR #210. The visitor
override tracks visited nodes by id to prevent infinite recursion when
traversing cyclic AST structures created via MEMOIZE + GET opcodes.

Co-Authored-By: Dan Guido <dan@trailofbits.com>
Co-Authored-By: Claude Code <noreply@anthropic.com>
Cover dict value cycles, dict key cycles, and set cycles
to ensure the cycle detection feature handles all supported
mutation opcodes (SetItem, AddItems).

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Match Python's pickle implementation which uses stack[-1] to peek
at the set without removing it from the stack.

Ref: https://github.com/python/cpython/blob/6ea3f8cd7ff4ff9769ae276dabc0753e09dca998/Lib/pickle.py#L1840

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Match Python's repr output for cyclic structures:
- List cycles: [[...]] instead of [<cyclic-ref>]
- Dict cycles: {'key': {...}} instead of {'key': <cyclic-ref>}
- Set cycles: {{...}} instead of {<cyclic-ref>}

This makes fickling's AST output directly comparable to what Python
displays when printing cyclic data structures.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Match Python's behavior for unhashable cyclic references:
- Dict key cycles: raise InterpretationError (Python raises TypeError)
- Set cycles: raise InterpretationError (Python raises TypeError)
- Dict value cycles: still allowed with {...} placeholder
- List cycles: still allowed with [...] placeholder

This distinguishes between valid cyclic structures that Python can
represent (lists containing themselves, dicts with self-referencing
values) and impossible structures that would fail at runtime.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@thomas-chauchefoin-tob
Copy link
Copy Markdown
Collaborator

  • Added tests for all cycle detection corner cases (list, dict value, dict key, set)
  • Fixed AddItems to peek instead of pop
  • Cyclic references now use Python-style repr with Ellipsis: [[...]], {'key': {...}}
  • Impossible cycles (dict key, set) now raise InterpretationError, like Python raises TypeError

@thomas-chauchefoin-tob thomas-chauchefoin-tob merged commit a22a9a9 into master Jan 28, 2026
11 checks passed
@thomas-chauchefoin-tob thomas-chauchefoin-tob deleted the fix-issue-196-cyclic-ast-recursion branch January 28, 2026 00:06
thomas-chauchefoin-tob added a commit that referenced this pull request Mar 19, 2026
…211)

* Add DoS protection against expansion attacks (Billion Laughs style)

Implement defense in depth against exponential expansion attacks that could
cause Fickling to hang or consume excessive memory:

1. Static pattern detection via ExpansionAttackAnalysis:
   - Detects high GET/PUT ratio (>10x suspicious, >50x likely unsafe)
   - Detects excessive DUP operations (>100 suspicious)

2. Runtime resource limits via InterpreterLimits:
   - max_opcodes: 1,000,000
   - max_stack_depth: 10,000
   - max_memo_size: 100,000
   - max_get_ratio: 50 (GETs per PUT)

3. New exception types:
   - ResourceExhaustionError for limit violations
   - ExpansionAttackError for expansion attack detection

4. Updated opcode classes to track GET/PUT operations:
   - BinGet, LongBinGet, Get call track_get()
   - BinPut, Put, LongBinPut, Memoize call track_put()

5. AnalysisContext catches ResourceExhaustionError and returns
   LIKELY_OVERTLY_MALICIOUS severity instead of propagating exception

Fixes #111

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Mark test_cyclic_pickle_dos as expected failure

The cyclic AST recursion issue (#196) is being fixed separately in PR #213.
Mark this test as expected failure so PR #211 CI can pass.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Fix review issues in DoS protection implementation

- Move _check_limits() after opcode.run() so counters are current
- Add ResourceExhaustionError handling in Pickled.ast (returns empty AST)
- Broaden AnalysisContext.analyze() catch to handle ValueError,
  IndexError, RecursionError from malformed pickles
- Handle put_count == 0 edge case in ExpansionAttackAnalysis
- Simplify combined indicators logic (remove redundant condition)
- Make InterpreterLimits frozen with __post_init__ validation
- Hardcode resource_type in ExpansionAttackError
- Use round() instead of int() for ratio in error messages
- Add ResourceExhaustionError handling in CLI decompile path
- Split resource limit tests per limit type (opcodes, stack, memo)
- Add InterpreterLimits validation test
- Remove @expectedfailure from test_cyclic_pickle_dos (now handled)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Fix failing linting

* Fix failing linting

* Make ExpansionAttackAnalysis thresholds configurable via __init__

Allows callers to override GET/PUT ratio and DUP count thresholds
without monkey-patching class attributes.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Add minimal test cases for nested Billion Laughs expansion patterns

Reproduces globalLaughs.pt (DUP-based) and billionLaughsAlt.pkl
(memo-based) DoS patterns that evade current flat thresholds by
spreading operations across nested LIST layers.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Make resource exhaustion detection intentional via dedicated analysis

Previously detection depended on UnusedVariables accidentally
re-triggering the error. Now Pickled sets a flag and a dedicated
analysis checks it.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Potential fix for pull request finding 'Unused import'

Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com>

* Potential fix for pull request finding 'Imprecise assert'

Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com>

* Potential fix for pull request finding 'Imprecise assert'

Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com>

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: Thomas Chauchefoin <thomas.chauchefoin@trailofbits.com>
Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Denial of Service via Infinite Recursion on Cyclic Pickles

2 participants