
Add graceful degradation and scan API for robust scanning #218

Merged
thomas-chauchefoin-tob merged 8 commits into master from graceful-scan-api
Mar 21, 2026

@dguido commented Jan 23, 2026

Summary

  • Adds ScanResult class with is_safe, severity, results, and errors attributes
  • Adds scan_file() function for graceful single-file scanning
  • Adds scan_archive() function for scanning ZIP archives with graceful error handling
  • Adds RelaxedZipFile class that ignores CRC validation errors (matches PyTorch behavior)
  • Exports new functions from package for easy programmatic use

API Example

import fickling

# Scan a single file
result = fickling.scan_file("model.pkl")
if not result:
    print(f"Unsafe: {result.severity}")
    for error in result.errors:
        print(f"  Error: {error}")

# Scan an archive
results = fickling.scan_archive("models.zip")
for name, result in results.items():
    print(f"{name}: {'safe' if result else 'unsafe'}")

Test plan

  • All existing tests pass
  • Linters pass
  • Manual testing with corrupted archives

🤖 Generated with Claude Code

- Add ScanResult class with is_safe, severity, results, and errors
- Add scan_file() for graceful single-file scanning
- Add scan_archive() for scanning ZIP archives
- Add RelaxedZipFile class that ignores CRC validation errors
- Add _scan_bytes() helper for in-memory scanning
- Export new functions from package

This API provides picklescan-like graceful degradation, continuing
to scan even when individual files fail to parse.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
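
For readers of this thread, here is a minimal sketch of what "continuing to scan even when individual files fail" can look like for a ZIP archive. It is not fickling's code. The function name `scan_members`, the `analyze` callback (a stand-in for the in-memory scanning helper), and the string format of the recorded errors are all hypothetical. Only the standard-library `zipfile` module is used.

```python
import zipfile


def scan_members(archive_file, analyze, graceful=False):
    """Run `analyze` on every member of a ZIP archive (hypothetical helper).

    In graceful mode, a failure to open the archive or to read one member
    is recorded as an error string and scanning continues. Otherwise the
    exception propagates to the caller.
    """
    outcomes = {}
    try:
        archive = zipfile.ZipFile(archive_file)
    except (zipfile.BadZipFile, OSError) as exc:
        if not graceful:
            raise
        return {"<archive>": f"error: {type(exc).__name__}: {exc}"}

    with archive:
        for name in archive.namelist():
            # Only the read is guarded, so I/O failures are handled here
            # while exceptions raised by `analyze` are not swallowed.
            try:
                data = archive.read(name)
            except (zipfile.BadZipFile, OSError) as exc:
                if not graceful:
                    raise
                outcomes[name] = f"error: {type(exc).__name__}: {exc}"
                continue
            outcomes[name] = analyze(data)
    return outcomes
```

With `graceful=True`, a member whose stored CRC does not match its data ends up as an error entry and the remaining members are still analyzed. With `graceful=False`, the same archive raises `zipfile.BadZipFile`.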
@dguido requested a review from ESultanik as a code owner on January 23, 2026 02:48
dguido and others added 7 commits February 20, 2026 13:42
- Replace RelaxedZipFile CRC=0 hack (caused BadZipFile on valid files)
  with _expected_crc=None on the returned file handle
- Make ScanResult.is_safe a computed property instead of a stored bool;
  __bool__ now returns False when errors exist (incomplete scan != safe)
- Escalate severity in _scan_bytes when check_safety() or outer parse
  exceptions occur in graceful mode
- Deduplicate scan_file by delegating to _scan_bytes
- Change graceful default to False for scan_file and scan_archive
- Remove dead code: data-is-None branch, is_safe kwarg, read() override
- Add 12 tests covering scan_file, scan_archive, ScanResult, RelaxedZipFile

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
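
The `ScanResult` change in this commit (computed `is_safe`, and `__bool__` returning False when errors exist) can be illustrated with a small stand-alone sketch. This is not the merged implementation. The `Severity` enum below is a simplified stand-in, and the threshold used by `is_safe` is an assumption made for the example.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Severity(IntEnum):
    # Simplified stand-in for the scanner's severity levels (assumed ordering).
    LIKELY_SAFE = 0
    POSSIBLY_UNSAFE = 1
    SUSPICIOUS = 2
    LIKELY_UNSAFE = 3


@dataclass
class ScanResult:
    severity: Severity = Severity.LIKELY_SAFE
    results: list = field(default_factory=list)
    errors: list = field(default_factory=list)

    @property
    def is_safe(self) -> bool:
        # Computed from severity on every access instead of being stored.
        # Assumption: only LIKELY_SAFE counts as safe in this sketch.
        return self.severity == Severity.LIKELY_SAFE

    def __bool__(self) -> bool:
        # An incomplete scan is not a safe scan: any recorded error
        # makes the result falsy, whatever the severity says.
        return self.is_safe and not self.errors
```

The point of the design is that `if result:` can never succeed for a file that was only partially analyzed, even if nothing suspicious was found in the part that did parse.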
- Move RelaxedZipFile from polyglot.py to loader.py to avoid requiring
  numpy/torch for scan_archive (RelaxedZipFile only needs stdlib zipfile)
- Add RuntimeWarning when _expected_crc attribute is missing on future
  Python versions so CRC bypass degradation is not silent
- Widen scan_archive outer exception handler to catch OSError (covers
  FileNotFoundError, PermissionError) in graceful mode, matching scan_file
- Separate archive.read() try/except from _scan_bytes() call so the
  inner catch only handles I/O errors, not analysis-layer exceptions
- Escalate _scan_bytes outer except Exception to LIKELY_UNSAFE (was
  SUSPICIOUS — for a security scanner, unanalyzable files should be
  treated as dangerous)
- Include exception type in all error messages for debuggability
- Add 8 new tests: archive nonexistent graceful/non-graceful, bad ZIP
  non-graceful raises, mixed good/bad archive members, all pickle
  extensions, is_safe boundary at POSSIBLY_UNSAFE, analysis error
  severity escalation, RelaxedZipFile CRC bypass regression

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
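
A rough sketch of the CRC bypass described in these commits, for illustration only. It follows the approach named above (clearing `_expected_crc` on the handle returned by `open()` and warning when the attribute is absent), but the class body is an assumption and not the code merged here. Note that `_expected_crc` is a private CPython attribute of `zipfile.ZipExtFile` and may change between Python versions.

```python
import warnings
import zipfile


class RelaxedZipFile(zipfile.ZipFile):
    """ZipFile variant that skips CRC validation when reading (sketch)."""

    def open(self, name, mode="r", pwd=None, *, force_zip64=False):
        handle = super().open(name, mode, pwd, force_zip64=force_zip64)
        if mode == "r":
            if hasattr(handle, "_expected_crc"):
                # With no reference CRC, the reader skips the CRC check.
                handle._expected_crc = None
            else:
                # Make the loss of the bypass visible instead of silent.
                warnings.warn(
                    "ZipExtFile has no _expected_crc attribute; "
                    "CRC validation could not be disabled",
                    RuntimeWarning,
                    stacklevel=2,
                )
        return handle
```

Because `ZipFile.read()` goes through `open()`, overriding `open()` alone is enough for both access paths.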
- Narrow file-open exception catch from Exception to OSError
- Rename scan_archive to scan_zip_archive to clarify ZIP-only scope
- Use PurePosixPath.suffix for extension parsing
- Remove .pt/.pth from extension filter (these are ZIP containers, not raw pickles)
- Update docstring to reference fickling.polyglot for other archive formats

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
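
As an illustration of the `PurePosixPath.suffix` change, the sketch below filters archive member names by extension. The helper name `looks_like_pickle`, the extension set, and the case-insensitive comparison are assumptions for the example. The actual filter in the merged code may differ.

```python
from pathlib import PurePosixPath

# Hypothetical extension set for the example; .pt/.pth are left out
# because they are ZIP containers rather than raw pickles.
PICKLE_EXTENSIONS = {".pkl", ".pickle"}


def looks_like_pickle(member_name: str) -> bool:
    # ZIP member names use forward slashes, so PurePosixPath parses them
    # the same way on every platform. `suffix` looks only at the final
    # path component, so a directory named "x.pkl" does not match.
    return PurePosixPath(member_name).suffix.lower() in PICKLE_EXTENSIONS
```

Compared with splitting on `"."` by hand, this avoids false matches on dotted directory names and handles names without any extension.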
@thomas-chauchefoin-tob merged commit 818a0ec into master on Mar 21, 2026
12 checks passed
@thomas-chauchefoin-tob deleted the graceful-scan-api branch on March 21, 2026 00:19
2 participants