@ishanrajsingh ishanrajsingh commented Oct 3, 2025

Description

Fixes a memory leak in `responses.parse()` caused by unbounded `TypeAdapter` caching in multi-threaded environments.

Problem

The current implementation uses an unbounded `lru_cache` keyed by dynamically generated generic types (`ParsedResponseOutputMessage[MyClass]`). In multi-threaded scenarios, Pydantic can regenerate these types as distinct objects with different hashes, preventing cache hits and causing unbounded memory growth.

Solution

  1. Bounded Cache: Limit cache size to 128 entries per thread
  2. Stable Keys: Use fully-qualified type names instead of type object hashes
  3. Thread Safety: Implement thread-local caching to prevent conflicts

Changes

  • Added src/openai/lib/_parsing/_type_adapter_cache.py with thread-safe caching
  • Modified src/openai/lib/_parsing/_completions.py to use new cache
  • Added comprehensive tests in tests/test_type_adapter_cache.py

Testing

  • Unit tests pass
  • Memory leak reproduction test passes
  • Multi-threaded scenarios tested
  • No performance regression
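
A self-contained sketch of the kind of reproduction test described above (names are illustrative, not the PR's actual test code): repeatedly "regenerate" logically identical types and check that a cache keyed by stable names stays bounded and keeps hitting, including across threads.

```python
import threading
from functools import lru_cache

@lru_cache(maxsize=128)  # bounded cache keyed by a stable string
def adapter_by_name(type_name):
    return ("adapter", type_name)

def regenerated_type():
    # Fresh type object each call, same logical identity.
    return type("ParsedResponseOutputMessage[MyClass]", (), {})

def check_cache_stays_bounded():
    for _ in range(10_000):
        tp = regenerated_type()
        adapter_by_name(f"{tp.__module__}.{tp.__qualname__}")
    info = adapter_by_name.cache_info()
    assert info.currsize <= 128
    assert info.hits > 0  # stable keys turn regenerations into cache hits

def check_threads_do_not_grow_cache():
    threads = [threading.Thread(target=check_cache_stays_bounded) for _ in range(4)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    assert adapter_by_name.cache_info().currsize <= 128
```

With the old scheme (keying by the regenerated type objects themselves), the equivalent loop would add one cache entry per iteration instead of hitting.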

Closes

Fixes #2672 (Unrestricted caching keyed by generated types causes memory leak in multi-threaded regimes)

Checklist

  • Tests added/updated
  • Documentation updated (if needed)
  • Follows project coding standards

…cache

- Replace unbounded lru_cache with thread-safe bounded cache
- Use stable type names as cache keys instead of type hashes
- Implement thread-local caching to prevent hash conflicts
- Add comprehensive tests for thread safety and memory bounds

Fixes openai#2672
@ishanrajsingh ishanrajsingh requested a review from a team as a code owner October 3, 2025 15:52