bug: Resource pool allocation times out with large IPv6 prefix pools causing gunicorn worker crash #7897

@BeArchiTek

Component

  • API Server / GraphQL

Infrahub version

1.6.1

Current Behavior

When allocating resources from a "large" IPv6 prefix pool, the server times out and a gunicorn worker dies.

Failing scenarios:

  • One /48 prefix pool used to create /127 prefixes → timeout + worker crash
  • Two /56 prefix pools used to create /127 prefixes → timeout + worker crash

Working scenarios:

  • One /64 prefix pool used to create /127 prefixes → works fine
  • One /48 prefix pool used to create /64 prefixes → works fine

The issue appears to be related to the size of the resource pool and/or the number of potential allocations that need to be computed.
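To put numbers on the scenarios above, here is a small sketch that counts how many prefixes of the requested length fit in a pool. It only uses Python's standard `ipaddress` module; the helper name `candidate_count` and the `2001:db8::` documentation prefixes are illustrative assumptions, not values taken from the failing deployment or from Infrahub's code.

```python
import ipaddress


def candidate_count(pool: str, new_prefixlen: int) -> int:
    """Number of prefixes of length new_prefixlen that fit inside pool.

    Hypothetical helper for illustration only; not part of Infrahub.
    """
    network = ipaddress.ip_network(pool)
    if not network.prefixlen <= new_prefixlen <= network.max_prefixlen:
        raise ValueError("requested prefix length does not fit in the pool")
    # Each extra bit of prefix length doubles the number of candidates.
    return 1 << (new_prefixlen - network.prefixlen)


# (pool, requested prefix length, outcome reported in this issue)
scenarios = [
    ("2001:db8::/48", 127, "timeout"),
    ("2001:db8::/56", 127, "timeout (two such pools)"),
    ("2001:db8::/64", 127, "works"),
    ("2001:db8::/48", 64, "works"),
]

for pool, length, outcome in scenarios:
    count = candidate_count(pool, length)
    print(f"{pool} -> /{length}: 2^{count.bit_length() - 1} candidates, {outcome}")
```

None of the /127 scenarios can be handled by enumerating candidates, so the difference between the working and failing cases is presumably somewhere in how the search over that space is bounded.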

Expected Behavior

Resource allocation from large prefix pools should complete successfully without timing out or crashing the gunicorn worker, even if it takes longer to process.

Steps to Reproduce

  1. Create an IPv6 prefix pool with a /48 prefix as the resource
  2. Attempt to allocate /127 prefixes from this pool
  3. Observe timeout and gunicorn worker crash

Additional Information

  • A database backup has been provided to the engineering team for reproduction
  • The issue seems to scale with the potential allocation space (e.g., a /48 pool contains 2^79 possible /127 prefixes, versus 2^63 for a /64 pool)
  • May require an investigation into how the resource pool allocation algorithm handles very large address spaces
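As one direction such an investigation might consider, the sketch below finds the first free prefix by walking the existing allocations rather than the candidate space, so its cost depends on the number of allocated prefixes and not on the pool size. This is not Infrahub's allocation algorithm, which is not shown in this issue; `first_free_subnet` and its parameters are hypothetical names, and the code assumes all allocated prefixes lie inside the pool.

```python
import ipaddress
from typing import Iterable, Optional


def first_free_subnet(
    pool: ipaddress.IPv6Network,
    new_prefixlen: int,
    allocated: Iterable[ipaddress.IPv6Network],
) -> Optional[ipaddress.IPv6Network]:
    """Return the lowest free prefix of length new_prefixlen, or None.

    Hypothetical sketch: assumes every allocated prefix is inside pool.
    """
    if not pool.prefixlen <= new_prefixlen <= pool.max_prefixlen:
        raise ValueError("requested prefix length does not fit in the pool")

    # Size, in addresses, of one prefix of the requested length.
    step = 1 << (pool.max_prefixlen - new_prefixlen)
    candidate = int(pool.network_address)
    pool_end = int(pool.broadcast_address)

    for used in sorted(allocated, key=lambda n: int(n.network_address)):
        used_start = int(used.network_address)
        used_end = int(used.broadcast_address)
        if candidate + step - 1 < used_start:
            # The candidate ends before the next allocation starts: it is free.
            break
        if used_end >= candidate:
            # Jump to the first aligned block after this allocation.
            candidate = (used_end // step + 1) * step

    if candidate + step - 1 > pool_end:
        return None  # pool exhausted
    return ipaddress.IPv6Network((candidate, new_prefixlen))


pool = ipaddress.IPv6Network("2001:db8::/48")
taken = [
    ipaddress.IPv6Network("2001:db8::/127"),
    ipaddress.IPv6Network("2001:db8::2/127"),
]
print(first_free_subnet(pool, 127, taken))
```

A real implementation would also need to handle concurrent allocations and pools backed by several resource prefixes, which this sketch ignores.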

Metadata

Assignees

No one assigned

Labels

  • group/backend: Issue related to the backend (API Server, Git Agent)
  • priority/2: This issue stalls work on the project or its dependents, it's a blocker for a release
  • type/bug: Something isn't working as expected
