Cached endpoints for dumping data #598

@pederhan

It has become a requirement to be able to dump all data about hosts. Iterating over all hosts, page by page, with a max page size of 1000 takes far too long with ~50000 hosts.

There are multiple reasons for this. Among them are:

  • No re-use of database connections between requests
  • Maximum page size of 1000, so ~50000 hosts require at least 50 requests, each of which pays the connection overhead above
  • Somewhat complex lookups required to construct a Host object (IPs, RRs, Policies, host groups, communities)
  • Django
  • Python runtime

Possible solutions

Cached /hosts endpoint with a larger maximum page size

Using the PaginateByMaxMixin mixin class from the drf-extensions package, we can let requestors specify that they want all hosts in a single response.
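To make the intended behaviour concrete, here is a minimal, stdlib-only sketch of how a `page_size=max` query value could be resolved. It is not the drf-extensions implementation; `resolve_page_size`, `DEFAULT_PAGE_SIZE` and `MAX_PAGE_SIZE` are hypothetical names, and the default of 100 is an assumption.

```python
from typing import Optional

DEFAULT_PAGE_SIZE = 100  # hypothetical default, not taken from MREG
MAX_PAGE_SIZE = 1000     # the current cap mentioned in this issue


def resolve_page_size(
    raw: Optional[str],
    max_page_size: int = MAX_PAGE_SIZE,
    default: int = DEFAULT_PAGE_SIZE,
) -> int:
    """Turn the raw ?page_size= query value into an effective page size."""
    if raw is None:
        return default
    if raw == "max":
        # The literal "max" asks for the largest page the server allows.
        return max_page_size
    try:
        requested = int(raw)
    except ValueError:
        return default
    if requested <= 0:
        return default
    # Numeric requests are still clamped to the configured ceiling.
    return min(requested, max_page_size)
```

Note that under this logic `page_size=max` only returns every host in one response if the configured ceiling is at least as large as the number of hosts (~50000 here), so the cap itself would also have to be raised.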

We could then use the built-in Django caching mechanism (optionally augmented by diskcache's Django support) to cache the response to /hosts?page_size=max, making that the only request that returns a cached response from the endpoint.

Separate endpoint(s) for dumping data

MREG deals with more than just hosts. It could be a good idea to have a common endpoint for data dumping (potentially limited to a subset of users) that provides cached dumps of hosts, networks, RRs, etc.

Labels: enhancement, help wanted