@joegallo joegallo commented Mar 24, 2025

This PR adds another degree of freedom in the MaxmindIpDataLookups. Rather than requiring that the response from the Reader is necessarily the same object as we keep in the GeoIpCache, it adds a new record(...) method (read as a noun, like "the record for storing in the cache") that allows a given MaxMind IpDataLookup to build a different object out of the response and then cache that object.

In this PR, only the CountryResponse is refactored to use this new capability -- it's the smallest of the response objects that would benefit from a change, so it's the easiest one to review at the same time as the rest of the machinery changes.
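
As a rough illustration of the shape described above, here is a minimal sketch. Every type in it (`FakeResponse`, `CountryEntry`, `SketchLookup`, `SketchCountryLookup`) is a hypothetical stand-in invented for this example, and the cache is a plain map rather than the real GeoIpCache; it is not the code in this PR. The point is only that the lookup caches whatever `record(...)` builds from the response, not the response itself.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a database response (not a MaxMind class).
record FakeResponse(String isoCode, Map<String, String> names) {}

// Hypothetical compact value that is stored in the cache instead of the response.
record CountryEntry(String isoCode, String name) {}

// Simplified lookup: reads a response, converts it with record(...), caches the result.
abstract class SketchLookup<RESPONSE, RECORD> {
    private final Map<String, RECORD> cache = new ConcurrentHashMap<>();
    int databaseReads = 0; // counts cache misses, for demonstration only

    protected abstract RESPONSE read(String ip);

    // "record" as a noun: the object to store in the cache for this response.
    protected abstract RECORD record(RESPONSE response);

    protected abstract Map<String, Object> transform(RECORD cached);

    final Map<String, Object> lookup(String ip) {
        RECORD cached = cache.computeIfAbsent(ip, key -> {
            databaseReads++;
            return record(read(key));
        });
        return transform(cached);
    }
}

final class SketchCountryLookup extends SketchLookup<FakeResponse, CountryEntry> {
    @Override
    protected FakeResponse read(String ip) {
        // A real lookup would query the database reader here; this returns fixed data.
        return new FakeResponse("US", Map.of("en", "United States", "de", "Vereinigte Staaten"));
    }

    @Override
    protected CountryEntry record(FakeResponse response) {
        // Keep only what is needed later; the full response is not cached.
        return new CountryEntry(response.isoCode(), response.names().get("en"));
    }

    @Override
    protected Map<String, Object> transform(CountryEntry entry) {
        return Map.of("country_iso_code", entry.isoCode(), "country_name", entry.name());
    }
}
```

A lookup that has nothing to gain from a smaller object could simply return the response unchanged from `record(...)`.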

We'll benefit from this new capability in two ways:

  • first, it will make cache hits faster -- we can update an IngestDocument from a record faster than we can update from a proper Response (as a simple example, the common getName() method of various responses iterates over a collection -- now we'll do that just once when we populate the cache rather than every time we have a cache hit)
  • second, it decreases the size of the objects in the cache by about a factor of ten, so in some subsequent PR, we could reasonably expect to cache way more of these objects (and have more cache hits as a result)

Or, to put it simply, things are faster and use less memory this way.
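
To make the first point concrete, here is a small sketch, assuming a response whose `getName()` walks a list of locales on every call. `LocalizedResponse` and `NameEntry` are hypothetical names made up for this example and only loosely modeled on the behavior described above; the real response classes are more involved.

```java
import java.util.List;
import java.util.Map;

// Hypothetical response: getName() scans the locale list each time it is called.
final class LocalizedResponse {
    private final List<String> locales;
    private final Map<String, String> names;
    int nameScans = 0; // counts how often the scan runs, for demonstration only

    LocalizedResponse(List<String> locales, Map<String, String> names) {
        this.locales = locales;
        this.names = names;
    }

    String getName() {
        nameScans++;
        for (String locale : locales) {
            String name = names.get(locale);
            if (name != null) {
                return name;
            }
        }
        return null;
    }
}

// Hypothetical cache entry: the name is resolved once, when the entry is built.
record NameEntry(String name) {
    static NameEntry from(LocalizedResponse response) {
        return new NameEntry(response.getName());
    }
}
```

With this shape, each cache hit reads a plain field from `NameEntry`, and the locale scan runs only when the cache is populated. The entry also holds a single string rather than the locale list and the name map, which is the kind of size reduction the second point is about.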

CityResponse and EnterpriseResponse use the null-object pattern here,
so postal is never null.

Allow a given lookup to create a cache-able record from the response
from the database. At the moment every lookup just uses its response
from the database as the cache-able record (this is fine).

so that it can be reused. Of course that means that we have to tweak
all the existing callers to use method access rather than field
access, hence the bigger diff than one might otherwise imagine.

@joegallo joegallo added the :Data Management/Ingest Node, >refactoring, Team:Data Management, auto-backport, v8.19.0, and v9.1.0 labels Mar 24, 2025
@joegallo joegallo requested a review from masseyke March 24, 2025 18:11
@elasticsearchmachine
Collaborator

Pinging @elastic/es-data-management (Team:Data Management)

}

@Override
protected AnonymousIpResponse record(AnonymousIpResponse response) {
Member

I kind of wonder if this method name is going to cause confusion for future maintainers since, as you've pointed out, the first thought when you see it is the verb -- what am I recording? What about cacheableRecord?

Contributor Author

Indeed, my unwavering love of short names is definitely at odds with the desire for a clear name in this case. I renamed it to cacheableRecord for now via 9dd8964, but I think I'm going to polish things a little more (on subsequent PRs).

Member

@masseyke masseyke left a comment

LGTM

There are too many nearby things named record or result or response. So
here's a new word, 'entry'. A database has a bunch of entries in it.
@joegallo joegallo merged commit 8857ebf into elastic:main Mar 26, 2025
17 checks passed
@elasticsearchmachine
Collaborator

💚 Backport successful

Status Branch Result
8.x

@joegallo joegallo deleted the refactor-ingest-geoip-maxmind-cache branch March 26, 2025 17:00
joegallo added a commit to joegallo/elasticsearch that referenced this pull request Mar 28, 2025
omricohenn pushed a commit to omricohenn/elasticsearch that referenced this pull request Mar 28, 2025