Register IngestGeoIpMetadata as a NamedXContent #123079

joegallo · 2025-02-20T21:52:44Z

If you upgrade Elasticsearch via a full cluster restart, then the DatabaseConfigurationMetadata entries from the cluster state get cleared out. This is easiest to reproduce by putting a single node cluster through an upgrade OR a restart. In either of those cases, we restore the cluster state by reading the x-content from disk (rather than serializing it over the network), so we're exercising an entirely different path than in the case of the ordinary operations of a cluster, or the case of rolling (node-by-node) upgrades.
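To make the two paths concrete, here is a minimal toy model rather than Elasticsearch code: the class, the registry maps, and the `db-config` payload are all hypothetical names invented for illustration. It assumes only what is described above, namely that restoring from disk uses a different registration than serializing over the network, and that a custom with no registration on the path in use ends up missing.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Toy model only: these names are hypothetical and are not Elasticsearch APIs.
class RestorePathsSketch {
    // Two independent registries, each keyed by the custom's type name.
    static final Map<String, Function<String, String>> WIRE_READERS = new HashMap<>();
    static final Map<String, Function<String, String>> DISK_PARSERS = new HashMap<>();

    // Rebuild the customs of a cluster state using the given registry.
    // Types without a registration are skipped instead of failing the restore.
    static Map<String, String> restore(Map<String, String> stored,
                                       Map<String, Function<String, String>> registry) {
        Map<String, String> restored = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : stored.entrySet()) {
            Function<String, String> reader = registry.get(entry.getKey());
            if (reader == null) {
                continue; // no registration on this path: the entry is dropped
            }
            restored.put(entry.getKey(), reader.apply(entry.getValue()));
        }
        return restored;
    }

    public static void main(String[] args) {
        // Registered for the wire path only, which models the situation before the fix.
        WIRE_READERS.put("ingest_geoip", Function.identity());
        Map<String, String> stored = Map.of("ingest_geoip", "db-config"); // made-up payload
        System.out.println("over the wire: " + restore(stored, WIRE_READERS).keySet());
        System.out.println("from disk:     " + restore(stored, DISK_PARSERS).keySet());
    }
}
```

In this model the fix corresponds to adding the missing entry to `DISK_PARSERS`, after which both paths restore the same customs.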

I renamed the existing full-cluster-restart test to geoip-reindexed to better capture the purpose of the test, and I added a new full-cluster-restart test that would have caught the wrong behavior here (note: it really doesn't do anything more than that, but I'd argue that's enough for right now).

As a matter of inside baseball, the previous paragraph means that you'd almost certainly be better off reviewing this PR commit by commit rather than reading the overall diff of the PR. But do as you'd like.

This makes this class more like SnapshotLifecycleMetadata and IndexLifecycleMetadata, but I don't think this is load bearing code. The previous String literal version was more like EnrichMetadata, so there's prior art for that, but it feels more like an odd duck to me.

elasticsearchmachine · 2025-02-20T21:53:20Z

Hi @joegallo, I've created a changelog YAML for you.

DaveCTurner · 2025-02-21T09:12:36Z

In terms of tests, I would expect ./gradlew :modules:ingest-geoip:qa:full-cluster-restart:check to pick this up. It seems that it doesn't, which indicates maybe this test needs to be expanded.

Also, when I ran that task I saw that it only attempts an upgrade from v8.19.0. If it's trying to test BwC then it should be picking up all compatible versions, right? And it should definitely be doing a full-cluster-restart without an upgrade too.

DaveCTurner · 2025-02-21T09:21:56Z

By "pick this up" I mean I would expect some assertion somewhere to trip, if we wrote out some NamedXContent to the cluster state and then couldn't read it back in again. But then I see that we just quietly ignore that kind of issue even in tests:

elasticsearch/server/src/main/java/org/elasticsearch/cluster/metadata/Metadata.java

Lines 2684 to 2686 in 2ee7ca4

    
           } catch (NamedObjectNotFoundException ex) {
               logger.warn("Skipping unknown custom object with type {}", currentFieldName);
               parser.skipChildren();

DaveCTurner · 2025-02-21T09:31:40Z

FTR we don't hit that lenient code in ./gradlew :modules:ingest-geoip:qa:full-cluster-restart:check anyway, in the sense that this task still passes even with the following:

diff --git a/server/src/main/java/org/elasticsearch/cluster/metadata/Metadata.java b/server/src/main/java/org/elasticsearch/cluster/metadata/Metadata.java
index 35e853cdd55a..bc7e08fd76eb 100644
--- a/server/src/main/java/org/elasticsearch/cluster/metadata/Metadata.java
+++ b/server/src/main/java/org/elasticsearch/cluster/metadata/Metadata.java
@@ -2682,6 +2682,7 @@ public class Metadata implements Iterable<IndexMetadata>, Diffable<Metadata>, Ch
                             Custom custom = parser.namedObject(Custom.class, currentFieldName, null);
                             builder.putCustom(custom.getWriteableName(), custom);
                         } catch (NamedObjectNotFoundException ex) {
+                            assert false : ex;
                             logger.warn("Skipping unknown custom object with type {}", currentFieldName);
                             parser.skipChildren();
                         }

So there's at least three test problems here:

1. We don't actually run full-cluster-restart tests to cover very many relevant scenarios.
2. Even in the ones we do cover, we don't try and insert the problematic object in the cluster state first.
3. Even if we did insert such a problematic object, the tests wouldn't automatically fail when encountering a missing NamedXContent registration.
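The third problem can be sketched in isolation. The following is a hedged illustration, not the `Metadata` parser itself: the class, method, and the `some_other_custom` type name are hypothetical. It also swaps the `assert false : ex` from the diff above for an explicit `strict` flag, because Java assertions are disabled unless the JVM runs with `-ea`, and an explicit flag keeps the example's behavior independent of that setting.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; none of these names exist in Elasticsearch.
class StrictCustomParsingSketch {
    // Parse the custom types found on disk. In lenient mode an unregistered type
    // is logged and skipped; in strict mode (intended for tests) it fails loudly.
    static List<String> parseCustoms(List<String> typesOnDisk, Set<String> registered, boolean strict) {
        List<String> parsed = new ArrayList<>();
        for (String type : typesOnDisk) {
            if (registered.contains(type)) {
                parsed.add(type);
            } else if (strict) {
                throw new IllegalStateException("no parser registered for custom [" + type + "]");
            } else {
                System.err.println("Skipping unknown custom object with type " + type);
            }
        }
        return parsed;
    }

    public static void main(String[] args) {
        List<String> onDisk = List.of("some_other_custom", "ingest_geoip");
        Set<String> registered = Set.of("some_other_custom"); // "ingest_geoip" is missing
        System.out.println("lenient result: " + parseCustoms(onDisk, registered, false));
        try {
            parseCustoms(onDisk, registered, true);
        } catch (IllegalStateException e) {
            System.out.println("strict mode failed: " + e.getMessage());
        }
    }
}
```

The design point is only that a test run should surface a missing registration as a failure rather than a warning in the logs.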

joegallo · 2025-02-21T12:32:41Z

I happen to know the history here -- I wrote the FullClusterRestartIT three years ago in #85792 specifically to cover the case of upgrading and using the system indices migration feature, that's why it covers such a limited number of upgrade-from versions. Reading between the lines of what I know now from this conversation, that test is poorly named -- we don't have (and never have had) a true FullClusterRestartIT for the geoip code.

I'll rename the test to better reflect its true purpose, and write a new FullClusterRestartIT that would have failed without the fix on this PR.

I'll add the assert you mentioned, too. I'd already found that line of code and had a big loop of printlns there to figure out why it was claiming that there wasn't any way to handle "ingest_geoip" when I'd clearly registered all the relevant NamedWriteables (I'm in the future, too, so I understand that this thinking was not right, although I can't actually explain confidently why it isn't right).

However, I think we should go one more: I'm curious what it would look like to add an assert that makes sure that a cluster state that we write to disk is deserializable from disk (immediately after writing). This would have failed on the code prior to this PR, and it wouldn't have required a test that we don't have, and that, indeed, I didn't realize I had to write.

DaveCTurner · 2025-02-21T13:14:39Z

> However, I think we should go one more: I'm curious what it would look like to add an assert that makes sure that a cluster state that we write to disk is deserializable from disk (immediately after writing).

We already have this, see org.elasticsearch.gateway.PersistedClusterStateService#getAssertOnCommit. The trouble is that the state in question is deserializable (i.e. doesn't throw an exception) but not faithfully so.
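The distinction between "deserializable" and "faithfully deserializable" can be shown with a small sketch. This is an assumption-laden toy, not a description of how `PersistedClusterStateService` works: the class and method names are hypothetical, and the reader is modeled as leniently dropping unregistered types, as in the `Metadata` snippet quoted earlier in this thread.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical sketch; these names are not Elasticsearch APIs.
class FaithfulRoundTripSketch {
    // Lenient reader: customs whose type has no registered parser are dropped silently.
    static Map<String, String> readBack(Map<String, String> written, Set<String> registeredTypes) {
        Map<String, String> result = new TreeMap<>();
        written.forEach((type, body) -> {
            if (registeredTypes.contains(type)) {
                result.put(type, body);
            }
        });
        return result;
    }

    // Weak check: passes as long as reading back does not throw.
    static boolean readsWithoutError(Map<String, String> written, Set<String> registeredTypes) {
        try {
            readBack(written, registeredTypes);
            return true;
        } catch (RuntimeException e) {
            return false;
        }
    }

    // Strong check: what was read back must equal what was written.
    static boolean roundTripsFaithfully(Map<String, String> written, Set<String> registeredTypes) {
        return readBack(written, registeredTypes).equals(written);
    }

    public static void main(String[] args) {
        Map<String, String> written = Map.of("ingest_geoip", "db-config"); // made-up payload
        Set<String> registered = Set.of(); // the parser registration is missing
        System.out.println("reads without error: " + readsWithoutError(written, registered));
        System.out.println("round-trips faithfully: " + roundTripsFaithfully(written, registered));
    }
}
```

With the registration missing, the weak check passes while the strong one fails, which is the gap described in the comment above.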

elasticsearchmachine · 2025-02-24T19:06:28Z

Pinging @elastic/es-data-management (Team:Data Management)

elasticsearchmachine · 2025-02-24T22:27:09Z

💔 Backport failed

Status	Branch	Result
❌	8.18	Commit could not be cherrypicked due to conflicts
❌	8.x	Commit could not be cherrypicked due to conflicts
✅	9.0
❌	8.16	Commit could not be cherrypicked due to conflicts
❌	8.17	Commit could not be cherrypicked due to conflicts

You can use sqren/backport to manually backport by running backport --upstream elastic/elasticsearch --pr 123079

joegallo · 2025-02-25T00:55:10Z

Manual backports are up.

joegallo added 2 commits February 20, 2025 16:14

Register IngestGeoIpMetadata as a NamedXContent

a1d2a66

joegallo added the >bug, :Data Management/Ingest Node, Team:Data Management, v8.18.1, v8.19.0, v9.0.1, v9.1.0, v8.16.5, v8.17.3 labels Feb 20, 2025

joegallo requested review from DaveCTurner and masseyke February 20, 2025 21:52

joegallo changed the title ~~Register IngestGeoipMetadata as a NamedXContent~~ Register IngestGeoIpMetadata as a NamedXContent Feb 20, 2025

Update docs/changelog/123079.yaml

90e47a9

joegallo added the auto-backport label Feb 21, 2025

joegallo added 5 commits February 24, 2025 13:29

Merge branch 'main' into register-geoip-metadata-as-named-xcontent

110a1c4

Rename full-cluster-restart to geoip-reindexed

ecdbd34

Add a upgrade test for geoip databases

abde3f7

Leaving a breadcrumb for the future

b7e0d98

Merge branch 'main' into register-geoip-metadata-as-named-xcontent

0972411

joegallo marked this pull request as ready for review February 24, 2025 19:06

masseyke approved these changes Feb 24, 2025


joegallo merged commit 6315b8a into elastic:main Feb 24, 2025
17 checks passed

joegallo deleted the register-geoip-metadata-as-named-xcontent branch February 24, 2025 22:25

joegallo mentioned this pull request Feb 24, 2025

[9.0] Register IngestGeoIpMetadata as a NamedXContent (#123079) #123316

Merged

elasticsearchmachine added the backport pending label Feb 24, 2025

joegallo added a commit to joegallo/elasticsearch that referenced this pull request Feb 24, 2025

Register IngestGeoIpMetadata as a NamedXContent (elastic#123079)

8e4d43e

elasticsearchmachine pushed a commit that referenced this pull request Feb 24, 2025

Register IngestGeoIpMetadata as a NamedXContent (#123079) (#123316)

dff78ec

joegallo added a commit to joegallo/elasticsearch that referenced this pull request Feb 25, 2025

Register IngestGeoIpMetadata as a NamedXContent (elastic#123079)

e92c351

joegallo added a commit to joegallo/elasticsearch that referenced this pull request Feb 25, 2025

Register IngestGeoIpMetadata as a NamedXContent (elastic#123079)

cce4120

joegallo added a commit to joegallo/elasticsearch that referenced this pull request Feb 25, 2025

Register IngestGeoIpMetadata as a NamedXContent (elastic#123079)

b2af3b6

joegallo added a commit to joegallo/elasticsearch that referenced this pull request Feb 25, 2025

Register IngestGeoIpMetadata as a NamedXContent (elastic#123079)

aa18c67

joegallo removed the backport pending label Feb 25, 2025

elasticsearchmachine pushed a commit that referenced this pull request Feb 25, 2025

Register IngestGeoIpMetadata as a NamedXContent (#123079) (#123329)

b8f8723

elasticsearchmachine pushed a commit that referenced this pull request Feb 25, 2025

Register IngestGeoIpMetadata as a NamedXContent (#123079) (#123327)

44ddc55

elasticsearchmachine pushed a commit that referenced this pull request Feb 25, 2025

Register IngestGeoIpMetadata as a NamedXContent (#123079) (#123328)

04085e2

elasticsearchmachine pushed a commit that referenced this pull request Feb 25, 2025

Register IngestGeoIpMetadata as a NamedXContent (#123079) (#123326)

8962a49

4 participants
