SOLR-16458: Migrate NodeHealthAPI from homegrown @EndPoint to JAX-RS #4171

Open
epugh wants to merge 21 commits into apache:main from
epugh:copilot/migrate-node-health-api

Conversation

@epugh
Contributor

@epugh epugh commented Feb 28, 2026

Migrates NodeHealthAPI — the last node-level V2 API still using Solr's homegrown @EndPoint annotation — to standard JAX-RS, following the same pattern as NodeLogging, GetPublicKey, etc.

Design

The logic stays in HealthCheckHandler (minimising diff surface). NodeHealthAPI is a thin JAX-RS wrapper (~60 lines) that delegates entirely to it.

Key changes

  • solr/api — New NodeHealthApi interface (@Path, @GET, @Operation) and NodeHealthResponse model (status, message, num_cores_unhealthy)
  • NodeHealthAPI — Replaces @EndPoint with JAX-RS; injects CoreContainer, delegates to HealthCheckHandler
  • HealthCheckHandler — Logic unchanged; adds public NodeHealthResponse checkNodeHealth(Boolean, Integer) as the shared entry point for both v1 (handleRequestBody) and v2 (NodeHealthAPI); switches to getJerseyResources() / empty getApis()
  • V2NodeAPIMappingTest — Removes the now-obsolete @EndPoint/ApiBag routing test for health
  • NodeHealthAPITest — New Mockito unit tests for the API class
  • NodeHealthAPITest2 — New mock-free integration tests: cloud-mode via real MiniSolrCloudCluster, legacy mode via embedded CoreContainer built from NodeConfig
  • implicit-requesthandlers.adoc — Health section now links to both HealthCheckHandler (v1) and NodeHealthAPI (v2) javadocs
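
For readers unfamiliar with the pattern, the delegation described above can be sketched in plain Java. This is a simplified illustration, not the PR's code: every class name ending in `Sketch` is hypothetical, the JAX-RS annotations are omitted so the snippet compiles without dependencies, the wrapper takes the handler directly rather than an injected CoreContainer, and the `"OK"` / `"FAILURE"` values and the unhealthy-core counter are assumptions standing in for the real health checks.

```java
// Stand-in for the NodeHealthResponse model; field names follow the PR description.
final class NodeHealthResponseSketch {
  String status;
  String message;
  Integer numCoresUnhealthy; // listed as num_cores_unhealthy in the PR description
}

// Stand-in for HealthCheckHandler, which keeps the logic in this design.
final class HealthCheckHandlerSketch {
  // Hypothetical input that replaces inspecting real cores.
  private final int unhealthyCores;

  HealthCheckHandlerSketch(int unhealthyCores) {
    this.unhealthyCores = unhealthyCores;
  }

  // Shared entry point; maxGenerationLag is accepted but unused in this sketch.
  NodeHealthResponseSketch checkNodeHealth(Boolean requireHealthyCores, Integer maxGenerationLag) {
    NodeHealthResponseSketch rsp = new NodeHealthResponseSketch();
    if (Boolean.TRUE.equals(requireHealthyCores) && unhealthyCores > 0) {
      rsp.status = "FAILURE"; // assumed status value
      rsp.message = unhealthyCores + " unhealthy core(s) on this node";
      rsp.numCoresUnhealthy = unhealthyCores;
    } else {
      rsp.status = "OK"; // assumed status value
    }
    return rsp;
  }
}

// Stand-in for the thin v2 wrapper: no logic of its own, only delegation.
// In the PR this class would carry the JAX-RS annotations via its interface.
final class NodeHealthApiSketch {
  private final HealthCheckHandlerSketch handler;

  NodeHealthApiSketch(HealthCheckHandlerSketch handler) {
    this.handler = handler;
  }

  NodeHealthResponseSketch healthcheck(Boolean requireHealthyCores, Integer maxGenerationLag) {
    return handler.checkNodeHealth(requireHealthyCores, maxGenerationLag);
  }
}
```

Because the wrapper only forwards its arguments, the v1 and v2 paths cannot drift apart as long as both go through the shared entry point.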

Copilot AI and others added 4 commits February 22, 2026 12:52
Co-authored-by: epugh <22395+epugh@users.noreply.github.com>
Co-authored-by: epugh <22395+epugh@users.noreply.github.com>
…ef guide

Co-authored-by: epugh <22395+epugh@users.noreply.github.com>
@github-actions github-actions bot added documentation Improvements or additions to documentation tests cat:api labels Feb 28, 2026
@epugh epugh marked this pull request as ready for review February 28, 2026 13:20
@epugh epugh requested review from gerlowskija and janhoy February 28, 2026 13:20
@epugh
Contributor Author

epugh commented Feb 28, 2026

Some outstanding questions: 1) There are two versions of the same test, one that relies more on mocks and one that relies less. Which do we prefer? 2) Does this seem like a reasonable pattern for the conversion?

@epugh
Contributor Author

epugh commented Feb 28, 2026

I wish I didn't have TWO ways of writing tests, one for cloud and one for standalone... sigh.

@epugh
Contributor Author

epugh commented Mar 1, 2026

tests all pass!

@epugh epugh changed the title Migrate NodeHealthAPI from homegrown @EndPoint to JAX-RS; add ref guide link Migrate NodeHealthAPI from homegrown @EndPoint to JAX-RS Mar 2, 2026
@epugh epugh removed the no-changelog label Mar 3, 2026
@gerlowskija
Contributor

@epugh - this should probably be attached to one of the v2 JIRA tickets or another. Maybe https://issues.apache.org/jira/browse/SOLR-16458?

@epugh
Contributor Author

epugh commented Mar 3, 2026

@epugh - this should probably be attached to one of the v2 JIRA tickets or another. Maybe https://issues.apache.org/jira/browse/SOLR-16458?

Sure... I suppose I could be crosslinking all of these to various JIRAs?

@epugh epugh changed the title Migrate NodeHealthAPI from homegrown @EndPoint to JAX-RS SOLR-16458: Migrate NodeHealthAPI from homegrown @EndPoint to JAX-RS Mar 3, 2026
* mocks.
*
* <p>Cloud-mode tests use a real {@link org.apache.solr.cloud.MiniSolrCloudCluster} and get a
* {@link CoreContainer} directly from a {@link JettySolrRunner}. Legacy (standalone) mode tests
Contributor

[Q] Do we have other tests that do standalone testing within a SolrCloudTestCase?

It feels weird conceptually. And in practical terms SolrCloudTestCase does some work that makes it much slower on a per-test basis than our other base classes. Doing standalone testing in a SolrCloudTestCase is going to end up paying that runtime cost for no reason.

Contributor Author

Dunno... Do you think we should have TWO tests? The setup is only done once, in beforeClass, not per test.

public void testCloudMode_RequireHealthyCoresReturnOkWhenAllCoresHealthy() {
CoreContainer coreContainer = cluster.getJettySolrRunner(0).getCoreContainer();

// requireHealthyCores=true should succeed on a node with no unhealthy cores
Contributor

[-1] You haven't actually created any cores!!

Can you create a collection or something that'll cause this test to actually exercise the per-core logic currently in HealthCheckHandler?

Contributor Author

I added something, and one thing is we may want to collapse our tests from HealthCheckHandlerTest at some point into this test case.

v2: `api/node/health` |{solr-javadocs}/core/org/apache/solr/handler/admin/HealthCheckHandler.html[HealthCheckHandler] |
v2: `api/node/health` |v1: {solr-javadocs}/core/org/apache/solr/handler/admin/HealthCheckHandler.html[HealthCheckHandler]

v2: {solr-javadocs}/core/org/apache/solr/handler/admin/api/NodeHealthAPI.html[NodeHealthAPI] |
Contributor

[Q] Is it the implementation Javadocs we want to point people to here, or would the solr/api interface docs be more helpful?

Or more broadly - is there much value even in pointing to either Javadoc on the v2 side? HealthCheckHandler has a nice blurb, but neither NodeHealthAPI nor NodeHealthApi has much of anything that's worth pointing a user at IMO...

Contributor Author

Good question... I think we need to decide what the pattern should be for these new APIs. Where DO we want these APIs to link to and be documented? I don't have a strong opinion, and honestly, if we punted this to a future JIRA I could live with that too.

@dsmiley
Contributor

dsmiley commented Mar 3, 2026

Just want to give kudos to Jason's thoroughly amazing code review. Really shows how important it is that we have the right reviewers on the right issues. Hopefully after this issue is done, similar issues can follow the same lessons/advice. Assuming an LLM is doing the bulk of the work, it can be pointed specifically at this PR to learn the key aspects of doing similar transitions successfully.

@epugh
Contributor Author

epugh commented Mar 7, 2026

@gerlowskija all really good comments, and they also highlight how little I know about the capabilities available in JAX-RS and our implementation!!! Will update on Monday.

@epugh
Contributor Author

epugh commented Mar 9, 2026

Okay, at this point down to a decision on Enum impact, migrating all the business logic out of HealthCheckHandler, and Javadocs and then this may be good to go!

@epugh
Contributor Author

epugh commented Mar 9, 2026

Okay, at this point down to a decision on Enum impact, migrating all the business logic out of HealthCheckHandler, and Javadocs and then this may be good to go!

Chatted with @gerlowskija on phone, and we're going to take a stab at migrating business logic. Javadocs we won't do anything more creative than what we have for now. Forgot to ask about Enum.

Copilot AI and others added 2 commits March 9, 2026 19:24
…kHandler delegates to V2

- NodeHealthAPI now owns all business logic (cloud mode, legacy mode,
  isWithinGenerationLag, findUnhealthyCores) using strongly-typed
  NodeHealthResponse / NodeStatus throughout.
- HealthCheckHandler becomes a thin V1 bridge: handleRequestBody()
  creates NodeHealthAPI(coreContainer).checkNodeHealth(...) and
  squashes the typed response into SolrQueryResponse.
- findUnhealthyCores() moved to NodeHealthAPI as a public static util;
  HealthCheckHandler keeps a @deprecated delegation shim so existing
  callers continue to compile.
- HealthCheckHandlerTest updated to call NodeHealthAPI.findUnhealthyCores()
  directly.
- Utils.getReflectWriter() now serialises Enum values as their .name()
  string so that NodeStatus.OK round-trips as "OK" through
  NamedList/javabin, keeping HealthCheckHandlerTest assertions passing.
- Fixed pre-existing bug in isWithinGenerationLag: condition was
  `generationDiff < maxGenerationLag` (wrong); corrected to
  `generationDiff > maxGenerationLag` with the return values adjusted
  so the method returns true=healthy / false=lagging-too-far.
- Fixed missing slf4j log arguments in the negative-diff warning.

Co-authored-by: epugh <22395+epugh@users.noreply.github.com>
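
The Enum change in the commit message above can be illustrated with a small standalone sketch. It does not reproduce `Utils.getReflectWriter()`; `EnumNameSketch`, `toWireValue`, and `toWireMap` are hypothetical names, and the `FAILURE` constant is an assumption (only `NodeStatus.OK` is named in the commit message). The idea shown is just that an enum is written as its `.name()` string, so a reader on the other side can compare against or parse back the plain string.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class EnumNameSketch {
  // OK comes from the commit message; FAILURE is assumed for illustration.
  enum NodeStatus { OK, FAILURE }

  // Convert a single value to its wire form: enums become their name() string,
  // everything else passes through unchanged.
  static Object toWireValue(Object value) {
    if (value instanceof Enum) {
      return ((Enum<?>) value).name();
    }
    return value;
  }

  // Apply the same rule to every field of a response-like map, keeping field order.
  static Map<String, Object> toWireMap(Map<String, Object> fields) {
    Map<String, Object> out = new LinkedHashMap<>();
    for (Map.Entry<String, Object> e : fields.entrySet()) {
      out.put(e.getKey(), toWireValue(e.getValue()));
    }
    return out;
  }

  public static void main(String[] args) {
    Map<String, Object> fields = new LinkedHashMap<>();
    fields.put("status", NodeStatus.OK);
    fields.put("num_cores_unhealthy", 0);
    System.out.println(toWireMap(fields));
  }
}
```

With this rule in place, a test that asserts on the string value of `status` keeps working even though the typed response holds an enum.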
@epugh
Contributor Author

epugh commented Mar 10, 2026

Okay, at this point down to a decision on Enum impact, migrating all the business logic out of HealthCheckHandler, and Javadocs and then this may be good to go!

Chatted with @gerlowskija on phone, and we're going to take a stab at migrating business logic. Javadocs we won't do anything more creative than what we have for now. Forgot to ask about Enum.

Got a fix for handling the Enum comparison. I migrated the business logic, and removed a duplicate test in the process...

@gerlowskija I think this is ready for final review!

@epugh
Contributor Author

epugh commented Mar 10, 2026

Oh, and the reason, based on spelunking, why we don't support maxGenerationLag in v2 is that it's a feature of the old leader/follower capability only. See TestHealthCheckHandlerLegacyMode.
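
As a rough illustration of the generation-lag check discussed in this thread, the sketch below follows the corrected semantics stated in the earlier commit message: a difference greater than `maxGenerationLag` means the follower lags too far, and the method returns true for healthy. The class name, parameter names, and types are hypothetical, and treating a negative difference as healthy is an assumption made here (the commit message only says that case logs a warning).

```java
final class GenerationLagSketch {
  // Returns true when the follower is healthy, false when it lags too far.
  static boolean isWithinGenerationLag(
      long leaderGeneration, long followerGeneration, long maxGenerationLag) {
    long generationDiff = leaderGeneration - followerGeneration;
    if (generationDiff < 0) {
      // Assumption: a follower that appears ahead of the leader is not reported
      // as unhealthy; real code would log a warning here.
      return true;
    }
    // Corrected condition: only a difference above the limit counts as lagging.
    return !(generationDiff > maxGenerationLag);
  }

  public static void main(String[] args) {
    System.out.println(isWithinGenerationLag(10, 8, 2)); // difference equals the limit
    System.out.println(isWithinGenerationLag(10, 7, 2)); // difference exceeds the limit
  }
}
```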

epugh added 2 commits March 11, 2026 07:07
I used Claude for this regression test and I don't love how verbose it is. I tried a mock approach first and it was worse.
Labels

cat:api client:solrj documentation Improvements or additions to documentation tests

4 participants