Skip to content

Conversation

@mhl-b
Copy link
Contributor

@mhl-b mhl-b commented Feb 12, 2025

Fixes #121869

testAcceptsMismatchedServerlessBuildHash uses wrong system property name "es.serverless" but should be "es.serverless_transport".

@mhl-b mhl-b added >test Issues or PRs that are addressing/adding tests Team:Distributed Coordination Meta label for Distributed Coordination team v9.1.0 labels Feb 12, 2025
@mhl-b mhl-b requested a review from thecoop February 12, 2025 04:25
@elasticsearchmachine elasticsearchmachine added needs:triage Requires assignment of a team area label and removed Team:Distributed Coordination Meta label for Distributed Coordination team labels Feb 12, 2025
@mhl-b mhl-b added :Distributed Indexing/Distributed A catch all label for anything in the Distributed Indexing Area. Please avoid if you can. Team:Distributed Coordination Meta label for Distributed Coordination team and removed needs:triage Requires assignment of a team area label labels Feb 12, 2025
@elasticsearchmachine elasticsearchmachine added Team:Distributed Indexing Meta label for Distributed Indexing team and removed Team:Distributed Coordination Meta label for Distributed Coordination team labels Feb 12, 2025
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-distributed-indexing (Team:Distributed Indexing)

@mhl-b mhl-b added Team:Distributed Coordination Meta label for Distributed Coordination team :Distributed Coordination/Network Http and internode communication implementations and removed Team:Distributed Indexing Meta label for Distributed Indexing team :Distributed Indexing/Distributed A catch all label for anything in the Distributed Indexing Area. Please avoid if you can. labels Feb 12, 2025
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-distributed-coordination (Team:Distributed Coordination)

@mhl-b
Copy link
Contributor Author

mhl-b commented Feb 12, 2025

Hey @thecoop, I'm wondering why system property is different, am I missing something. It was like this for 1.5 years.

@mhl-b mhl-b changed the title test-fix #121869 Test-fix testAcceptsMismatchedServerlessBuildHash #121869 Feb 12, 2025
@mhl-b mhl-b requested a review from nicktindall February 12, 2025 06:30
@thecoop
Copy link
Member

thecoop commented Feb 12, 2025

This might be something @DaveCTurner has been modifying...

@mhl-b
Copy link
Contributor Author

mhl-b commented Feb 12, 2025

I dont see "es.serverless" property in the code base, probably a mistake. This should be a valid fix.

Copy link
Contributor

@DiannaHohensee DiannaHohensee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes sense to me, but I had a concern about the reliability of the test before the change (see comment).

assumeTrue("Current build needs to be a snapshot", Build.current().isSnapshot());
assumeTrue("Security manager needs to be disabled", System.getSecurityManager() == null);
System.setProperty("es.serverless", Boolean.TRUE.toString()); // security manager blocks this
System.setProperty(TransportService.SERVERLESS_TRANSPORT_SYSTEM_PROPERTY, Boolean.TRUE.toString()); // security manager blocks
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the test reliably fail before this change? You mentioned reproducing it, but I wasn't sure of the frequency.

I'm wondering why this didn't fail in the first place when it was added. Maybe we could bundle the try-block logic into a method, and exercise it once with System.setProperty(TransportService.SERVERLESS_TRANSPORT_SYSTEM_PROPERTY, Boolean.TRUE.toString()); set to true (succeeds) and then a second time set to false (should throw)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the test reliably fail before this change?

yes, test fails every time with

./gradlew ":server:test" --tests "org.elasticsearch.transport.TransportServiceHandshakeTests.testAcceptsMismatchedServerlessBuildHash" -Dtests.seed=8278A46E39005C96 -Dtests.jvm.argline="-Des.entitlements.enabled=" -Dtests.locale=en-001 -Dtests.timezone=US/Pacific -Druntime.java=24

I believe test suppose to test this flag by going through debugger.

private void maybeThrowOnIncompatibleBuild(@Nullable DiscoveryNode node, @Nullable Exception e) {
if (SERVERLESS_TRANSPORT_FEATURE_FLAG == false && isIncompatibleBuild(version, buildHash)) {
throwOnIncompatibleBuild(node, e);
}
}

But it set with "es.serverless_transport".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we could bundle the try-block

Do you mean put System.setProperty inside try-catch? I dont think System.setProperty throws here. Also if it throws entire test case will not run.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, then it reliably fixes the test 👍 Thanks.

Optionally I'd suggest making the test more robust by also proving the test setup would normally throw, and then verifying that it does not throw with the serverless Property set -- e.g. put the try-block code in a method and call it twice, with and without the Property setting. But that's a test improvement, not a test fix, so I don't feel strongly about it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was curious why it started to fail now since all related changes have been around more than 1 year. It turns out the reason is that the test was never executed because of the following line

assumeTrue("Security manager needs to be disabled", System.getSecurityManager() == null);

SecurityManager is removed in JDK 24. Hence it started to fail now with JDK24 while it is skipped when running with all prior JDK versions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the test was "muted" for all this time? :)

Copy link
Member

@ywangd ywangd Feb 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I think it definitely has Not ran since #95754

Copy link
Contributor

@DiannaHohensee DiannaHohensee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

assumeTrue("Current build needs to be a snapshot", Build.current().isSnapshot());
assumeTrue("Security manager needs to be disabled", System.getSecurityManager() == null);
System.setProperty("es.serverless", Boolean.TRUE.toString()); // security manager blocks this
System.setProperty(TransportService.SERVERLESS_TRANSPORT_SYSTEM_PROPERTY, Boolean.TRUE.toString()); // security manager blocks
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, then it reliably fixes the test 👍 Thanks.

Optionally I'd suggest making the test more robust by also proving the test setup would normally throw, and then verifying that it does not throw with the serverless Property set -- e.g. put the try-block code in a method and call it twice, with and without the Property setting. But that's a test improvement, not a test fix, so I don't feel strongly about it.

@mhl-b mhl-b merged commit 49ecbca into elastic:main Feb 13, 2025
17 checks passed
@mosche
Copy link
Contributor

mosche commented Feb 26, 2025

💚 All backports created successfully

Status Branch Result
8.x
9.0
8.18

Questions ?

Please refer to the Backport tool documentation

elasticsearchmachine pushed a commit that referenced this pull request Feb 26, 2025
…123477)

(cherry picked from commit 49ecbca)

Co-authored-by: Mikhail Berezovskiy <[email protected]>
elasticsearchmachine pushed a commit that referenced this pull request Feb 26, 2025
…123478)

(cherry picked from commit 49ecbca)

Co-authored-by: Mikhail Berezovskiy <[email protected]>
elasticsearchmachine pushed a commit that referenced this pull request Feb 26, 2025
…123476)

(cherry picked from commit 49ecbca)

Co-authored-by: Mikhail Berezovskiy <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

:Distributed Coordination/Network Http and internode communication implementations Team:Distributed Coordination Meta label for Distributed Coordination team >test Issues or PRs that are addressing/adding tests v8.18.1 v8.19.0 v9.0.1 v9.1.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[CI] TransportServiceHandshakeTests testAcceptsMismatchedServerlessBuildHash failing

6 participants