@JeremyDahlgren
Contributor

Relaxes a log expectation from requiring exactly numShards running shard snapshots to accepting any count from 1 to numShards, since
some of the shard snapshot statuses may already be in stage=PAUSED.

Closes #127690

@JeremyDahlgren JeremyDahlgren added >test Issues or PRs that are addressing/adding tests :Distributed Coordination/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs Team:Distributed Coordination Meta label for Distributed Coordination team v9.1.0 labels May 9, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/es-distributed-coordination (Team:Distributed Coordination)

mockLog.awaitAllExpectationsMatched();
resetMockLog();

assert 1 <= numShards && numShards <= 10;
Contributor

Suggested change
assert 1 <= numShards && numShards <= 10;
// At least one shard reached the MockRepository's blocking code when waitForBlock was called. However, there's no guarantee that
// the other shards got that far before the shutdown flag was put in place, in which case the other shards may be paused instead.
assert 1 <= numShards && numShards <= 10;

Contributor

I'm wondering why we assert this, since `final int numShards = randomIntBetween(1, 10);` already guarantees that the value is between 1 and 10?

Contributor Author

I added the code comment. In case the thread bounds were ever changed, it seemed clearer to enforce the restriction there rather than wait for the seen-event timeout. I can see that it doesn't add value, so the assertion is now removed.

SnapshotShutdownProgressTracker.class.getCanonicalName(),
Level.INFO,
"*Number shard snapshots running [" + numShards + "].*"
".+Number shard snapshots running \\[" + (numShards < 10 ? "[1-" + numShards + "]" : "([1-9]|10)") + "].+"
Contributor

@DiannaHohensee DiannaHohensee May 9, 2025

Can we change `final int numShards = randomIntBetween(1, 10);` to `final int numShards = randomIntBetween(1, 9);` to make this easier? The random number range is arbitrary, so I think that would be okay.
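
To illustrate the suggestion, here is a minimal, self-contained sketch of the two pattern-building options. The class and method names (`RunningShardsPattern`, `patternFor`, `simplePatternFor`) and the sample log line are hypothetical and not part of the PR; only the regex strings mirror the diff above. With the range capped at 9, a single character class covers every case and the special branch for 10 goes away.

```java
import java.util.regex.Pattern;

// Hypothetical illustration class, not code from the PR.
class RunningShardsPattern {

    // Mirrors the expression in the diff: a character class for 1..9,
    // and an alternation when numShards is 10 (two digits cannot fit in one class).
    static String patternFor(int numShards) {
        String count = numShards < 10 ? "[1-" + numShards + "]" : "([1-9]|10)";
        return ".+Number shard snapshots running \\[" + count + "].+";
    }

    // Simplified variant, assuming numShards is drawn from 1..9 as suggested.
    static String simplePatternFor(int numShards) {
        if (numShards < 1 || numShards > 9) {
            throw new IllegalArgumentException("expected 1..9 but got " + numShards);
        }
        return ".+Number shard snapshots running \\[[1-" + numShards + "]].+";
    }

    public static void main(String[] args) {
        // Made-up sample text; the real log line contains more fields.
        String line = "prefix Number shard snapshots running [3]. suffix";
        System.out.println(Pattern.matches(patternFor(5), line));
        System.out.println(Pattern.matches(simplePatternFor(5), line));
        System.out.println(Pattern.matches(simplePatternFor(2), line));
    }
}
```

Both variants accept any running count from 1 up to `numShards`; the simplified one just rejects inputs outside 1..9 up front instead of carrying a second regex branch.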

Contributor

++ (or else maybe leave this as a SeenEventExpectation and parse out the numShards by overriding innerMatch rather than trying to do anything overly clever with a regex)
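
As a rough sketch of the parsing alternative mentioned here: extract the number from the message and compare it numerically instead of encoding the range in a regex. The class `RunningShardsCheck` and the method `runningCountInRange` are hypothetical names for illustration, and this standalone helper deliberately does not reproduce the real `innerMatch` signature; the same logic would presumably sit inside such an override.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration class, not code from the PR.
class RunningShardsCheck {

    // Captures the count inside the brackets; limited to 9 digits so parseInt cannot overflow.
    private static final Pattern RUNNING =
        Pattern.compile("Number shard snapshots running \\[(\\d{1,9})\\]");

    // Returns true if the message reports between 1 and numShards running shard snapshots.
    static boolean runningCountInRange(String message, int numShards) {
        Matcher matcher = RUNNING.matcher(message);
        if (!matcher.find()) {
            return false; // the message does not contain the expected fragment
        }
        int running = Integer.parseInt(matcher.group(1));
        return 1 <= running && running <= numShards;
    }

    public static void main(String[] args) {
        // Made-up sample text; the real log line contains more fields.
        String line = "prefix Number shard snapshots running [10]. suffix";
        System.out.println(runningCountInRange(line, 10));
        System.out.println(runningCountInRange(line, 9));
    }
}
```

This keeps the check correct for any upper bound, so the random range would not need to change if this route were taken.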

Contributor

@DaveCTurner DaveCTurner left a comment

LGTM

@JeremyDahlgren JeremyDahlgren merged commit d07ec0c into elastic:main May 13, 2025
17 checks passed
richard-dennehy pushed a commit to richard-dennehy/elasticsearch that referenced this pull request May 19, 2025
…er() (elastic#127998)

benchaplin pushed a commit to benchaplin/elasticsearch that referenced this pull request May 20, 2025
…er() (elastic#127998)

Development

Successfully merging this pull request may close these issues.

[CI] SnapshotShutdownIT testSnapshotShutdownProgressTracker failing
