Skip to content

Conversation

samxbr
Copy link
Contributor

@samxbr samxbr commented Sep 16, 2024

Allows setting index total_shards_per_node in the SearchableSnapshot action of ILM to remediate hot spot in shard allocation for searchable snapshot index.

Closes #112261

@samxbr samxbr changed the title Feature/add searchable snapshot setting ILM: Add total_shards_per_node setting to searchable snapshot Sep 16, 2024
@samxbr samxbr added >feature :Data Management/ILM+SLM Index and Snapshot lifecycle management >enhancement and removed >feature labels Sep 16, 2024
@samxbr samxbr marked this pull request as ready for review September 16, 2024 22:39
@samxbr samxbr requested a review from a team September 16, 2024 22:40
@elasticsearchmachine elasticsearchmachine added the Team:Data Management Meta label for data/management team label Sep 16, 2024
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-data-management (Team:Data Management)

@elasticsearchmachine
Copy link
Collaborator

Hi @samxbr, I've created a changelog YAML for you.

Copy link
Member

@dakrone dakrone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this Sam! I left some comments, but nothing major, this is pretty close!

"searchable_snapshot" : {
"snapshot_repository" : "backing_repo"
"snapshot_repository" : "backing_repo",
"total_shards_per_node" : 2
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a pretty power-user sort of feature, so I think it'd be best not to have it in the example (I worry about unthinking copy-and-paste)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make sense

this.forceMergeIndex = forceMergeIndex;

if (totalShardsPerNode != null && totalShardsPerNode < -1) {
throw new IllegalArgumentException("[" + TOTAL_SHARDS_PER_NODE.getPreferredName() + "] must be >= -1");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
throw new IllegalArgumentException("[" + TOTAL_SHARDS_PER_NODE.getPreferredName() + "] must be >= -1");
throw new IllegalArgumentException("[" + TOTAL_SHARDS_PER_NODE.getPreferredName() + "] must be >= 1");

this.snapshotRepository = snapshotRepository;
this.forceMergeIndex = forceMergeIndex;

if (totalShardsPerNode != null && totalShardsPerNode < -1) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (totalShardsPerNode != null && totalShardsPerNode < -1) {
if (totalShardsPerNode != null && totalShardsPerNode < 1) {

(since we want to disallow -1 and 0)

Collections.singletonMap(
SearchableSnapshotAction.NAME,
new SearchableSnapshotAction(randomAlphaOfLength(10), randomBoolean())
new SearchableSnapshotAction(randomAlphaOfLength(10), randomBoolean(), randomIntBetween(-1, 100))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

0 is not a valid setting (it would be like saying "you can allocate zero copies of the data to the node"), so I think this would be better as something like:

Suggested change
new SearchableSnapshotAction(randomAlphaOfLength(10), randomBoolean(), randomIntBetween(-1, 100))
new SearchableSnapshotAction(randomAlphaOfLength(10), randomBoolean(), (randomBoolean() ? null : randomIntBetween(1, 100)))

}

private Integer randomTotalShardsPerNode() {
return randomIntBetween(-1, 100);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest encapsulating the nullability (and fixing the bounds) with:

Suggested change
return randomIntBetween(-1, 100);
return randomBoolean() ? null : randomIntBetween(1, 100);

@samxbr
Copy link
Contributor Author

samxbr commented Sep 20, 2024

CI tests failing due to #113296

@samxbr
Copy link
Contributor Author

samxbr commented Sep 20, 2024

@elasticmachine update branch

Copy link
Member

@dakrone dakrone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, I left one more comment, thanks Sam!

this.restoredIndexPrefix = restoredIndexPrefix;
this.storageType = Objects.requireNonNull(storageType, "a storage type must be specified");
if (totalShardsPerNode != null && totalShardsPerNode < 1) {
throw new IllegalArgumentException("[totalShardsPerNode] must be >= 1");
Copy link
Member

@dakrone dakrone Sep 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should use the preferred name (and to match the config flag name), something like:

Suggested change
throw new IllegalArgumentException("[totalShardsPerNode] must be >= 1");
throw new IllegalArgumentException("[" + SearchableSnapshotAction.TOTAL_SHARDS_PER_NODE.getPreferredName() + "] must be >= 1");

@samxbr samxbr merged commit 80dd563 into elastic:main Sep 23, 2024
15 checks passed
@javanna
Copy link
Member

javanna commented Sep 24, 2024

Double checking: was this change supposed to go to main only? Or should it have been backported to 8.x?

@samxbr samxbr added v8.16.0 auto-backport Automatically create backport pull requests when merged labels Sep 24, 2024
@samxbr
Copy link
Contributor Author

samxbr commented Sep 24, 2024

Thanks @javanna for the reminder, I added backport for 8.16.

@javanna
Copy link
Member

javanna commented Sep 24, 2024

@samxbr I'm afraid you'll have to backport manually, because the PR has already been merged.

samxbr added a commit to samxbr/elasticsearch that referenced this pull request Sep 24, 2024
…c#112972)

Allows setting index total_shards_per_node in the SearchableSnapshot action of ILM to remediate hot spot in shard allocation for searchable snapshot index.

Closes elastic#112261
elasticsearchmachine pushed a commit that referenced this pull request Sep 24, 2024
… (#113493)

Allows setting index total_shards_per_node in the SearchableSnapshot action of ILM to remediate hot spot in shard allocation for searchable snapshot index.

Closes #112261
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

auto-backport Automatically create backport pull requests when merged :Data Management/ILM+SLM Index and Snapshot lifecycle management >enhancement Team:Data Management Meta label for data/management team v8.16.0 v9.0.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Support setting total_shards_per_node in the allocate action in the frozen tier of an ILM policy

5 participants