Skip to content

Conversation

benwtrent
Copy link
Member

There is a JDK issue where closing sharedArenas from many threads can significantly harm performance.

This ref-counting of shared arenas was designed as a way to get around this performance issue. However, we have noticed a significant increase in leaks and issues with mmap regions since this change.

https://bugs.openjdk.org/browse/JDK-8335480 should have helped the performance impact of closing shared arenas (though possibly not fully mitigated it).

I am proposing we turn off the grouping as it appears (at least to me), not worth it.

I am willing to backdown if we thing other fixes should be done.

I also suggest this gets backported to 9.1, 8.19, and is merged into 9.2

@benwtrent benwtrent added >bug :Core/Infra/Core Core issues without another label :Search/Search Search-related issues that do not fall into other categories :Distributed Indexing/Distributed A catch all label for anything in the Distributed Indexing Area. Please avoid if you can. v9.2.0 v8.19.5 v9.1.5 labels Sep 18, 2025
@elasticsearchmachine
Copy link
Collaborator

Hi @benwtrent, I've created a changelog YAML for you.

@benwtrent benwtrent marked this pull request as ready for review September 18, 2025 15:58
@benwtrent benwtrent requested a review from a team as a code owner September 18, 2025 15:58
@elasticsearchmachine elasticsearchmachine added Team:Core/Infra Meta label for core/infra team Team:Search Meta label for search team Team:Distributed Indexing Meta label for Distributed Indexing team labels Sep 18, 2025
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-search (Team:Search)

@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-core-infra (Team:Core/Infra)

@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-distributed-indexing (Team:Distributed Indexing)

@lkts
Copy link
Contributor

lkts commented Sep 18, 2025

I don't have anything to say about the code but i agree with the idea to disable this logic given the amount of issues we've had.

Copy link
Contributor

@ChrisHegarty ChrisHegarty left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@benwtrent and I chatted abut this offline, and the changes look good to me.

@benwtrent benwtrent added auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) auto-backport Automatically create backport pull requests when merged labels Sep 19, 2025
@elasticsearchmachine elasticsearchmachine merged commit 2672cd0 into elastic:main Sep 19, 2025
34 checks passed
@benwtrent benwtrent deleted the bugfix/bypass-mmmap-arena-grouping branch September 19, 2025 22:52
@elasticsearchmachine
Copy link
Collaborator

💔 Backport failed

Status Branch Result
8.19 Commit could not be cherrypicked due to conflicts
9.1

You can use sqren/backport to manually backport by running backport --upstream elastic/elasticsearch --pr 135012

benwtrent added a commit to benwtrent/elasticsearch that referenced this pull request Sep 19, 2025
…gions being mapped (elastic#135012)

There is a JDK issue where closing sharedArenas from many threads can
significantly harm performance. 

This ref-counting of shared arenas was designed as a way to get around
this performance issue. However, we have noticed a significant increase
in leaks and issues with mmap regions since this change. 

https://bugs.openjdk.org/browse/JDK-8335480 should have helped the
performance impact of closing shared arenas (though possibly not fully
mitigated it).

I am proposing we turn off the grouping as it appears (at least to me),
not worth it. 

I am willing to backdown if we thing other fixes should be done.

I also suggest this gets backported to 9.1, 8.19, and is merged into 9.2
elasticsearchmachine pushed a commit that referenced this pull request Sep 19, 2025
…gions being mapped (#135012) (#135115)

There is a JDK issue where closing sharedArenas from many threads can
significantly harm performance. 

This ref-counting of shared arenas was designed as a way to get around
this performance issue. However, we have noticed a significant increase
in leaks and issues with mmap regions since this change. 

https://bugs.openjdk.org/browse/JDK-8335480 should have helped the
performance impact of closing shared arenas (though possibly not fully
mitigated it).

I am proposing we turn off the grouping as it appears (at least to me),
not worth it. 

I am willing to backdown if we thing other fixes should be done.

I also suggest this gets backported to 9.1, 8.19, and is merged into 9.2
@benwtrent
Copy link
Member Author

💚 All backports created successfully

Status Branch Result
8.19

Questions ?

Please refer to the Backport tool documentation

@benwtrent
Copy link
Member Author

💚 All backports created successfully

Status Branch Result
9.0
8.18

Questions ?

Please refer to the Backport tool documentation

benwtrent added a commit to benwtrent/elasticsearch that referenced this pull request Sep 20, 2025
…gions being mapped (elastic#135012)

There is a JDK issue where closing sharedArenas from many threads can
significantly harm performance.

This ref-counting of shared arenas was designed as a way to get around
this performance issue. However, we have noticed a significant increase
in leaks and issues with mmap regions since this change.

https://bugs.openjdk.org/browse/JDK-8335480 should have helped the
performance impact of closing shared arenas (though possibly not fully
mitigated it).

I am proposing we turn off the grouping as it appears (at least to me),
not worth it.

I am willing to backdown if we thing other fixes should be done.

I also suggest this gets backported to 9.1, 8.19, and is merged into 9.2

(cherry picked from commit 2672cd0)
benwtrent added a commit to benwtrent/elasticsearch that referenced this pull request Sep 20, 2025
…gions being mapped (elastic#135012)

There is a JDK issue where closing sharedArenas from many threads can
significantly harm performance.

This ref-counting of shared arenas was designed as a way to get around
this performance issue. However, we have noticed a significant increase
in leaks and issues with mmap regions since this change.

https://bugs.openjdk.org/browse/JDK-8335480 should have helped the
performance impact of closing shared arenas (though possibly not fully
mitigated it).

I am proposing we turn off the grouping as it appears (at least to me),
not worth it.

I am willing to backdown if we thing other fixes should be done.

I also suggest this gets backported to 9.1, 8.19, and is merged into 9.2

(cherry picked from commit 2672cd0)
elasticsearchmachine pushed a commit that referenced this pull request Sep 20, 2025
…any regions being mapped (#135012) (#135131)

* Bypass MMap arena grouping as this has caused issues with too many regions being mapped (#135012)

There is a JDK issue where closing sharedArenas from many threads can
significantly harm performance.

This ref-counting of shared arenas was designed as a way to get around
this performance issue. However, we have noticed a significant increase
in leaks and issues with mmap regions since this change.

https://bugs.openjdk.org/browse/JDK-8335480 should have helped the
performance impact of closing shared arenas (though possibly not fully
mitigated it).

I am proposing we turn off the grouping as it appears (at least to me),
not worth it.

I am willing to backdown if we thing other fixes should be done.

I also suggest this gets backported to 9.1, 8.19, and is merged into 9.2

(cherry picked from commit 2672cd0)

* [CI] Auto commit changes from spotless

* fixing format

* fixing compilation

---------

Co-authored-by: elasticsearchmachine <[email protected]>
elasticsearchmachine pushed a commit that referenced this pull request Sep 20, 2025
…many regions being mapped (#135012) (#135133)

* Bypass MMap arena grouping as this has caused issues with too many regions being mapped (#135012)

There is a JDK issue where closing sharedArenas from many threads can
significantly harm performance.

This ref-counting of shared arenas was designed as a way to get around
this performance issue. However, we have noticed a significant increase
in leaks and issues with mmap regions since this change.

https://bugs.openjdk.org/browse/JDK-8335480 should have helped the
performance impact of closing shared arenas (though possibly not fully
mitigated it).

I am proposing we turn off the grouping as it appears (at least to me),
not worth it.

I am willing to backdown if we thing other fixes should be done.

I also suggest this gets backported to 9.1, 8.19, and is merged into 9.2

(cherry picked from commit 2672cd0)

* [CI] Auto commit changes from spotless

* fixing compilation

---------

Co-authored-by: elasticsearchmachine <[email protected]>
elasticsearchmachine pushed a commit that referenced this pull request Sep 20, 2025
…many regions being mapped (#135012) (#135130)

* Bypass MMap arena grouping as this has caused issues with too many regions being mapped (#135012)

There is a JDK issue where closing sharedArenas from many threads can
significantly harm performance.

This ref-counting of shared arenas was designed as a way to get around
this performance issue. However, we have noticed a significant increase
in leaks and issues with mmap regions since this change.

https://bugs.openjdk.org/browse/JDK-8335480 should have helped the
performance impact of closing shared arenas (though possibly not fully
mitigated it).

I am proposing we turn off the grouping as it appears (at least to me),
not worth it.

I am willing to backdown if we thing other fixes should be done.

I also suggest this gets backported to 9.1, 8.19, and is merged into 9.2

(cherry picked from commit 2672cd0)

* [CI] Auto commit changes from spotless

* fixing compilation

---------

Co-authored-by: elasticsearchmachine <[email protected]>
gmjehovich pushed a commit to gmjehovich/elasticsearch that referenced this pull request Sep 22, 2025
…gions being mapped (elastic#135012)

There is a JDK issue where closing sharedArenas from many threads can
significantly harm performance. 

This ref-counting of shared arenas was designed as a way to get around
this performance issue. However, we have noticed a significant increase
in leaks and issues with mmap regions since this change. 

https://bugs.openjdk.org/browse/JDK-8335480 should have helped the
performance impact of closing shared arenas (though possibly not fully
mitigated it).

I am proposing we turn off the grouping as it appears (at least to me),
not worth it. 

I am willing to backdown if we thing other fixes should be done.

I also suggest this gets backported to 9.1, 8.19, and is merged into 9.2
DonalEvans pushed a commit to DonalEvans/elasticsearch that referenced this pull request Sep 22, 2025
…gions being mapped (elastic#135012)

There is a JDK issue where closing sharedArenas from many threads can
significantly harm performance. 

This ref-counting of shared arenas was designed as a way to get around
this performance issue. However, we have noticed a significant increase
in leaks and issues with mmap regions since this change. 

https://bugs.openjdk.org/browse/JDK-8335480 should have helped the
performance impact of closing shared arenas (though possibly not fully
mitigated it).

I am proposing we turn off the grouping as it appears (at least to me),
not worth it. 

I am willing to backdown if we thing other fixes should be done.

I also suggest this gets backported to 9.1, 8.19, and is merged into 9.2
@javanna
Copy link
Member

javanna commented Sep 23, 2025

++ from me as well, thanks for opening this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

auto-backport Automatically create backport pull requests when merged auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) >bug :Core/Infra/Core Core issues without another label :Distributed Indexing/Distributed A catch all label for anything in the Distributed Indexing Area. Please avoid if you can. :Search/Search Search-related issues that do not fall into other categories Team:Core/Infra Meta label for core/infra team Team:Distributed Indexing Meta label for Distributed Indexing team Team:Search Meta label for search team v8.18.8 v8.19.5 v9.0.8 v9.1.5 v9.2.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants