Commit 651314e
Do not pass ProjectMetadata to lazy index permissions builder (#135337)
During a serverless incident (INC-4832) that was caused by frequent OOM
exceptions it was discovered that ~30% of the heap was occupied by
`ProjectMetadata` instances.
The `ProjectMetadata` instances were retained by a lambda in
`IndicesPermission`, see this example of a path to gc root: <img
width="2760" height="915" alt="9a9f6dfd-bd11-41ac-a0e2-345a86ba0509"
src="https://github.com/user-attachments/assets/7de8b33e-6002-4330-87e0-28a6ab7aeeac"
/>
The reason the lambda exists is to make the [index access control
lazy](#88708). Because the
lambda is lazy, it will hold on to the reference to `ProjectMetadata`
for the full request life cycle (as opposed to building the index
permissions and dropping the reference). This becomes a problem when
there are many concurrent searches (index actions requiring us to check
index permissions) coupled with frequent `ProjectMetadata` updates.
Since the lambda holds a reference to `ProjectMetadata` it can't be
garbage collected.
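
As a rough illustration of the retention described above (not the actual `IndicesPermission` code), here is a minimal sketch. `Snapshot` is a hypothetical stand-in for `ProjectMetadata`, and `lazyCheck` and `snapshotRetainedWhileCheckIsHeld` are made-up names. It shows that a lazily evaluated lambda keeps everything it captures strongly reachable for as long as the lambda itself is held.

```java
import java.lang.ref.WeakReference;
import java.util.Set;
import java.util.function.Supplier;

public class LazyCaptureSketch {
    /** Hypothetical stand-in for ProjectMetadata: a large snapshot replaced on every update. */
    public record Snapshot(Set<String> failureStoreIndices, byte[] payload) {}

    /** Lazy check: the lambda captures the whole snapshot, not just the set it reads. */
    public static Supplier<Boolean> lazyCheck(Snapshot snapshot, String index) {
        return () -> snapshot.failureStoreIndices().contains(index);
    }

    /** Reports whether the snapshot is still reachable after a GC request while the lambda is held. */
    public static boolean snapshotRetainedWhileCheckIsHeld() {
        Snapshot snapshot = new Snapshot(Set.of("logs-fs"), new byte[1 << 20]);
        WeakReference<Snapshot> ref = new WeakReference<>(snapshot);
        Supplier<Boolean> check = lazyCheck(snapshot, "logs-fs");
        snapshot = null; // drop our own strong reference; only the lambda holds one now
        System.gc();     // only a hint, but a strongly reachable object is never collected
        boolean retained = ref.get() != null;
        check.get();     // use the lambda after the GC request so it stays live until here
        return retained;
    }

    public static void main(String[] args) {
        System.out.println("snapshot retained while check is held: "
            + snapshotRetainedWhileCheckIsHeld());
    }
}
```

In the real scenario every in-flight request holds such a lambda, so each one can pin a different, already outdated `ProjectMetadata` instance.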
I've proven this by:
1. Adding a sleep to `TransportSearchAction` to simulate slow searches
2. Hooking up VisualVM to Elasticsearch
3. Launching "slow" searches with `ProjectMetadata` updates in between
   (triggered by creating new indices)
4. Triggering GC manually through VisualVM
5. Observing memory usage by `ProjectMetadata` while the searches are
   hanging (to simulate requests in flight)
### Before any requests
<img width="814" height="619" alt="Screenshot 2025-09-24 at 13 52 34"
src="https://github.com/user-attachments/assets/5732a317-e298-4c90-bf8e-b5c211481c5d"
/>
### While requests are in flight
<img width="798" height="668"
alt="Screenshot 2025-09-24 at 14 31 41"
src="https://github.com/user-attachments/assets/c5e5e9ab-e95c-4a78-b0fd-f4bcc5d3149e"
/>
### Fix
To fix this issue I've moved the part that needed `ProjectMetadata`
outside of the lambda. `ProjectMetadata` was needed to resolve failure
store indices. With this PR we will do some more of the work that
#88708 tried to remove, but I think it's acceptable for the memory gain.
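
The idea can be sketched as follows. This is a simplified illustration under assumptions, not the code from this PR: `Snapshot` again stands in for `ProjectMetadata`, while `accessCheck`, `grantedPrefixes`, and the toy rule that failure store indices are denied are all hypothetical. The part that needs the snapshot runs eagerly, and the lambda captures only the small result, so the rest of the check can stay lazy.

```java
import java.util.List;
import java.util.Set;
import java.util.function.Supplier;

public class EagerResolveSketch {
    /** Hypothetical stand-in for ProjectMetadata. */
    public record Snapshot(Set<String> failureStoreIndices, byte[] payload) {}

    /**
     * Resolves the snapshot-dependent part up front and returns a lazy check
     * that no longer references the snapshot.
     */
    public static Supplier<Boolean> accessCheck(Snapshot snapshot, String index,
                                                List<String> grantedPrefixes) {
        // Eager work: this is the only place the snapshot is read.
        final boolean isFailureStoreIndex = snapshot.failureStoreIndices().contains(index);
        // Lazy work: captures a boolean, a string and a list, but not the snapshot.
        // Toy rule (assumption): failure store indices are denied, others need a matching prefix.
        return () -> isFailureStoreIndex == false
            && grantedPrefixes.stream().anyMatch(index::startsWith);
    }

    public static void main(String[] args) {
        Snapshot snapshot = new Snapshot(Set.of("logs-fs"), new byte[1 << 20]);
        Supplier<Boolean> check = accessCheck(snapshot, "logs-app", List.of("logs-"));
        snapshot = null; // the lambda does not keep the snapshot reachable
        System.out.println("access granted: " + check.get());
    }
}
```

The trade-off is the one noted above: the failure store lookup now happens even if the lazy check is never evaluated, in exchange for the snapshot becoming collectable as soon as authorization finishes.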
To validate that this fixed the issue I ran the same test as above and
could see that `ProjectMetadata` could be garbage collected as soon as
authorization was finished.
<img width="945" height="722" alt="Screenshot 2025-09-24 at 14 04 37"
src="https://github.com/user-attachments/assets/8f70b388-2150-4768-a5bf-ac10dd36b41c"
/>
1 parent 9316d64 commit 651314e
2 files changed, +21 -5 lines changed:
- docs/changelog (5 additions)
- x-pack/plugin/core/src/main/java/org/elasticsearch/xpack/core/security/authz/permission (16 additions, 5 deletions)