Skip to content

Commit 651314e

Browse files
authored
Do not pass ProjectMetadata to lazy index permissions builder (#135337)
During a serverless incident (INC-4832) that was caused by frequent OOM exceptions it was discovered that ~30% of the heap was occupied by `ProjectMetadata` instances. The `ProjectMetadata` instances were retained by a lambda in `IndicesPermission`, see this example of a path to gc root: <img width="2760" height="915" alt="9a9f6dfd-bd11-41ac-a0e2-345a86ba0509" src="https://github.com/user-attachments/assets/7de8b33e-6002-4330-87e0-28a6ab7aeeac" /> The reason the lambda exists is to make the [index access control lazy](#88708). Because the lambda is lazy, it will hold on to the reference to `ProjectMetadata` for the full request life cycle (as opposed to building the index permissions and dropping the reference). This becomes a problem when there are many concurrent searches (index actions requiring us to check index permissions) coupled with frequent `ProjectMetadata` updates. Since the lambda holds a reference to `ProjectMetadata` it can't be garbage collected. I've proven this by: 1. Adding a sleep to `TransportSearchAction` to simulate slow searches 2. Hook up visual vm to Elasticsearch 3. Launch "slow" searches with `ProjectMetadata` updates in between (triggered by creating new indices) 4. Trigger GC manually through visual vm 5. Observe memory usage by `ProjectMetadata` while the searches are hanging (to simulate request in flight) ### Before any requests <img width="814" height="619" alt="Screenshot 2025-09-24 at 13 52 34" src="https://github.com/user-attachments/assets/5732a317-e298-4c90-bf8e-b5c211481c5d" /> ### While requests are in flight <img width="798" height="668" alt="Screenshot 2025-09-24 at 14 31 41" src="https://github.com/user-attachments/assets/c5e5e9ab-e95c-4a78-b0fd-f4bcc5d3149e" /> ### Fix To fix this issue I've moved the part that needed `ProjectMetadata` outside of the lambda. `ProjectMetadata` was needed to resolved failure store indices. With this PR we will do some more work that #88708 tried to remove, but I think it's acceptable for the memory gain. To validate that this fixed the issue I ran the same test as above and could see that `ProjectMetadata` could be garbaged collected as soon as authorization was finished. <img width="945" height="722" alt="Screenshot 2025-09-24 at 14 04 37" src="https://github.com/user-attachments/assets/8f70b388-2150-4768-a5bf-ac10dd36b41c" />
1 parent 9316d64 commit 651314e

File tree

2 files changed

+21
-5
lines changed

2 files changed

+21
-5
lines changed

docs/changelog/135337.yaml

Lines changed: 5 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,5 @@
1+
pr: 135337
2+
summary: Do not pass `ProjectMetadata` to lazy index permissions builder
3+
area: Security
4+
type: enhancement
5+
issues: []

x-pack/plugin/core/src/main/java/org/elasticsearch/xpack/core/security/authz/permission/IndicesPermission.java

Lines changed: 16 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -461,6 +461,12 @@ private IndexResource(String name, @Nullable IndexAbstraction abstraction, @Null
461461
this.selector = selector;
462462
}
463463

464+
public List<Index> getFailureIndices(ProjectMetadata metadata) {
465+
return indexAbstraction != null && IndexComponentSelector.FAILURES.equals(selector)
466+
? indexAbstraction.getFailureIndices(metadata)
467+
: List.of();
468+
}
469+
464470
/**
465471
* @return {@code true} if-and-only-if this object is related to a data-stream, either by having a
466472
* {@link IndexAbstraction#getType()} of {@link IndexAbstraction.Type#DATA_STREAM} or by being the backing index for a
@@ -535,13 +541,12 @@ public int size(Map<String, IndexAbstraction> lookup) {
535541
}
536542
}
537543

538-
public Collection<String> resolveConcreteIndices(ProjectMetadata metadata) {
544+
public Collection<String> resolveConcreteIndices(List<Index> failureIndices) {
539545
if (indexAbstraction == null) {
540546
return List.of();
541547
} else if (indexAbstraction.getType() == IndexAbstraction.Type.CONCRETE_INDEX) {
542548
return List.of(indexAbstraction.getName());
543549
} else if (IndexComponentSelector.FAILURES.equals(selector)) {
544-
final List<Index> failureIndices = indexAbstraction.getFailureIndices(metadata);
545550
final List<String> concreteIndexNames = new ArrayList<>(failureIndices.size());
546551
for (var idx : failureIndices) {
547552
concreteIndexNames.add(idx.getName());
@@ -604,12 +609,16 @@ public IndicesAccessControl authorize(
604609

605610
final boolean overallGranted = isActionGranted(action, resources.values());
606611
final int finalTotalResourceCount = totalResourceCount;
612+
final var failureIndicesByResourceName = resources.entrySet()
613+
.stream()
614+
.collect(Collectors.toMap(Map.Entry::getKey, entry -> entry.getValue().getFailureIndices(metadata)));
615+
607616
final Supplier<Map<String, IndicesAccessControl.IndexAccessControl>> indexPermissions = () -> buildIndicesAccessControl(
608617
action,
609618
resources,
610619
finalTotalResourceCount,
611620
fieldPermissionsCache,
612-
metadata
621+
failureIndicesByResourceName
613622
);
614623

615624
return new IndicesAccessControl(overallGranted, indexPermissions);
@@ -620,7 +629,7 @@ private Map<String, IndicesAccessControl.IndexAccessControl> buildIndicesAccessC
620629
final Map<String, IndexResource> requestedResources,
621630
final int totalResourceCount,
622631
final FieldPermissionsCache fieldPermissionsCache,
623-
final ProjectMetadata metadata
632+
final Map<String, List<Index>> failureIndicesByIndexResource
624633
) {
625634

626635
// now... every index that is associated with the request, must be granted
@@ -636,7 +645,9 @@ private Map<String, IndicesAccessControl.IndexAccessControl> buildIndicesAccessC
636645
boolean granted = false;
637646
final String resourceName = resourceEntry.getKey();
638647
final IndexResource resource = resourceEntry.getValue();
639-
final Collection<String> concreteIndices = resource.resolveConcreteIndices(metadata);
648+
final Collection<String> concreteIndices = resource.resolveConcreteIndices(
649+
failureIndicesByIndexResource.get(resourceEntry.getKey())
650+
);
640651
for (Group group : groups) {
641652
// the group covers the given index OR the given index is a backing index and the group covers the parent data stream
642653
if (resource.checkIndex(group)) {

0 commit comments

Comments
 (0)