-
Notifications
You must be signed in to change notification settings - Fork 25.6k
Account for reserved cluster state role mappings #113687
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Hi @n1v0lg, I've created a changelog YAML for you. |
…asticsearch into fix-cluster-state-role-mapping
| new ReservedRoleMappings(clusterStateRoleMapper) | ||
| ); | ||
| final ClusterStateRoleMapper clusterStateRoleMapper = new ClusterStateRoleMapper(settings, scriptService, clusterService); | ||
| final UserRoleMapper userRoleMapper = new CompositeRoleMapper(nativeRoleMappingStore, clusterStateRoleMapper); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change (getting rid of CompositeRoleMapper) is a little nuanced. The reason I'm making it is:
The PR changes the NativeRoleMappingStore to fetch "reserved" (i.e., cluster-state-based) role mappings and use these instead of any native role mappings with the same name (see mergeWithReserved). This means that we read cluster-state role mappings in the NativeRoleMappingStore, and substitute mappings based on what we read. If we stuck with the CompositeRoleMapper here, we'd "double-read" cluster-state-based role mappings.
An alternative I originally considered is to filter out native role mappings that match reserved ones and then simply rely on CompositeRoleMapper here to fill in cluster-state ones. However, this would introduce a race condition:
- When we get role mappings from the native store, we read cluster-state and filter out.
- We returning the result.
- Concurrently, cluster state changes (mappings added or removed).
- We read cluster state again, including the new mappings.
However, this means we have a "stale" view obtained in step 2.
By going with current approach in this PR instead (remove CompositeRoleMapper), we avoid this race condition since we only read cluster state once.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tricky! I think your change here looks good. I thought of an alternative explained below.
I think it would be clearer to keep the role mapping stores separate and then leave the merging of them to some external class. Priorities can change and the way we want to merge them could change too. With your change ClusterStateRoleMapper is injected into the NativeRoleMappingStore through the wrapping class ReservedRoleMappings and you're using ClusterStateRoleMapper "internals" (getMappings - not exactly an internal but I think it was only public for testing purposes). It does violate the UserRoleMapper interface a little.
I think adding a new MergingCompositeRoleMapper (or a better name) that handles both read and write (so both getRoleMappings and putRoleMapping) would be more general. For read it would merge the result of the provided UserRoleMappers where priority could be the order of arguments passed (so ClusterStateRoleMapper would be passed first to have priority). For writes it could check the ClusterStateRoleMapper for reserved names before writing (according to the configuration) to the native store.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've considered this (see a stub of it here) however it comes out a much bigger change, and I'm not sure it's worth it because the abstraction itself is sort of neither here nor there. The problems are:
The new class is neither Merging, nor a role mapper. Instead it's a class that merges on reads, but prevents reserved writes. So it's more of a ReservedRoleMappingWrappedStore. And it's not just a role mapper (that only handles resolving roles) but a role mapping store. The latter requires a whole new interface or abstract class, and plugging that interface in everywhere NativeRoleMappingStore is currently used. Do-able, but NativeRoleMappingStore is used in several actions and tests.
I'm just not sure that it's worth it to introduce all this generic structure to handle something that's unlikely to generalize. The change is already quite nuanced so I worry that making the delta bigger makes it harder to review and reason about. WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I see what you mean. The interfaces are note exactly built for that and the change is big.
I'm just not sure that it's worth it to introduce all this generic structure to handle something that's unlikely to generalize. WDYT?
I think the coupling this introduces is a little concerning, but I don't think this is the right time to be making a huge refactor, so I think leaving it is fine.
| } else { | ||
| names = new HashSet<>(Arrays.asList(request.getNames())); | ||
| } | ||
| // TODO make sure result is deterministic |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I mean here is ordered the same each time: the respond now contains the output of ClusterStateRoleMapper#getMappings which is a set and therefore not well-ordered.
jakelandis
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looking good (only a quick review so far)
| * of the {@link NativeRoleMappingStore} and this mapper.</li> | ||
| * </ul> | ||
| */ | ||
| // TODO we need to register this setting as an escape hatch |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we should expose this as setting if it is always true and will break things if we set it false. I think we rely (and should double check) the no-op,no-error behavior when the settings.json is not present. We should update the comments to ensure that it is know this is the ECK and serverless path.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we should expose this as setting if it is always true and will break things if we set it false.
I'm thinking of stateful, non-ECK deployments. The cluster-state role mapper still has some overhead in that it checks cluster state. If there is an issue we aren't thinking off, I'd really like there to be an escape hatch for stateful, non-ECK customers that by definition don't use cluster-state role mappings. I don't feel very strongly, but I think having a filtered, undocumented setting is a nice-to-have.
| this.scriptService = scriptService; | ||
| this.lastLoadCacheEnabled = LAST_LOAD_CACHE_ENABLED_SETTING.get(settings); | ||
| this.enabled = settings.getAsBoolean(NATIVE_ROLE_MAPPINGS_ENABLED, true); | ||
| this.reservedRoleMappings = reservedRoleMappings; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this since it simplifies things and nicely makes this change a no-op for serverless since IIUC this is completely disabled in serverless (and likley always will be)
| public void getRoleMappings(Set<String> names, ActionListener<List<ExpressionRoleMapping>> listener) { | ||
| innerGetRoleMappings( | ||
| names, | ||
| listener.delegateFailureAndWrap((l, mappings) -> l.onResponse(reservedRoleMappings.mergeWithReserved(mappings))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I am getting abit lost to where ClusterStateRoleMapper#resolveRoles for reservedRoleMappings is called so that is garunteed to have the mappings already?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ClusterStateRoleMapper::resolveRoles is just wrapping ClusterStateRoleMapper::getMappings that looks at the current cluster state and resolves the roles from the current cluster state. ClusterStateRoleMapper::resolveRoles was previously called from the CompositeRoleMapper but with this change getMappings is instead called directly (so resolveRoles is skipped) as far as I can see.
| return clusterStateRoleMapper.getMappings().stream().anyMatch(roleMapping -> roleMapping.getName().equals(roleMappingName)); | ||
| } | ||
|
|
||
| // TODO find a cleaner way |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this approach would also be cleaner here.
|
Closing in favor of: #113900 |
Draft PR for ECK regression around moving operator-defined role mappings from the security index to cluster-state (ES-9628).
Some background first that will make the implementation choices below easier to follow:
The PR contains the following changes, bringing API behavior back as closely as possible to what it was before the regression:
Implementation notes:
cluster_state_role_mappings.enabledtotrue. Ideally, we would instead enable them for ECK. However, this would require coupling this fix with an ECK release; furthermore, in ECK, users can modify any and all settings. I've made the PR as much of a noop as possible for when cluster-state role mappings are not populated. Therefore, having them always on should be low impact on environments where they are not available.