-
Notifications
You must be signed in to change notification settings - Fork 25.6k
Add a monitor_stats privilege and allow that privilege for remote cluster privileges #114964
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Pinging @elastic/es-security (Team:Security) |
Re the CI failure: we also need to update the failing privilege registry test in the serverless repo, by adding the new priv there. This is currently unnecessarily a two-step process. I have a v old PR open to make it a one step process instead (#110178). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, I like the BWC-handling this introduces. I have a high level question before proceeding: for RCS 1.0, the new privilege will be included in the cluster
section, not the remote_cluster
section so we need to handle BWC there too, otherwise CCS becomes unusable with API keys with that cluster privilege (see my inline comment here). Let me know if I misunderstood s.th. though!
...ck/plugin/core/src/main/java/org/elasticsearch/xpack/core/security/authc/Authentication.java
Show resolved
Hide resolved
roleName, | ||
currentFieldName | ||
); | ||
logger.warn(message); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
only concern is that it's a warn log that's triggered by client-side errors -- if we ever monitor for WARN log rates, a user with a loop and a misconception of our APIs can get noisy, without any recourse for us. I'd go with INFO here tbh
* and {@link Role}. This model is not intended to be sent to a remote cluster, but can be (wire) serialized within a single cluster | ||
* as well as the Xcontent serialization for the REST API and persistence of the role in the security index. The privileges modeled here | ||
* will be converted to the appropriate cluster privileges when sent to a remote cluster. | ||
* and {@link Role}. This model is not intended to support RCS 2.0 where these remote cluster permissions are converted to local cluster |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this bit "This model is not intended to support RCS 2.0" is quite confusing -- the sole reason remote_cluster
permissions exist is to support RCS 2.0 (for RCS 1.0 it's just cluster
permissions). I think the Javadoc before this change was clearer
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe it's just a case of an extra "not" though...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah.. i must have written that pretty fast. will update.
...va/org/elasticsearch/xpack/core/security/authz/store/KibanaOwnedReservedRoleDescriptors.java
Show resolved
Hide resolved
...va/org/elasticsearch/xpack/core/security/authz/store/KibanaOwnedReservedRoleDescriptors.java
Show resolved
Hide resolved
...va/org/elasticsearch/xpack/core/security/authz/store/KibanaOwnedReservedRoleDescriptors.java
Show resolved
Hide resolved
// Update remote cluster settings on QC. | ||
setupQueryClusterRemoteClusters(useProxyMode); | ||
// Ensure remote cluster is connected | ||
if (isRCS2()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't tested this but you should be able to replace this whole block with configureRemoteCluster(REMOTE_CLUSTER_ALIAS, fulfillingCluster, isRCS2(), randomBoolean(), false); super.setUp();
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Beyond that, I'd actually leave this method abstract and just let the concrete implementations invoke the method with the right parameters instead of using isRCS2()
here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks, that cleaned it up a bit, but I needed to keep isRCS2()
to conditionally add or not add the mirror'ed role for RCS 1.0 to the fulfilling cluster.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM - few suggestions and smaller test fixes but nothing that requires another review 👍
As follow ups, we'll also want to add the new privilege to https://github.com/elastic/elasticsearch-specification/blob/main/specification/security/_types/Privileges.ts#L43
and mark as non-serverless-supported.
} | ||
} else if (authentication.getEffectiveSubject() | ||
.getTransportVersion() | ||
.onOrAfter(TRANSPORT_VERSION_ADVANCED_REMOTE_CLUSTER_SECURITY) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think TRANSPORT_VERSION_ADVANCED_REMOTE_CLUSTER_SECURITY
should be ROLE_REMOTE_CLUSTER_PRIVS
-- otherwise, we'll include versions for which there is no way that the authentication subject would ever have a remote_cluster
field to begin with.
Also, I think a comment here explaining that maybeRemoveRemoteClusterPrivilegesFromRoleDescriptors
handles more specific transport version checks would help here I think. It breaks with the pattern of
if (authentication.onOrAfter(version) && streamVersions.before(version)) { do something }
and looks a bit like a copy-paste bug at first glance
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think TRANSPORT_VERSION_ADVANCED_REMOTE_CLUSTER_SECURITY should be ROLE_REMOTE_CLUSTER_PRIVS -- otherwise, we'll include versions for which there is no way that the authentication subject would ever have a remote_cluster field to begin with.
I'll change it, but don't think it matters. IIUC this specific check is really on needed for being able to know, in the local cluster, for any persisted authentication (e.g. watcher, ilm) done with an API Key, if we are working with a byte[] representation or the older format for persisted Authentication objects. So I believe that any value after VERSION_API_KEY_ROLES_AS_BYTES would work fine. I don't have specific test for this, but since we convert byte[] to Map, this check should only be concerned with VERSION_API_KEY_ROLES_AS_BYTES in this cluster. The other check && streamVersion.onOrAfter(ROLE_REMOTE_CLUSTER_PRIVS)) {
will protect against any problems with the remote understanding remote_cluster.
For example, if the persisted authentication had the transport version before ROLE_REMOTE_CLUSTER_PRIVS (and sending to a new cluster is new enough), then execution will fall into the else condition but the persisted authentication object would never have a remote_cluster entry and effectively do nothing (OK it does a bit of necessary work). It won't blow up since we are just manipulating json/byte[] and maps. I agree that it confusing as-is and using ROLE_REMOTE_CLUSTER_PRIVS probably helps a bit with readability (and avoid some uncessary work). Long term, we should look for a way to stop explicitly rewriting these things and make it a first class attribute of model that understands authentication and transport version concerns. Fundamentally all of the checks in this method does the same thing...strip out things that streams of specific versions won't understand and IMO it plausible to have a single stragegy (for both RCS and node-to-node in same cluster).
(also updated comments to hopefull be bit more clear)
private final List<RemoteClusterPermissionGroup> remoteClusterPermissionGroups; | ||
|
||
// package private non-final for testing | ||
static Map<TransportVersion, Set<String>> allowedRemoteClusterPermissions = Map.of( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have an optional refactoring suggestion (feel free to reject):
I think it's cleaner and more natural to maintain a map of privilege to version it was introduced, so:
private static Map<String, TransportVersion> privilegeToSupportedVersion = Map.ofEntries(
Map.entry(ClusterPrivilegeResolver.MONITOR_ENRICH.name(), ROLE_REMOTE_CLUSTER_PRIVS),
Map.entry(ClusterPrivilegeResolver.MONITOR_STATS.name(), ROLE_MONITOR_STATS)
);
Then, removeUnsupportedPrivileges
becomes:
...
for (String privilege : privileges) {
assert privilegeToSupportedVersion.containsKey(privilege.toLowerCase(Locale.ROOT));
if (privilegeToSupportedVersion.get(privilege.toLowerCase(Locale.ROOT)).onOrAfter(outboundVersion)) {
privilegesMutated.add(privilege);
}
}
....
I think this more clearly maps to the general concept of privileges being introduced on a new given transport version and simplifies the code (e.g., getAllowedPermissionsPerVersion
can go away or only be used for logging).
Beyond that, privilegeToSupportedVersion
IMO could move to ClusterPrivilegeResolver
with the requirement that whenever anyone adds a new privilege, they need to add it to the map (this would require a bit more thought to work out the details and edge cases, but seems like the right direction to me).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd consider this for later, but not now. This would loose the ability to gate multiple privileges per transport versions, which is/was intentional. I think we will need to apply similar logic to other cluster privileges as well as index privileges, so this will likely be refactored at some point.
Set<String> allowedPermissionsPerVersion = getAllowedPermissionsPerVersion(outboundVersion); | ||
for (RemoteClusterPermissionGroup group : remoteClusterPermissionGroups) { | ||
String[] privileges = group.clusterPrivileges(); | ||
List<String> privilegesMutated = new ArrayList<>(privileges.length); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
WDYT about privilegesToKeep
or supportedPrivileges
? I think that better captures the semantics of this field. It could be that all privileges are supported and kept, in which case they're not really "muted".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will update to outboundPrivileges
to better match copyForOutboundVersion (thanks that did read kinda funny)
+ "] or a pattern over one of the available " | ||
+ "cluster actions"; | ||
logger.debug(errorMessage); | ||
logger.warn(errorMessage); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd make this an info cause it's triggered by client-side errors
x-pack/plugin/build.gradle
Outdated
task.skipTest("esql/60_usage/Basic ESQL usage output (telemetry)", "The telemetry output changed. We dropped a column. That's safe.") | ||
task.skipTest("esql/80_text/reverse text", "The output type changed from TEXT to KEYWORD.") | ||
task.skipTest("esql/80_text/values function", "The output type changed from TEXT to KEYWORD.") | ||
task.replaceValueInLength("cluster", 62, "Test get builtin privileges") // new builtin privileges have been added |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry I've stared at this for a bit but I don't fully get it -- why will that test have the wrong value for num privileges? Is it because the test spec is somehow sourced from an older ES version?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it because the test spec is somehow sourced from an older ES version?
Yup, exactly. The compat tests are sourced from the last minor of the last major (8.x). I also realized that this will break as soon as I back port this code to 8.x, so I will skip it for now. There probably isn't a need to ever re-enable it either, since the number of privileges is not relevant to compatibility.
private static final Logger logger = LogManager.getLogger(RemoteClusterPermissions.class); | ||
private final List<RemoteClusterPermissionGroup> remoteClusterPermissionGroups; | ||
|
||
// package private non-final for testing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd really love to avoid mutable static state... if we ever run unit tests in parallel for instance this'll be rather unpleasant to debug.
Instead of using the static map directly in RemoteClusterPermissions
could we have a wrapper method allowedRemoteClusterPermissions()
, then spy RemoteClusterPermissions
and mock only allowedRemoteClusterPermissions()
according to the test's needs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd really love to avoid mutable static state
For clarity, this is only mutable as package private for tests and this is fairly common practice.
if we ever run unit tests in parallel
We never do that in the same JVM. If we did the problems would run much deeper than this concern.
then spy ... and mock
I am not a fan of any usage of mocks beyond the absolute simplest case. IMO they end up adding unnecessary complexity to the tests making the difficult to read and understand behavior (which is imo 1/2 the point of the tests) and anything but simple mocks used to help breakup the dependency graph to test variants is IMO a test smell. The alternative I would choose here to address the concern (which I do agree as-is is kinda hacky) would be to use a package private setter gated by a setonce or the like, but that feels unnecessary.
|
||
public void testRemoveUnsupportedPrivileges() { | ||
RemoteClusterPermissions remoteClusterPermissions = new RemoteClusterPermissions(); | ||
RemoteClusterPermissionGroup group = new RemoteClusterPermissionGroup(new String[] { "monitor_enrich" }, new String[] { "*" }); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure where best to test this but would be also have a test with duplicate privileges, e.g., instead of only new String[] { "monitor_enrich" }
only test new String[] { "monitor_enrich", "monitor_enrich" }
? I think it's possible to specify roles with dup privileges like that. I don't see any indication in prod code that dups would be a problem but good to cover I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added a random boolean (below) to test that duplicates don't influence the removal.
...va/org/elasticsearch/xpack/core/security/authz/permission/RemoteClusterPermissionsTests.java
Outdated
Show resolved
Hide resolved
} | ||
return (String) API_KEY_MAP_REF.get().get("encoded"); | ||
}) | ||
// Define a bogus API key for another remote cluster |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know this is just copy-paste but I'd remove this and the cluster.remote.wrong_api_key_type.credentials
section too since it distracts from the overall test
assertThat(ObjectPath.evaluate(statsResponseAsMap, "ccs._search.total"), equalTo(4)); | ||
assertThat(ObjectPath.evaluate(statsResponseAsMap, "ccs._search.clusters.my_remote_cluster.total"), equalTo(3)); | ||
int updatedIndexCount = ObjectPath.evaluate(statsResponseAsMap, "ccs.clusters.my_remote_cluster.indices_count"); | ||
assertThat(updatedIndexCount, equalTo(initialIndexCount + 1)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would also add a negative test where a user without monitor_stats
gets empty stats instead
💚 All backports created successfully
Questions ?Please refer to the Backport tool documentation |
…ster privileges (elastic#114964) This commit does the following: * Add a new monitor_stats privilege * Ensure that monitor_stats can be set in the remote_cluster privileges * Give's Kibana the ability to remotely call monitor_stats via RCS 2.0 Since this is the first case where there is more than 1 remote_cluster privilege, the following framework concern has been added: * Ensure that when sending to elder RCS 2.0 clusters that we don't send the new privilege previous only supported all or nothing remote_cluster blocks * Ensure that we when sending API key role descriptors that contains remote_cluster, we don't send the new privileges for RCS 1.0/2.0 if it not new enough * Fix and extend the BWC tests for RCS 1.0 and RCS 2.0 (cherry picked from commit af99654)
…te cluster privileges (#114964) (#116517) * Add a monitor_stats privilege and allow that privilege for remote cluster privileges (#114964) This commit does the following: * Add a new monitor_stats privilege * Ensure that monitor_stats can be set in the remote_cluster privileges * Give's Kibana the ability to remotely call monitor_stats via RCS 2.0 Since this is the first case where there is more than 1 remote_cluster privilege, the following framework concern has been added: * Ensure that when sending to elder RCS 2.0 clusters that we don't send the new privilege previous only supported all or nothing remote_cluster blocks * Ensure that we when sending API key role descriptors that contains remote_cluster, we don't send the new privileges for RCS 1.0/2.0 if it not new enough * Fix and extend the BWC tests for RCS 1.0 and RCS 2.0 (cherry picked from commit af99654) * adjust bwc for 8.x branch
…ster privileges (#114964) This commit does the following: * Add a new monitor_stats privilege * Ensure that monitor_stats can be set in the remote_cluster privileges * Give's Kibana the ability to remotely call monitor_stats via RCS 2.0 Since this is the first case where there is more than 1 remote_cluster privilege, the following framework concern has been added: * Ensure that when sending to elder RCS 2.0 clusters that we don't send the new privilege previous only supported all or nothing remote_cluster blocks * Ensure that we when sending API key role descriptors that contains remote_cluster, we don't send the new privileges for RCS 1.0/2.0 if it not new enough * Fix and extend the BWC tests for RCS 1.0 and RCS 2.0
…ster privileges (elastic#114964) This commit does the following: * Add a new monitor_stats privilege * Ensure that monitor_stats can be set in the remote_cluster privileges * Give's Kibana the ability to remotely call monitor_stats via RCS 2.0 Since this is the first case where there is more than 1 remote_cluster privilege, the following framework concern has been added: * Ensure that when sending to elder RCS 2.0 clusters that we don't send the new privilege previous only supported all or nothing remote_cluster blocks * Ensure that we when sending API key role descriptors that contains remote_cluster, we don't send the new privileges for RCS 1.0/2.0 if it not new enough * Fix and extend the BWC tests for RCS 1.0 and RCS 2.0
…ster privileges (elastic#114964) This commit does the following: * Add a new monitor_stats privilege * Ensure that monitor_stats can be set in the remote_cluster privileges * Give's Kibana the ability to remotely call monitor_stats via RCS 2.0 Since this is the first case where there is more than 1 remote_cluster privilege, the following framework concern has been added: * Ensure that when sending to elder RCS 2.0 clusters that we don't send the new privilege previous only supported all or nothing remote_cluster blocks * Ensure that we when sending API key role descriptors that contains remote_cluster, we don't send the new privileges for RCS 1.0/2.0 if it not new enough * Fix and extend the BWC tests for RCS 1.0 and RCS 2.0
This commit does the following:
Since this is the first case where there is more than 1 remote_cluster privilege,
the following framework concern has been added:
previous only supported all or nothing remote_cluster blocks
we don't send the new privileges for RCS 1.0/2.0 if it not new enough