Skip to content

Conversation

jakelandis
Copy link
Contributor

@jakelandis jakelandis commented Oct 16, 2024

This commit does the following:

  • Add a new monitor_stats privilege
  • Ensure that monitor_stats can be set in the remote_cluster privileges
  • Give's Kibana the ability to remotely call monitor_stats via RCS 2.0

Since this is the first case where there is more than 1 remote_cluster privilege,
the following framework concern has been added:

  • Ensure that when sending to elder RCS 2.0 clusters that we don't send the new privilege
    previous only supported all or nothing remote_cluster blocks
  • Ensure that we when sending API key role descriptors that contains remote_cluster,
    we don't send the new privileges for RCS 1.0/2.0 if it not new enough
  • Fix and extend the BWC tests for RCS 1.0 and RCS 2.0

@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-security (Team:Security)

@n1v0lg
Copy link
Contributor

n1v0lg commented Nov 5, 2024

Re the CI failure: we also need to update the failing privilege registry test in the serverless repo, by adding the new priv there. This is currently unnecessarily a two-step process. I have a v old PR open to make it a one step process instead (#110178).

Copy link
Contributor

@n1v0lg n1v0lg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, I like the BWC-handling this introduces. I have a high level question before proceeding: for RCS 1.0, the new privilege will be included in the cluster section, not the remote_cluster section so we need to handle BWC there too, otherwise CCS becomes unusable with API keys with that cluster privilege (see my inline comment here). Let me know if I misunderstood s.th. though!

roleName,
currentFieldName
);
logger.warn(message);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

only concern is that it's a warn log that's triggered by client-side errors -- if we ever monitor for WARN log rates, a user with a loop and a misconception of our APIs can get noisy, without any recourse for us. I'd go with INFO here tbh

* and {@link Role}. This model is not intended to be sent to a remote cluster, but can be (wire) serialized within a single cluster
* as well as the Xcontent serialization for the REST API and persistence of the role in the security index. The privileges modeled here
* will be converted to the appropriate cluster privileges when sent to a remote cluster.
* and {@link Role}. This model is not intended to support RCS 2.0 where these remote cluster permissions are converted to local cluster
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this bit "This model is not intended to support RCS 2.0" is quite confusing -- the sole reason remote_cluster permissions exist is to support RCS 2.0 (for RCS 1.0 it's just cluster permissions). I think the Javadoc before this change was clearer

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it's just a case of an extra "not" though...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah.. i must have written that pretty fast. will update.

// Update remote cluster settings on QC.
setupQueryClusterRemoteClusters(useProxyMode);
// Ensure remote cluster is connected
if (isRCS2()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't tested this but you should be able to replace this whole block with configureRemoteCluster(REMOTE_CLUSTER_ALIAS, fulfillingCluster, isRCS2(), randomBoolean(), false); super.setUp();

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Beyond that, I'd actually leave this method abstract and just let the concrete implementations invoke the method with the right parameters instead of using isRCS2() here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks, that cleaned it up a bit, but I needed to keep isRCS2() to conditionally add or not add the mirror'ed role for RCS 1.0 to the fulfilling cluster.

@jakelandis
Copy link
Contributor Author

jakelandis commented Nov 6, 2024

Re the CI failure: we also need to update the failing privilege registry test in the serverless repo, by adding the new priv there.

I had to temporarily mute the test to avoid a deadlock between the repo's. I linked the 2 PR': review requested for issue 3103.

Copy link
Contributor

@n1v0lg n1v0lg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - few suggestions and smaller test fixes but nothing that requires another review 👍

As follow ups, we'll also want to add the new privilege to https://github.com/elastic/elasticsearch-specification/blob/main/specification/security/_types/Privileges.ts#L43 and mark as non-serverless-supported.

}
} else if (authentication.getEffectiveSubject()
.getTransportVersion()
.onOrAfter(TRANSPORT_VERSION_ADVANCED_REMOTE_CLUSTER_SECURITY)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think TRANSPORT_VERSION_ADVANCED_REMOTE_CLUSTER_SECURITY should be ROLE_REMOTE_CLUSTER_PRIVS -- otherwise, we'll include versions for which there is no way that the authentication subject would ever have a remote_cluster field to begin with.

Also, I think a comment here explaining that maybeRemoveRemoteClusterPrivilegesFromRoleDescriptors handles more specific transport version checks would help here I think. It breaks with the pattern of

if (authentication.onOrAfter(version) && streamVersions.before(version)) { do something } and looks a bit like a copy-paste bug at first glance

Copy link
Contributor Author

@jakelandis jakelandis Nov 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think TRANSPORT_VERSION_ADVANCED_REMOTE_CLUSTER_SECURITY should be ROLE_REMOTE_CLUSTER_PRIVS -- otherwise, we'll include versions for which there is no way that the authentication subject would ever have a remote_cluster field to begin with.

I'll change it, but don't think it matters. IIUC this specific check is really on needed for being able to know, in the local cluster, for any persisted authentication (e.g. watcher, ilm) done with an API Key, if we are working with a byte[] representation or the older format for persisted Authentication objects. So I believe that any value after VERSION_API_KEY_ROLES_AS_BYTES would work fine. I don't have specific test for this, but since we convert byte[] to Map, this check should only be concerned with VERSION_API_KEY_ROLES_AS_BYTES in this cluster. The other check && streamVersion.onOrAfter(ROLE_REMOTE_CLUSTER_PRIVS)) { will protect against any problems with the remote understanding remote_cluster.

For example, if the persisted authentication had the transport version before ROLE_REMOTE_CLUSTER_PRIVS (and sending to a new cluster is new enough), then execution will fall into the else condition but the persisted authentication object would never have a remote_cluster entry and effectively do nothing (OK it does a bit of necessary work). It won't blow up since we are just manipulating json/byte[] and maps. I agree that it confusing as-is and using ROLE_REMOTE_CLUSTER_PRIVS probably helps a bit with readability (and avoid some uncessary work). Long term, we should look for a way to stop explicitly rewriting these things and make it a first class attribute of model that understands authentication and transport version concerns. Fundamentally all of the checks in this method does the same thing...strip out things that streams of specific versions won't understand and IMO it plausible to have a single stragegy (for both RCS and node-to-node in same cluster).

(also updated comments to hopefull be bit more clear)

private final List<RemoteClusterPermissionGroup> remoteClusterPermissionGroups;

// package private non-final for testing
static Map<TransportVersion, Set<String>> allowedRemoteClusterPermissions = Map.of(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have an optional refactoring suggestion (feel free to reject):

I think it's cleaner and more natural to maintain a map of privilege to version it was introduced, so:

private static Map<String, TransportVersion> privilegeToSupportedVersion = Map.ofEntries(
    Map.entry(ClusterPrivilegeResolver.MONITOR_ENRICH.name(), ROLE_REMOTE_CLUSTER_PRIVS),
    Map.entry(ClusterPrivilegeResolver.MONITOR_STATS.name(), ROLE_MONITOR_STATS)
);

Then, removeUnsupportedPrivileges becomes:

...
for (String privilege : privileges) {
    assert privilegeToSupportedVersion.containsKey(privilege.toLowerCase(Locale.ROOT));
    if (privilegeToSupportedVersion.get(privilege.toLowerCase(Locale.ROOT)).onOrAfter(outboundVersion)) {
        privilegesMutated.add(privilege);
    }
}
....

I think this more clearly maps to the general concept of privileges being introduced on a new given transport version and simplifies the code (e.g., getAllowedPermissionsPerVersion can go away or only be used for logging).

Beyond that, privilegeToSupportedVersion IMO could move to ClusterPrivilegeResolver with the requirement that whenever anyone adds a new privilege, they need to add it to the map (this would require a bit more thought to work out the details and edge cases, but seems like the right direction to me).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd consider this for later, but not now. This would loose the ability to gate multiple privileges per transport versions, which is/was intentional. I think we will need to apply similar logic to other cluster privileges as well as index privileges, so this will likely be refactored at some point.

Set<String> allowedPermissionsPerVersion = getAllowedPermissionsPerVersion(outboundVersion);
for (RemoteClusterPermissionGroup group : remoteClusterPermissionGroups) {
String[] privileges = group.clusterPrivileges();
List<String> privilegesMutated = new ArrayList<>(privileges.length);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WDYT about privilegesToKeep or supportedPrivileges? I think that better captures the semantics of this field. It could be that all privileges are supported and kept, in which case they're not really "muted".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will update to outboundPrivileges to better match copyForOutboundVersion (thanks that did read kinda funny)

+ "] or a pattern over one of the available "
+ "cluster actions";
logger.debug(errorMessage);
logger.warn(errorMessage);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd make this an info cause it's triggered by client-side errors

task.skipTest("esql/60_usage/Basic ESQL usage output (telemetry)", "The telemetry output changed. We dropped a column. That's safe.")
task.skipTest("esql/80_text/reverse text", "The output type changed from TEXT to KEYWORD.")
task.skipTest("esql/80_text/values function", "The output type changed from TEXT to KEYWORD.")
task.replaceValueInLength("cluster", 62, "Test get builtin privileges") // new builtin privileges have been added
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I've stared at this for a bit but I don't fully get it -- why will that test have the wrong value for num privileges? Is it because the test spec is somehow sourced from an older ES version?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it because the test spec is somehow sourced from an older ES version?

Yup, exactly. The compat tests are sourced from the last minor of the last major (8.x). I also realized that this will break as soon as I back port this code to 8.x, so I will skip it for now. There probably isn't a need to ever re-enable it either, since the number of privileges is not relevant to compatibility.

private static final Logger logger = LogManager.getLogger(RemoteClusterPermissions.class);
private final List<RemoteClusterPermissionGroup> remoteClusterPermissionGroups;

// package private non-final for testing
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd really love to avoid mutable static state... if we ever run unit tests in parallel for instance this'll be rather unpleasant to debug.

Instead of using the static map directly in RemoteClusterPermissions could we have a wrapper method allowedRemoteClusterPermissions(), then spy RemoteClusterPermissions and mock only allowedRemoteClusterPermissions() according to the test's needs?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd really love to avoid mutable static state

For clarity, this is only mutable as package private for tests and this is fairly common practice.

if we ever run unit tests in parallel

We never do that in the same JVM. If we did the problems would run much deeper than this concern.

then spy ... and mock

I am not a fan of any usage of mocks beyond the absolute simplest case. IMO they end up adding unnecessary complexity to the tests making the difficult to read and understand behavior (which is imo 1/2 the point of the tests) and anything but simple mocks used to help breakup the dependency graph to test variants is IMO a test smell. The alternative I would choose here to address the concern (which I do agree as-is is kinda hacky) would be to use a package private setter gated by a setonce or the like, but that feels unnecessary.


public void testRemoveUnsupportedPrivileges() {
RemoteClusterPermissions remoteClusterPermissions = new RemoteClusterPermissions();
RemoteClusterPermissionGroup group = new RemoteClusterPermissionGroup(new String[] { "monitor_enrich" }, new String[] { "*" });
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure where best to test this but would be also have a test with duplicate privileges, e.g., instead of only new String[] { "monitor_enrich" } only test new String[] { "monitor_enrich", "monitor_enrich" }? I think it's possible to specify roles with dup privileges like that. I don't see any indication in prod code that dups would be a problem but good to cover I think.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added a random boolean (below) to test that duplicates don't influence the removal.

}
return (String) API_KEY_MAP_REF.get().get("encoded");
})
// Define a bogus API key for another remote cluster
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this is just copy-paste but I'd remove this and the cluster.remote.wrong_api_key_type.credentials section too since it distracts from the overall test

assertThat(ObjectPath.evaluate(statsResponseAsMap, "ccs._search.total"), equalTo(4));
assertThat(ObjectPath.evaluate(statsResponseAsMap, "ccs._search.clusters.my_remote_cluster.total"), equalTo(3));
int updatedIndexCount = ObjectPath.evaluate(statsResponseAsMap, "ccs.clusters.my_remote_cluster.indices_count");
assertThat(updatedIndexCount, equalTo(initialIndexCount + 1));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would also add a negative test where a user without monitor_stats gets empty stats instead

@elasticsearchmachine elasticsearchmachine added the serverless-linked Added by automation, don't add manually label Nov 7, 2024
@jakelandis jakelandis merged commit af99654 into elastic:main Nov 8, 2024
21 checks passed
@jakelandis
Copy link
Contributor Author

💚 All backports created successfully

Status Branch Result
8.x

Questions ?

Please refer to the Backport tool documentation

jakelandis added a commit to jakelandis/elasticsearch that referenced this pull request Nov 8, 2024
…ster privileges (elastic#114964)

This commit does the following:
   * Add a new monitor_stats privilege
   * Ensure that monitor_stats can be set in the remote_cluster privileges
   * Give's Kibana the ability to remotely call monitor_stats via RCS 2.0

Since this is the first case where there is more than 1 remote_cluster privilege,
the following framework concern has been added:
    * Ensure that when sending to elder RCS 2.0 clusters that we don't send the new privilege
        previous only supported all or nothing remote_cluster blocks
    * Ensure that we when sending API key role descriptors that contains remote_cluster,
       we don't send the new privileges for RCS 1.0/2.0 if it not new enough
    * Fix and extend the BWC tests for RCS 1.0 and RCS 2.0

(cherry picked from commit af99654)
elasticsearchmachine pushed a commit that referenced this pull request Nov 8, 2024
…te cluster privileges (#114964) (#116517)

* Add a monitor_stats privilege and allow that privilege for remote cluster privileges (#114964)

This commit does the following:
   * Add a new monitor_stats privilege
   * Ensure that monitor_stats can be set in the remote_cluster privileges
   * Give's Kibana the ability to remotely call monitor_stats via RCS 2.0

Since this is the first case where there is more than 1 remote_cluster privilege,
the following framework concern has been added:
    * Ensure that when sending to elder RCS 2.0 clusters that we don't send the new privilege
        previous only supported all or nothing remote_cluster blocks
    * Ensure that we when sending API key role descriptors that contains remote_cluster,
       we don't send the new privileges for RCS 1.0/2.0 if it not new enough
    * Fix and extend the BWC tests for RCS 1.0 and RCS 2.0

(cherry picked from commit af99654)

* adjust bwc for 8.x branch
jozala pushed a commit that referenced this pull request Nov 13, 2024
…ster privileges (#114964)

This commit does the following:
   * Add a new monitor_stats privilege
   * Ensure that monitor_stats can be set in the remote_cluster privileges
   * Give's Kibana the ability to remotely call monitor_stats via RCS 2.0

Since this is the first case where there is more than 1 remote_cluster privilege,
the following framework concern has been added:
    * Ensure that when sending to elder RCS 2.0 clusters that we don't send the new privilege
        previous only supported all or nothing remote_cluster blocks
    * Ensure that we when sending API key role descriptors that contains remote_cluster,
       we don't send the new privileges for RCS 1.0/2.0 if it not new enough
    * Fix and extend the BWC tests for RCS 1.0 and RCS 2.0
afoucret pushed a commit to afoucret/elasticsearch that referenced this pull request Nov 14, 2024
…ster privileges (elastic#114964)

This commit does the following:
   * Add a new monitor_stats privilege
   * Ensure that monitor_stats can be set in the remote_cluster privileges
   * Give's Kibana the ability to remotely call monitor_stats via RCS 2.0

Since this is the first case where there is more than 1 remote_cluster privilege,
the following framework concern has been added:
    * Ensure that when sending to elder RCS 2.0 clusters that we don't send the new privilege
        previous only supported all or nothing remote_cluster blocks
    * Ensure that we when sending API key role descriptors that contains remote_cluster,
       we don't send the new privileges for RCS 1.0/2.0 if it not new enough
    * Fix and extend the BWC tests for RCS 1.0 and RCS 2.0
alexey-ivanov-es pushed a commit to alexey-ivanov-es/elasticsearch that referenced this pull request Nov 28, 2024
…ster privileges (elastic#114964)

This commit does the following:
   * Add a new monitor_stats privilege
   * Ensure that monitor_stats can be set in the remote_cluster privileges
   * Give's Kibana the ability to remotely call monitor_stats via RCS 2.0

Since this is the first case where there is more than 1 remote_cluster privilege,
the following framework concern has been added:
    * Ensure that when sending to elder RCS 2.0 clusters that we don't send the new privilege
        previous only supported all or nothing remote_cluster blocks
    * Ensure that we when sending API key role descriptors that contains remote_cluster,
       we don't send the new privileges for RCS 1.0/2.0 if it not new enough
    * Fix and extend the BWC tests for RCS 1.0 and RCS 2.0
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

>enhancement :Security/Authorization Roles, Privileges, DLS/FLS, RBAC/ABAC serverless-linked Added by automation, don't add manually Team:Security Meta label for security team v8.17.0 v9.0.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants