-
Notifications
You must be signed in to change notification settings - Fork 775
feat: multi SCP composition #8917
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
feat: multi SCP composition #8917
Conversation
✅ Snyk checks have passed. No issues have been found so far.
💻 Catch issues earlier using the plugins for VS Code, JetBrains IDEs, Visual Studio, and Eclipse. |
🔍 Preview links for changed docs |
1ff62b3 to
c442adf
Compare
f5ee58f to
6353c66
Compare
6353c66 to
2c4aa4b
Compare
2c4aa4b to
400b02d
Compare
|
Just a general question/comment that came to my mind, do we want to have any limits on the number of SCPs that can be associated to a cluster ? |
pebrc
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did a first pass, just looking at the code. I have not tested it yet. Will try to find some more time later today.
| sMntAggr := secretMountsAggregator{} | ||
| sSrcAggr := secretSourceAggregator{} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| sMntAggr := secretMountsAggregator{} | |
| sSrcAggr := secretSourceAggregator{} | |
| secretMounts := secretMountsAggregator{} | |
| secretSources := secretSourceAggregator{} |
I find the variable names otherwise a bit hard to follow. secretMountsAggr if you want to keep the difference between the actual sources and the aggregator visibile in the name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed in 3bf7a08
| // Check namespace restrictions; the policy must be in operator namespace or same namespace as the target object | ||
| if policy.Namespace != operatorNamespace && policy.Namespace != obj.GetNamespace() { | ||
| return false, nil | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do the cheapest check first?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
now that I have merged main here (commit 0f5d39b) this should be fixed
| if dst == nil { | ||
| dst = &commonv1.Config{} | ||
| dstCanonicalConfig = settings.NewCanonicalConfig() | ||
| } else { | ||
| dstCanonicalConfig, err = settings.NewCanonicalConfigFrom(dst.DeepCopy().Data) | ||
| if err != nil { | ||
| return nil, err | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| if dst == nil { | |
| dst = &commonv1.Config{} | |
| dstCanonicalConfig = settings.NewCanonicalConfig() | |
| } else { | |
| dstCanonicalConfig, err = settings.NewCanonicalConfigFrom(dst.DeepCopy().Data) | |
| if err != nil { | |
| return nil, err | |
| } | |
| } | |
| var err error | |
| if dst == nil { | |
| return src.DeepCopy(), nil | |
| } | |
| dstCanonicalConfig, err = settings.NewCanonicalConfigFrom(dst.DeepCopy().Data) | |
| if err != nil { | |
| return nil, err | |
| } |
That also allows to get rid of the special casing for the single policy case at least for Kibana.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed in 3bf7a08
| mergedPolicies := 0 | ||
| sSrcAggr := secretSourceAggregator{} | ||
| cfgPolicy := &configPolicy[policyv1alpha1.KibanaConfigPolicySpec]{ | ||
| extractFunc: func(p *policyv1alpha1.StackConfigPolicy) policyv1alpha1.KibanaConfigPolicySpec { | ||
| return p.Spec.Kibana | ||
| }, | ||
| mergeFunc: func(dst *policyv1alpha1.KibanaConfigPolicySpec, src policyv1alpha1.KibanaConfigPolicySpec, srcPolicy *policyv1alpha1.StackConfigPolicy) error { | ||
| if mergedPolicies == 0 { | ||
| // First policy: copy directly without merging/canonicalizing to avoid unnecessary differences | ||
| specCopy := src.DeepCopy() | ||
| // SecureSettings are aggregated by sSrcAggr | ||
| specCopy.SecureSettings = nil | ||
| *dst = *specCopy | ||
| } else { | ||
| var err error | ||
| if dst.Config, err = deepMergeConfig(dst.Config, src.Config); err != nil { | ||
| return err | ||
| } | ||
| } | ||
|
|
||
| sSrcAggr.aggregate(srcPolicy.Spec.Kibana.SecureSettings, srcPolicy) | ||
| mergedPolicies++ | ||
| return nil | ||
| }, | ||
| } | ||
|
|
||
| if err := merge(cfgPolicy, kbn, allPolicies, params.OperatorNamespace); err != nil { | ||
| return cfgPolicy, err | ||
| } | ||
| cfgPolicy.SecretSources = sSrcAggr.namespacedSecretSources | ||
| return cfgPolicy, nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| mergedPolicies := 0 | |
| sSrcAggr := secretSourceAggregator{} | |
| cfgPolicy := &configPolicy[policyv1alpha1.KibanaConfigPolicySpec]{ | |
| extractFunc: func(p *policyv1alpha1.StackConfigPolicy) policyv1alpha1.KibanaConfigPolicySpec { | |
| return p.Spec.Kibana | |
| }, | |
| mergeFunc: func(dst *policyv1alpha1.KibanaConfigPolicySpec, src policyv1alpha1.KibanaConfigPolicySpec, srcPolicy *policyv1alpha1.StackConfigPolicy) error { | |
| if mergedPolicies == 0 { | |
| // First policy: copy directly without merging/canonicalizing to avoid unnecessary differences | |
| specCopy := src.DeepCopy() | |
| // SecureSettings are aggregated by sSrcAggr | |
| specCopy.SecureSettings = nil | |
| *dst = *specCopy | |
| } else { | |
| var err error | |
| if dst.Config, err = deepMergeConfig(dst.Config, src.Config); err != nil { | |
| return err | |
| } | |
| } | |
| sSrcAggr.aggregate(srcPolicy.Spec.Kibana.SecureSettings, srcPolicy) | |
| mergedPolicies++ | |
| return nil | |
| }, | |
| } | |
| if err := merge(cfgPolicy, kbn, allPolicies, params.OperatorNamespace); err != nil { | |
| return cfgPolicy, err | |
| } | |
| cfgPolicy.SecretSources = sSrcAggr.namespacedSecretSources | |
| return cfgPolicy, nil | |
| secretSources := secretSourceAggregator{} | |
| cfgPolicy := &configPolicy[policyv1alpha1.KibanaConfigPolicySpec]{ | |
| extractFunc: func(p *policyv1alpha1.StackConfigPolicy) policyv1alpha1.KibanaConfigPolicySpec { | |
| return p.Spec.Kibana | |
| }, | |
| mergeFunc: func(dst *policyv1alpha1.KibanaConfigPolicySpec, src policyv1alpha1.KibanaConfigPolicySpec, srcPolicy *policyv1alpha1.StackConfigPolicy) error { | |
| var err error | |
| if dst.Config, err = deepMergeConfig(dst.Config, src.Config); err != nil { | |
| return err | |
| } | |
| secretSources.aggregate(srcPolicy.Spec.Kibana.SecureSettings, srcPolicy) | |
| return nil | |
| }, | |
| } | |
| if err := merge(cfgPolicy, kbn, allPolicies, params.OperatorNamespace); err != nil { | |
| return cfgPolicy, err | |
| } | |
| cfgPolicy.SecretSources = secretSources.namespacedSecretSources | |
| return cfgPolicy, nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Special casing not necessary if the merge function optimises for the single policy case
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed in 3bf7a08
| if dst == nil { | ||
| dst = &commonv1.Config{} | ||
| dstCanonicalConfig = settings.NewCanonicalConfig() | ||
| } else { | ||
| dstCanonicalConfig, err = settings.NewCanonicalConfigFrom(dst.DeepCopy().Data) | ||
| if err != nil { | ||
| return nil, err | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| if dst == nil { | |
| dst = &commonv1.Config{} | |
| dstCanonicalConfig = settings.NewCanonicalConfig() | |
| } else { | |
| dstCanonicalConfig, err = settings.NewCanonicalConfigFrom(dst.DeepCopy().Data) | |
| if err != nil { | |
| return nil, err | |
| } | |
| } | |
| if dst == nil { | |
| return src.DeepCopy(), nil | |
| } | |
| dstCanonicalConfig, err = settings.NewCanonicalConfigFrom(dst.DeepCopy().Data) | |
| if err != nil { | |
| return nil, err | |
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed in 3bf7a08
| specCopy.SecureSettings = nil | ||
| // SecretMounts are aggregated by sMntAggr | ||
| specCopy.SecretMounts = nil | ||
| *dst = *specCopy |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This special casing can be removed if the merge function handles preserves the original structure when the destination is empty. I think the code becomes a little bit easier to reason about without the special cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed in 3bf7a08
| for _, secretMount := range src { | ||
| if existingPolicy, exists := s.policiesBySecretName[secretMount.SecretName]; exists { | ||
| existingPolicyNsn := k8s.ExtractNamespacedName(existingPolicy) | ||
| err := fmt.Errorf("%w: secret with name %q is defined in policies %q, %q", errMergeConflict, secretMount.SecretName, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| err := fmt.Errorf("%w: secret with name %q is defined in policies %q, %q", errMergeConflict, secretMount.SecretName, | |
| err := fmt.Errorf("%w: secret with name %q is defined in policy %q, %q", errMergeConflict, secretMount.SecretName, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed in e3639c5
| } | ||
| if existingPolicy, exists := s.policiesByMountPath[secretMount.MountPath]; exists { | ||
| existingPolicyNsn := k8s.ExtractNamespacedName(existingPolicy) | ||
| err := fmt.Errorf("%w: secret mount path %q is defined in policies %q, %q", errMergeConflict, secretMount.MountPath, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| err := fmt.Errorf("%w: secret mount path %q is defined in policies %q, %q", errMergeConflict, secretMount.MountPath, | |
| err := fmt.Errorf("%w: secret mount path %q is defined in policy %q, %q", errMergeConflict, secretMount.MountPath, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed in e3639c5
| // aggregate merges source secure settings into the aggregator, organizing them by the source | ||
| // policy's namespace. Secret sources are sorted deterministically when multiple policies have | ||
| // been applied to ensure consistent results. | ||
| func (s *secretSourceAggregator) aggregate(src []commonv1.SecretSource, srcPolicy *policyv1alpha1.StackConfigPolicy) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find it confusing that both aggregators use the same method name but do differnent things. Also one has a more functional style returning the intermediate results:
func (s *secretMountsAggregator) aggregate(
dst []policyv1alpha1.SecretMount,
src []policyv1alpha1.SecretMount,
srcPolicy *policyv1alpha1.StackConfigPolicy,
) ([]policyv1alpha1.SecretMount, error)while the other is more imperative:
func (s *secretSourceAggregator) aggregate(
src []commonv1.SecretSource,
srcPolicy *policyv1alpha1.StackConfigPolicy,
)Can we maybe use different method names?
secretMountAggr.mergeInto(dst, src, policy) - emphasizes it modifies dst
secretSourceAggr.addFrom(src, policy) - emphasizes it adds to internal state
or make them both imperative/stateful:
secretMountAggr.add(src, policy) error
secretSourceAggr.add(src, policy)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok in 3bf7a08 both of these now should be under the same functional style
| // aggregate merges source secret mounts into destination, checking for conflicts on secret names | ||
| // and mount paths. Returns the merged slice of secret mounts sorted deterministically when | ||
| // multiple policies have been applied, or an error if conflicts are detected. | ||
| func (s *secretMountsAggregator) aggregate(dst []policyv1alpha1.SecretMount, src []policyv1alpha1.SecretMount, srcPolicy *policyv1alpha1.StackConfigPolicy) ([]policyv1alpha1.SecretMount, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe break the long parameter lists into multiple lines (or my monitor is too small 😄 )
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
broken in multiple lines in 3bf7a08
| // This uses labels (reconciler.SoftOwnerKindLabel, reconciler.SoftOwnerNameLabel, reconciler.SoftOwnerNamespaceLabel) | ||
| // to store the ownership relationship, allowing the policy to manage | ||
| // the Secret's lifecycle without using Kubernetes OwnerReferences. | ||
| func setSingleSoftOwner(secret *corev1.Secret, policy policyv1alpha1.StackConfigPolicy) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only used in test code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the plan was to use this one also instead of the call to filesettings.SetSoftOwner but this one slipped through the cracks. Either way, I am gonna revisit this approach based on your comment below
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed in 025c76f
| } | ||
|
|
||
| // Check for multi-policy ownership (annotation-based) | ||
| if ownerRefsBytes, exists := obj.GetAnnotations()[SoftOwnerRefsAnnotation]; exists { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Its a bit awkward that some of the multi-ref functionality is here and some is in ownership.go?
Maybe move all the code here and make it generic?
// In pkg/controller/common/reconciler/secret.go
// SetMultipleSoftOwners adds multiple soft owner references to an object
func SetMultipleSoftOwners(obj metav1.Object, ownerKind string, owners []types.NamespacedName) error
// AddSoftOwner adds a soft owner to an object (handles single->multi transition)
func AddSoftOwner(obj metav1.Object, owner SoftOwnerRef) error
// RemoveSoftOwner removes a soft owner from an object
func RemoveSoftOwner(obj metav1.Object, owner types.NamespacedName) (remainingCount int, err error)
// IsSoftOwnedBy checks if an object is soft-owned by the given owner
func IsSoftOwnedBy(obj metav1.Object, ownerKind string, owner types.NamespacedName) (bool, error)
// SoftOwnerRefs already exists and reads them
func SoftOwnerRefs(obj metav1.Object) ([]SoftOwnerRef, error)and then call it from the stack config controller, maybe with small wrappers to make the transformation from SCP to NamespacedName where necessary:
func setMultipleSoftOwners(secret *corev1.Secret, policies []policyv1alpha1.StackConfigPolicy) error {
owners := make([]types.NamespacedName, len(policies))
for i, p := range policies {
owners[i] = k8s.ExtractNamespacedName(&p)
}
return reconciler.SetMultipleSoftOwners(secret, policyv1alpha1.Kind, owners)
}There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
relocated in 025c76f
| owned, err := isPolicySoftOwner(&s, softOwner) | ||
| if err != nil { | ||
| return err | ||
| } | ||
| if !owned { | ||
| continue | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| owned, err := isPolicySoftOwner(&s, softOwner) | |
| if err != nil { | |
| return err | |
| } | |
| if !owned { | |
| continue | |
| } | |
| if owned, err := isPolicySoftOwner(&secret, softOwner); err != nil { | |
| return err | |
| } else if !owned { | |
| continue | |
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed in 025c76f
| {&dst.SnapshotRepositories, src.SnapshotRepositories, mergeConfig}, | ||
| {&dst.SnapshotLifecyclePolicies, src.SnapshotLifecyclePolicies, deepMergeConfig}, | ||
| {&dst.SecurityRoleMappings, src.SecurityRoleMappings, deepMergeConfig}, | ||
| {&dst.IndexLifecyclePolicies, src.IndexLifecyclePolicies, deepMergeConfig}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we have a problem with null values here. I assume when multiple policies apply and we canonicalize the configs (this happened when I went from 2 to 3 SCPs and depends on the order of merging i.e. the weights):
java.lang.IllegalStateException: Error processing state change request for file_settings, errors: org.elasticsearch.xcontent.XContentParseException: [1:1112] [reserved_state_chunk] failed to parse field [state]
Caused by: org.elasticsearch.xcontent.XContentParseException: [1:1112] [state] failed to parse field [ilm]
Caused by: org.elasticsearch.xcontent.XContentParseException: [1:131] [lifecycle_policy] failed to parse field [phases]
Caused by: org.elasticsearch.xcontent.XContentParseException: [1:131] [phases] failed to parse field [delete]
Caused by: org.elasticsearch.xcontent.XContentParseException: [1:131] [phase] failed to parse field [actions]
Caused by: org.elasticsearch.xcontent.XContentParseException: [1:131] [actions] failed to parse field [delete]
Caused by: org.elasticsearch.xcontent.XContentParseException: [1:131] [delete] Expected START_OBJECT but was: VALUE_NULL
Using the example from our docs https://www.elastic.co/docs/deploy-manage/deploy/cloud-on-k8s/elastic-stack-configuration-policies
"ilm": {
"ilm_test": {
"phases": {
"delete": {
"actions": {
"delete": {}
},
"min_age": "30d"
},
"warm": {
"actions": {
"forcemerge": {
"max_num_segments": 1
}
},
"min_age": "10d"
}
}
}
},becomes
"ilm": {
"ilm_test": {
"phases": {
"delete": {
"actions": {
"delete": null
},
"min_age": "30d"
},
"warm": {
"actions": {
"forcemerge": {
"max_num_segments": 1
}
},
"min_age": "10d"
}
}
}
},There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we need a truly shallow merge for everything that is not elasticsearch.yml and clusterSettings.
Something like:
func mergeConfig(dst *commonv1.Config, src *commonv1.Config) (*commonv1.Config, error) {
if src == nil {
return dst, nil
}
if dst == nil {
// No destination, just copy source
return src.DeepCopy(), nil
}
if dst.Data == nil {
dst.Data = make(map[string]interface{})
}
// Shallow merge: copy top-level keys from src to dst
// This preserves empty maps since we're not going through canonicalization
if src.Data != nil {
maps.Copy(dst.Data, src.Data)
}
return dst, nil
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for catching that @pebrc , I will run further tests and try to pinpoint if this originates from go-ucfg or yaml marshalling/unmarshalling. Now about using only shallow merges how would that look like if we use dot-separated keys the other configs aside elasticsearch.yml and clusterSettings? Also sorry if I missed it but what prevents the same thing from happening for elasticsearch.yml and clusterSettings?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I investigated this further - the issue is in the go-ucfg library during Unpack, where empty maps are replaced with nil. I considered replacing go-ucfg entirely for canonicalization and deep merging (I had success with custom code), but didn't push those changes to avoid expanding this PR's scope. This might be worth a follow-up PR.
In 799744b, I changed the merging behavior to support canonicalization and deep merging only for Config and ClusterSettings fields. Everything else uses shallow merging. I also added support for merging the deprecated SecureSettings at the SCP spec root, which I missed when testing your SCPs @pebrc.
dfedf81 to
ad6e3b1
Compare
|
buildkite test this -f p=gke,t=TestStackConfigPolicy* |
|
buildkite test this -f p=gke,t=TestStackConfigPolicy* |
Overview
This PR implements support for multiple StackConfigPolicies (SCPs) targeting the same Elasticsearch cluster or Kibana instance, using a weight-based priority system for deterministic policy composition.
Key Features
Weight-Based Priority System
0Conflict Detection
Conflicts are detected across multiple dimensions and will prevent policy application:
SecretMountwith sameSecretNameSecretMountwith sameMountPathImportant: Even if two policies with the same weight have non-overlapping resources, they still conflict because the weight collision makes the merge order ambiguous.
Configuration Merging Behaviour
Different merge strategies are applied based on the configuration type:
Deep Merge (recursive merging):
ClusterSettingsConfigSnapshotLifecyclePoliciesSecurityRoleMappingsIndexLifecyclePoliciesIngestPipelinesIndexTemplates.ComposableIndexTemplatesIndexTemplates.ComponentTemplatesTop-Level Key Replacement (entire keys replaced):
SnapshotRepositories- each repository configuration is treated atomicallyUnion Merge (with conflict detection):
SecretMounts- conflicts on duplicateSecretNameOR duplicateMountPathSecureSettings- merges bySecretName+Key, lower weight wins (no conflicts)Multi-Soft-Owner Secret Management
File Settings and Policy Config Secrets:
eck.k8s.elastic.co/owner-refsannotation with JSON-encoded map of owner namespaced namesSecret Sources:
This prevents secret leakage while enabling proper cleanup when policies are deleted.
Related Issues