Task: Implement 'Service Resources Catalog' for existing Service Providers #314

@Diaphteiros

Understand the Task

Description

In v2, a user creates an MCPv2 and additional 'service resources' (e.g. Landscaper) next to it. The MCPv2 may only be deleted after all services have been removed, because the cluster itself is managed via the MCPv2 resource. However, since in v1 users only had to delete the MCP resource and all services were removed automatically, it is likely that users will try to delete just the MCPv2 as well, without deleting the service resources beforehand. This will lead to one of the following situations:

  • the MCPv2 is stuck in deletion, because the service has a finalizer on the MCPv2 resource which is not removed because the service resource itself is not in deletion
  • the service does not have a finalizer on the MCPv2 resource, which means that the cluster will be deleted without the service having been removed, leaking at least an orphaned service resource and potentially even more, depending on the service

Neither of these is desirable. Ideally, the MCPv2 resource would trigger the deletion of all related service resources when it gets its deletion timestamp and then wait until they are removed. Due to our extensibility concept, however, the MCP controller does not know which service resources could exist, so there needs to be some kind of 'service resources catalog' that allows it to look up this information.

After some experiments and discussions, we decided on the following approach:

  • Each service provider is required to expose its own service resources in the status of its ServiceProvider resource.
    • For example, the ServiceProvider Landscaper should, during initialization, fetch the ServiceProvider resource with the name it got via the --provider-name argument from the platform cluster and add an entry with group landscaper.services.openmcp.cloud, version v1alpha2, and kind Landscaper to its status.resources.
    • This is a list, but it will usually just contain a single entry, unless the same service provider handles multiple services (= service resources).
  • Also, when a service provider reconciles its service resource, it must ensure that the corresponding MCPv2 resource has a finalizer in the form of services.openmcp.cloud/<name>, where <name> is the name of the ServiceProvider resource (which the controller got via its --provider-name argument).
    • Example: If the Landscaper ServiceProvider resource is named laas, then each MCPv2 with a Landscaper resource next to it must have a finalizer services.openmcp.cloud/laas.
  • During the deletion of an MCPv2, the MCP controller checks all finalizers with the services.openmcp.cloud/ prefix, fetches the corresponding ServiceProvider resources from the platform cluster, checks their exposed resources and triggers the deletion of all of these resources with the same name and namespace as the MCPv2 that is being deleted.
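The service provider side of this contract could be sketched roughly as follows. This is a minimal, dependency-free illustration and not the actual openMCP API: the type GroupVersionKind and the helpers ServiceFinalizer, EnsureFinalizer and EnsureResource are hypothetical names, and a real controller would apply the results to the ServiceProvider status and the MCPv2 metadata through its Kubernetes client.

```go
package main

import "fmt"

// GroupVersionKind is a hypothetical stand-in for one entry of the
// ServiceProvider's status.resources list.
type GroupVersionKind struct {
	Group   string
	Version string
	Kind    string
}

// finalizerPrefix is the prefix named in the issue text.
const finalizerPrefix = "services.openmcp.cloud/"

// ServiceFinalizer builds the finalizer a service provider must put on the
// MCPv2, based on the name it received via --provider-name.
func ServiceFinalizer(providerName string) string {
	return finalizerPrefix + providerName
}

// EnsureFinalizer returns the finalizer list with f added if it was missing,
// plus a flag telling the caller whether an update is needed.
func EnsureFinalizer(finalizers []string, f string) ([]string, bool) {
	for _, existing := range finalizers {
		if existing == f {
			return finalizers, false
		}
	}
	return append(finalizers, f), true
}

// EnsureResource adds gvk to the exposed resources unless it is already listed.
func EnsureResource(resources []GroupVersionKind, gvk GroupVersionKind) []GroupVersionKind {
	for _, r := range resources {
		if r == gvk {
			return resources
		}
	}
	return append(resources, gvk)
}

func main() {
	providerName := "laas" // value of --provider-name in the example above

	// During initialization: expose the service resource in status.resources.
	resources := EnsureResource(nil, GroupVersionKind{
		Group:   "landscaper.services.openmcp.cloud",
		Version: "v1alpha2",
		Kind:    "Landscaper",
	})

	// During reconciliation of a service resource: ensure the MCPv2 finalizer.
	finalizers, changed := EnsureFinalizer(nil, ServiceFinalizer(providerName))

	fmt.Println(resources, finalizers, changed)
}
```

Both helpers are idempotent, so calling them on every reconciliation should not cause unnecessary updates.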

While this adds some unintuitive logic to the service provider contract, it allows users to simply delete the MCPv2 resource without having to worry about its services.
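For illustration, the lookup that the MCP controller performs during deletion might look roughly like the sketch below. It is not the existing PlatformService implementation: ResourceType, DeletionTarget, ProviderNames and DeletionTargets are hypothetical names, the catalog map stands in for fetching the ServiceProvider resources from the platform cluster, and the MCPv2 name and namespace used in main are made up.

```go
package main

import (
	"fmt"
	"strings"
)

// ResourceType is a hypothetical stand-in for one entry of a
// ServiceProvider's status.resources list.
type ResourceType struct {
	Group, Version, Kind string
}

// DeletionTarget identifies one service resource whose deletion should be triggered.
type DeletionTarget struct {
	ResourceType
	Namespace, Name string
}

const finalizerPrefix = "services.openmcp.cloud/"

// ProviderNames extracts the ServiceProvider names from all finalizers that
// carry the services.openmcp.cloud/ prefix and ignores every other finalizer.
func ProviderNames(finalizers []string) []string {
	var names []string
	for _, f := range finalizers {
		if !strings.HasPrefix(f, finalizerPrefix) {
			continue
		}
		if name := strings.TrimPrefix(f, finalizerPrefix); name != "" {
			names = append(names, name)
		}
	}
	return names
}

// DeletionTargets resolves the finalizers of an MCPv2 to the service resources
// that share its name and namespace. The catalog maps a ServiceProvider name
// to the resources exposed in its status.
func DeletionTargets(finalizers []string, catalog map[string][]ResourceType, namespace, name string) []DeletionTarget {
	var targets []DeletionTarget
	for _, provider := range ProviderNames(finalizers) {
		for _, rt := range catalog[provider] {
			targets = append(targets, DeletionTarget{ResourceType: rt, Namespace: namespace, Name: name})
		}
	}
	return targets
}

func main() {
	catalog := map[string][]ResourceType{
		"laas": {{Group: "landscaper.services.openmcp.cloud", Version: "v1alpha2", Kind: "Landscaper"}},
	}
	finalizers := []string{"services.openmcp.cloud/laas", "example.com/unrelated"}

	for _, t := range DeletionTargets(finalizers, catalog, "project-a", "my-mcp") {
		fmt.Printf("delete %s/%s %s in %s/%s\n", t.Group, t.Version, t.Kind, t.Namespace, t.Name)
	}
}
```

A finalizer whose ServiceProvider is missing from the catalog yields no targets in this sketch; how the real controller should react in that case is outside the scope of the example.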

Problem

This is neither documented (I thought I had documented it, but I couldn't find it), nor do any of our existing service providers implement this logic.

The PlatformService MCP (which reconciles the MCPv2 resources) already implements its part of this contract (resource lookup via ServiceProvider resource and triggering their deletion).

Done Criteria

  • Document this; it is a crucial part of the service provider contract.
  • Implement for all existing service providers
    • Landscaper
    • Crossplane
  • Stretch: Think about whether we want to enforce/validate this behavior in service providers and how.
    • Currently, a service provider will work just fine without having this logic implemented; it will only cause problems during deletion. Maybe it would be a good idea to prevent a service from working if this contract isn't met?
      • An idea would be to not grant AccessRequests for the MCP cluster if the finalizer is not there and the service resources are not exposed in the ServiceProvider's status, but this is not trivial to do.
  • Present in review.
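As input for the stretch goal, a contract check could be as small as the following sketch. Everything here is an assumption for discussion: ResourceType and ContractViolations are hypothetical names, and where such a check would be wired in (for example before granting AccessRequests) is exactly the open question raised above.

```go
package main

import "fmt"

// ResourceType is a hypothetical stand-in for one entry of a
// ServiceProvider's status.resources list.
type ResourceType struct {
	Group, Version, Kind string
}

// ContractViolations reports which parts of the contract are not met for one
// service resource: the finalizer on the MCPv2 and the entry in the
// ServiceProvider's status. An empty result means the contract is fulfilled.
func ContractViolations(providerName string, mcpFinalizers []string, exposed []ResourceType, served ResourceType) []string {
	var violations []string

	want := "services.openmcp.cloud/" + providerName
	hasFinalizer := false
	for _, f := range mcpFinalizers {
		if f == want {
			hasFinalizer = true
			break
		}
	}
	if !hasFinalizer {
		violations = append(violations, fmt.Sprintf("MCPv2 is missing finalizer %q", want))
	}

	isExposed := false
	for _, rt := range exposed {
		if rt == served {
			isExposed = true
			break
		}
	}
	if !isExposed {
		violations = append(violations, fmt.Sprintf("ServiceProvider %q does not expose kind %s in its status", providerName, served.Kind))
	}

	return violations
}

func main() {
	landscaper := ResourceType{Group: "landscaper.services.openmcp.cloud", Version: "v1alpha2", Kind: "Landscaper"}

	// A provider that sets neither the finalizer nor the status entry.
	for _, v := range ContractViolations("laas", nil, nil, landscaper) {
		fmt.Println(v)
	}
}
```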

Metadata

Assignees

No one assigned

Labels

  • area/open-mcp: All ManagedControlPlane related issues
  • kind/task: General task that needs to be done.
  • needs/validation: Verify Issue and Prio with PO
