add documentation

xrstf · xrstf · commit 54efd9becbfd · 2024-12-11T14:45:02.000+01:00
diff --git a/docs/README.md b/docs/README.md
@@ -0,0 +1,157 @@
+# The Servlet
+
+The Servlet is a Kubernetes agent responsible for integrating external Kubernetes clusters.
+It runs on a Kubernetes cluster, is configured with credentials to a kcp instance and will then
+synchronize data out of kcp (i.e. out of kcp workspaces) onto the local cluster, and vice versa.
+
+The name Servlet is an obvious reference to the "kubelet" in a regular Kubernetes cluster.
+
+## High-level Overview
+
+The intended use case follows roughly these steps:
+
+1. A user in KDP with sufficient permissions creates an `APIExport` object and provides appropriate
+   credentials for the Servlet (e.g. by creating a Kubernetes Secret with a preconfigured kubeconfig
+   in it).
+2. A service owner will now take these credentials and the configured API group (the `APIExport`'s
+   name) and use them to set up the Servlet. It is assumed that the service owner (i.e. the
+   cluster-admin in a service cluster) wants to make some resources (usually CRDs) available to use
+   inside of kcp.
+3. The service owner uses the Servlet Helm chart (or a similar deployment technique) to install the
+   Servlet in their cluster.
+4. To actually make resources available in the platform, the service owner now has to create a
+   set of `PublishedResource` objects. The configuration happens from their point of view, meaning
+   they define how to publish a CRD to the platform, defining renaming rules and other projection
+   settings.
+5. Once a `PublishedResource` is created in the service cluster, the Servlet will pick it up,
+   find the referenced CRD, convert/project this CRD into an `APIResourceSchema` (ARS) for kcp and
+   then create the ARS in the org workspace.
+6. Finally the Servlet will take all `PublishedResources` and bundle them into the pre-existing
+   `APIExport` in the org workspace. This `APIExport` can then be bound in the org workspace itself
+   (or later in any other workspace, depending on permissions) and be used there.
+7. kcp automatically provides a virtual workspace for the `APIExport` and this is what the Servlet
+   then uses to watch all objects for the relevant resources in the platform (i.e. in all workspaces).
+8. The Servlet will now begin to synchronize objects back and forth between the service cluster
+   and KDP.
+
+## Details
+
+### Data Flow Direction
+
+It might be a bit confusing at first: The `PublishedResource` CRD describes the world from the
+standpoint of a service owner, i.e. a person or team that owns a Kubernetes cluster and is tasked
+with making their CRDs available in kcp (i.e. "publish" them).
+
+However the actual data flow later will work in the opposite direction: users creating objects inside
+their kcp workspaces serve as the source of truth. From there they are synced down to the service
+cluster, which is doing the projection of the `PublishedResource` _in reverse_.
+
+Of course additional, auxiliary (related) objects could originate on the service cluster. For example
+if you create a Certificate object in a kcp workspace and it's synced down, cert-manager will then
+acquire the certificate and create a Kubernetes `Secret`, which will have to be synced back up (into
+a kcp workspace, where the certificate originated from). So the source of truth can also be, for
+auxiliary resources, on the service cluster.
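+
+As a sketch of this example, a cert-manager `Certificate` on the service cluster could look like the
+following. The object name, namespace, issuer and DNS name are made up for illustration. The
+`Secret` named in `spec.secretName` is the related object that cert-manager creates on the service
+cluster and that would have to be synced back up into the kcp workspace.
+
+```yaml
+apiVersion: cert-manager.io/v1
+kind: Certificate
+metadata:
+  name: my-certificate # hypothetical name
+  namespace: default
+spec:
+  # cert-manager stores the issued certificate in this Secret on the service cluster
+  secretName: my-certificate-tls
+  dnsNames:
+    - www.example.com
+  issuerRef:
+    name: my-issuer # hypothetical issuer
+    kind: ClusterIssuer
+```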
+
+### Servlet Naming
+
+Each Servlet must have a name, like "nora" or "oskar". The FQ name for a Servlet is
+`<servletname>.<apigroup>`, so if the user in KDP had created a new `APIExport` named
+`databases.examplecorp`, the name of the Servlet that serves this Service (sic) could be
+`nora.databases.examplecorp`.
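+
+Assuming the `servletName` and `apiExportName` settings of the Servlet Helm chart are what make up
+the two halves of this name, the example above would roughly correspond to these values:
+
+```yaml
+servlet:
+  # the API group, i.e. the name of the APIExport created in KDP
+  apiExportName: databases.examplecorp
+  # the Servlet's own name; the FQ name would then be nora.databases.examplecorp
+  servletName: nora
+```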
+
+### Uniqueness
+
+A single `APIExport` in kcp must only be processed by exactly 1 Servlet. There is currently no
+mechanism planned to subdivide an `APIExport` into shards, where multiple service clusters (and
+therefore multiple Servlets) could process each shard.
+
+Later the Servlet might be extended with label selectors. Alternatively, Servlets might "claim"
+objects by annotating them in the kcp workspace. These things are not yet worked out, so for now we
+have this 1:1 restriction.
+
+Servlets make use of leader election, so it's perfectly fine to have multiple Servlet replicas, as
+long as only one of them is leader and actually doing work.
+
+### kcp-awareness
+
+controller-runtime can be used in a "kcp-aware" mode, where the cache, clients, mappers etc. are
+aware of the workspace information. This mode, however, is not well tested upstream, and the code
+would require shard-admin permissions to work like this in regular kcp workspaces. The
+controller-runtime fork's kcp-awareness is really more geared towards working in virtual workspaces.
+
+Because of this the Servlet needs to get a kubeconfig to kcp that already points to the `APIExport`'s
+workspace (i.e. the `server` URL already contains a `/clusters/root:myorg` path). The basic
+controllers in the Servlet then treat this as a plain ol', regular Kubernetes cluster
+(no kcp-awareness).
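+
+A kubeconfig that satisfies this requirement could look roughly like the following sketch. The
+host name, entry names and the token are placeholders. The relevant part is the workspace path at
+the end of the `server` URL.
+
+```yaml
+apiVersion: v1
+kind: Config
+clusters:
+  - name: kcp
+    cluster:
+      # placeholder host; note the /clusters/<workspace> suffix
+      server: https://kcp.example.com/clusters/root:myorg
+users:
+  - name: servlet
+    user:
+      token: "<token>" # placeholder credentials
+contexts:
+  - name: kcp
+    context:
+      cluster: kcp
+      user: servlet
+current-context: kcp
+```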
+
+To this end, the Servlet will, upon startup, try to access the `cluster` object in the target
+workspace. This is to resolve the cluster name (e.g. `root:myorg`) into a logicalcluster name (e.g.
+`gibd3r1sh`). The Servlet has to know which logicalcluster the target workspace represents in order
+to query resources properly.
+
+Only the controllers that are later responsible for interacting with the virtual workspace are
+kcp-aware. They have to be in order to know what workspace a resource is living in.
+
+### PublishedResources
+
+A `PublishedResource` describes which CRD should be made available inside kcp. The CRD name can be
+projected (i.e. renamed), so a `kubermatic.k8c.io/v1 Cluster` can become a
+`cloud.examplecorp/v1 KubernetesCluster`.
+
+In addition to projecting (mapping) the GVK, the `PublishedResource` also contains optional naming
+rules, which influence how the local objects that the Servlet is creating are named.
+
+As a single Servlet serves a single service, the API group used in kcp is the same for all
+`PublishedResources`. It's the API group configured in the `APIExport` inside the platform (created
+in step 1 in the overview above).
+
+To prevent chaos, `PublishedResources` are immutable: handling the case that a `PublishedResource`
+first publishes `kubermatic.k8c.io/v1 Cluster` and then suddenly `kubermatic.k8c.io/v1 User`
+resources would mean re-syncing and cleaning up everything in all affected kcp workspaces. The
+Servlet would need to be able to delete and recreate objects to follow this GVK change, which is a
+level of complexity we simply do not want to deal with at this point in time. Also,
+`APIResourceSchemas` are immutable themselves.
+
+More information is available in the [Publishing Resources](publish-resources.md) guide.
+
+### APIExports
+
+An `APIExport` in kcp combines multiple `APIResourceSchemas` (ARS). Each ARS is created based on a
+`PublishedResource` in the service cluster.
+
+To prevent data loss, ARS are never removed from an `APIExport`. We simply do not have enough
+experience to really know what happens when an ARS would suddenly become unavailable. To prevent
+damage and confusion, the Servlet will only ever add new ARS to the one `APIExport` it manages.
+
+## Controllers
+
+The Servlet consists of a number of independent controllers.
+
+### apiexport
+
+This controller aggregates the `PublishedResources` and manages a single `APIExport` in KDP.
+
+### apiresourceschema
+
+This controller takes `PublishedResources`, projects and converts them and creates `APIResourceSchemas`
+in KDP.
+
+### syncmanager
+
+This controller watches the `APIExport` and waits for the virtual workspace to become available. It
+also watches all `PublishedResources` (PRs) and reconciles when any of them is changed (they are
+immutable, but the controller is still reacting to any events on them).
+
+The controller will then set up a controller-runtime `Cluster` abstraction for the virtual workspace
+and then start many `sync` controllers (one for each `PublishedResource`). Whenever PRs change, the
+syncmanager will make sure that the correct set of `sync` controllers is running.
+
+### sync
+
+This is where the meat and potatoes happen. The sync controller is started for a single
+`PublishedResource` and is responsible for synchronizing all objects for that resource between the
+local service cluster and kcp.
+
+The `sync` controller was written to handle a single `PublishedResource` so that it does not have to
+deal with dynamically registering/stopping watches on its own. Instead the sync controller can be
+written as a more or less "normal" controller-runtime controller.
diff --git a/docs/consuming-services.md b/docs/consuming-services.md
@@ -0,0 +1,76 @@
+# Consuming Services
+
+This document describes how to use (consume) services offered by a Servlet.
+
+## Background
+
+A "service" defines a unique Kubernetes API Group and offers a number of resources (types) to
+use. A service could offer certificate management, databases, cloud infrastructure or any other set
+of Kubernetes resources.
+
+Services are provided by service owners, who run their own Kubernetes clusters and take care of the
+maintenance and scaling tasks for the workload provisioned by all users of the service(s) they
+offer.
+
+A Service provided by a Servlet should not be confused with a Kubernetes Service. Internally, a
+"Servlet Service" is ultimately translated into a kcp `APIExport` with a number of
+`APIResourceSchemas` (which are more or less equivalent to CRDs).
+
+## Consuming a Service
+
+To consume a service (or to make use of an `APIExport`) you have to create an `APIBinding` object
+in the kcp workspace where the service should be used. This section assumes that you are familiar
+with kcp on the command line and have the kcp kubectl plugin installed.
+
+First you need to get the kubeconfig for accessing your kcp workspaces. Once you have set your
+kubeconfig up, make sure you're in the correct workspace by using
+`kubectl ws <path to your workspace>`. Use `kubectl ws .` if you're unsure where you are.
+
+To enable a Service, use `kubectl kcp bind apiexport` and specify the path to and name of the `APIExport`.
+
+```bash
+# kubectl kcp bind apiexport <path to KDP Service>:<API Group of the Service>
+kubectl kcp bind apiexport :root:my-org:my.fancy.api
+```
+
+Without the plugin, you can create an `APIBinding` manually by running `kubectl apply` on a manifest like this:
+
+```yaml
+apiVersion: apis.kcp.io/v1alpha1
+kind: APIBinding
+metadata:
+  name: my.fancy.api
+spec:
+  reference:
+    export:
+      name: my.fancy.api
+      path: root:my-org
+```
+
+Shortly after, the new API will be available in the workspace. Check via `kubectl api-resources`.
+You can now create objects for types in that API group to your liking and they will be synced and
+processed behind the scenes.
+
+Note that a Service often has related resources, typically Secrets and ConfigMaps. You must
+explicitly allow the Service to access these in your workspace, which means editing/patching the
+`APIBinding` object (the kcp kubectl plugin currently has no support for managing permission claims).
+You have to accept or reject each of the claimed resources:
+
+```yaml
+spec:
+  permissionClaims:
+    # Nearly all Servlets require access to namespaces, rejecting this will
+    # most likely break the Service, even more than rejecting any other claim.
+    - all: true
+      resource: namespaces
+      state: Accepted
+    - all: true
+      resource: secrets
+      state: Accepted # or Rejected
+```
+
+Rejecting a claim will severely impact a Service and may even break it. Consult the Service's
+documentation or the service owner to find out whether rejecting a claim is supported.
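+
+Putting the two fragments from this section together, a complete `APIBinding` with accepted claims
+might look like the following sketch. It reuses the example names from above and assumes the claim
+field is spelled `resource`, as in kcp's `apis.kcp.io/v1alpha1` API. Which claims actually appear
+depends on the Service.
+
+```yaml
+apiVersion: apis.kcp.io/v1alpha1
+kind: APIBinding
+metadata:
+  name: my.fancy.api
+spec:
+  reference:
+    export:
+      name: my.fancy.api
+      path: root:my-org
+  permissionClaims:
+    - all: true
+      resource: namespaces
+      state: Accepted
+    - all: true
+      resource: secrets
+      state: Accepted
+```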
+
+When you _change into_ (`kubectl ws …`) a different workspace, kubectl will inform you if there are
+outstanding permission claims that you need to accept or reject.
diff --git a/docs/getting-started.md b/docs/getting-started.md
@@ -0,0 +1,142 @@
+# Getting Started with the Servlet
+
+All that is necessary to run the Servlet is a running Kubernetes cluster (for testing you can use
+[kind][kind]) and a [kcp][kcp] installation.
+
+## Prerequisites
+
+- A running Kubernetes cluster to run the Servlet in.
+- A running kcp installation as the source of truth.
+- A kubeconfig with admin or comparable permissions in a specific kcp workspace.
+
+## APIExport Setup
+
+Before installing the Servlet it is necessary to create an `APIExport` on kcp. The `APIExport` should
+be empty, because it is updated later by the Servlet, but it defines the new API group we're
+introducing. An example file could look like this:
+
+```yaml
+apiVersion: apis.kcp.io/v1alpha1
+kind: APIExport
+metadata:
+  name: test.example.com
+spec: {}
+```
+
+Create a file with similar content (you most likely want to change the name, as that is the API
+group under which your published resources will be made available) and create it in a kcp workspace
+of your choice:
+
+```sh
+# use the kcp kubeconfig
+$ export KUBECONFIG=/path/to/kcp.kubeconfig
+
+# navigate to the workspace where the APIExport should exist
+$ kubectl ws :workspace:you:want:to:create:it
+
+# create it
+$ kubectl create --filename apiexport.yaml
+apiexport/test.example.com created
+```
+
+## Servlet Installation
+
+The Servlet can be installed into any namespace, but in our example we are going with `k8c-system`.
+It doesn't necessarily have to live in the same Kubernetes cluster that it is synchronizing data
+to, but that is the common setup. Ultimately the Servlet synchronizes data between two Kubernetes
+endpoints.
+
+Now that the `APIExport` is created, switch to the Kubernetes cluster from which you wish to
+[publish resources](publish-resources.md). You will need to ensure that a kubeconfig with access to
+the kcp workspace that the `APIExport` has been created in is stored as a `Secret` on this cluster.
+Make sure that the kubeconfig points to the right workspace (not necessarily the `root` workspace).
+
+This can be done via a command like this:
+
+```sh
+$ kubectl create secret generic kcp-kubeconfig \
+  --namespace k8c-system \
+  --from-file "kubeconfig=admin.kubeconfig"
+```
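+
+If you prefer a declarative setup, the same Secret can be written as a manifest. This is a sketch
+of the equivalent object, reusing the name and namespace from the command above. Paste the contents
+of your kubeconfig under the `kubeconfig` key.
+
+```yaml
+apiVersion: v1
+kind: Secret
+metadata:
+  name: kcp-kubeconfig
+  namespace: k8c-system
+type: Opaque
+stringData:
+  # the key must be named "kubeconfig"
+  kubeconfig: |
+    <contents of admin.kubeconfig>
+```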
+
+The Servlet is shipped as a Helm chart and to install it, the next step is preparing a `values.yaml`
+file for the Servlet Helm chart. We need to pass the target `APIExport`, a name for the Servlet
+itself and a reference to the kubeconfig secret we just created.
+
+```yaml
+servlet:
+  # Required: the name of the APIExport in kcp that this Servlet is supposed to serve.
+  apiExportName: test.example.com
+
+  # Required: this Servlet's public name, will be shown in kcp, purely for informational purposes.
+  servletName: unique-test
+
+  # Required: Name of the Kubernetes Secret that contains a "kubeconfig" key, with the kubeconfig
+  # provided by kcp to access it.
+  platformKubeconfig: kcp-kubeconfig
+
+  # Create additional RBAC on the service cluster. These rules depend somewhat on the Servlet
+  # configuration, but the following two rules are very common. If you configure the Servlet to
+  # only work with cluster-scoped objects, you do not need to grant it permissions to create
+  # namespaces, for example.
+  rbac:
+    createClusterRole: true
+    rules:
+      # in order to create APIResourceSchemas
+      - apiGroups:
+          - apiextensions.k8s.io
+        resources:
+          - customresourcedefinitions
+        verbs:
+          - get
+          - list
+          - watch
+      # so copies of remote objects can be placed in their target namespaces
+      - apiGroups:
+          - ""
+        resources:
+          - namespaces
+        verbs:
+          - get
+          - list
+          - watch
+          - create
+```
+
+In addition, it is important to create RBAC rules for the resources you want to publish. If you want
+to publish the `Certificate` resource as created by cert-manager, you will need to append the
+following ruleset:
+
+```yaml
+      # so we can manage certificates
+      - apiGroups:
+          - cert-manager.io
+        resources:
+          - certificates
+        verbs:
+          - '*'
+```
+
+Once this `values.yaml` file is prepared, install a recent development build of the Servlet:
+
+```sh
+helm install servlet oci://quay.io/kubermatic/helm-charts/kdp-servlet --version 9.9.9-9fc9a430d95f95f4b2210f91ef67b3ec153b5cab -f values.yaml -n k8c-system
+```
+
+Two `servlet` Pods should start in the `k8c-system` namespace. If they crash you will need to
+identify the reason from container logs. A possible issue is that the provided kubeconfig does not
+have permissions against the target kcp workspace.
+
+## Publish Resources
+
+Once the Servlet Pods are up and running, you should be able to follow the
+[Publishing Resources](publish-resources.md) guide.
+
+## Consume Service
+
+Once resources have been published through the Servlet, they can be consumed on the kcp side (i.e.
+objects on kcp will be synced back and forth with the service cluster). Follow the
+guide to [consuming services](consuming-services.md).
+
+[kind]: https://github.com/kubernetes-sigs/kind
+[kcp]: https://kcp.io
diff --git a/docs/publish-resources.md b/docs/publish-resources.md