add skip 001-spin-kube-doctor.md #2
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,218 @@ | ||
| # SKIP 001 - Adding `spin kube doctor` subcommand | ||
|
|
||
| Summary: | ||
|
|
||
| This proposal adds a command to the `spin kube` plugin to improve the troubleshooting experience. Running SpinKube on a Kubernetes cluster requires a supported version of containerd as well as prerequisites such as cert-manager. The command runs these checks programmatically against the cluster, reports which prerequisites are missing, and optionally fixes them automatically. | |
|
|
||
| Owner: Rajat Jindal <[email protected]> | ||
|
|
||
| Impacted Projects: | ||
|
|
||
| - [ ] spin-operator | ||
| - [X] `spin kube` plugin | ||
| - [ ] runtime-class-manager | ||
| - [ ] containerd-shim-spin | ||
| - [ ] Governance | ||
| - [ ] Creates a new project | ||
|
|
||
| Created: May 29th 2024 | ||
|
|
||
| Updated: May 29th 2024 | ||
|
|
||
| ## Background | ||
|
|
||
| Kubernetes providers differ subtly in how they are configured. Some [use an older version of containerd](https://github.com/spinkube/spin-plugin-kube/issues/80) by default, and some [do not use containerd as the default runtime](https://github.com/spinkube/documentation/pull/161) at all. | |
|
|
||
| In addition, SpinKube has a few prerequisites that must be in place for it to work correctly, such as cert-manager and containerd-shim-spin. | |
|
|
||
| Going back and forth to collect this information from the user can consume a significant amount of time for both the user and the maintainer trying to help. | |
|
|
||
| ## Proposal | ||
|
|
||
| I am proposing to add a subcommand to the `spin kube` plugin that takes care of verifying the details of the cluster and provides an overview of what is configured correctly and what is missing. | ||
|
|
||
| One immediate use case: when a user opens a ticket about SpinKube not working on their cluster, we can ask them to run this tool and share the output in the ticket for faster resolution. | |
|
|
||
| `spin kube doctor` will have the concept of reusable and configurable checks. Each check will have a corresponding function that abstracts the details of how that check is performed: | ||
|
|
||
|
|
||
| ``` | ||
| type CheckFn func(ctx context.Context, k Provider, check Check) (Status, error) | ||
| ``` | ||
|
|
||
| An example of how to check whether a CRD is installed: | |
|
|
||
| ``` | ||
| var isCrdInstalled = func(ctx context.Context, k provider.Provider, check provider.Check) (provider.Status, error) { | |
| 	// Look up the CRD named by the check through the dynamic client. | |
| 	_, err := k.DynamicClient().Resource(schema.GroupVersionResource{ | |
| 		Group:    "apiextensions.k8s.io", | |
| 		Version:  "v1", | |
| 		Resource: "customresourcedefinitions", | |
| 	}).Get(ctx, check.ResourceName, metav1.GetOptions{}) | |
| 	if err != nil { | |
| 		// The CRD does not exist: report the check as failed, without an error. | |
| 		if errors.IsNotFound(err) { | |
| 			return provider.Status{ | |
| 				Name: check.Name, | |
| 				Ok:   false, | |
| 			}, nil | |
| 		} | |
|
|
||
| 		// Any other error is returned to the caller along with the failed status. | |
| 		return provider.Status{ | |
| 			Name:     check.Name, | |
| 			Ok:       false, | |
| 			HowToFix: check.HowToFix, | |
| 		}, err | |
| 	} | |
|
|
||
| 	return provider.Status{ | |
| 		Name: check.Name, | |
| 		Ok:   true, | |
| 	}, nil | |
| } | |
|
|
||
| ``` | ||
|
|
||
| We can then have a default set of checks that can be performed on the cluster with the possibility of customizing them if required. | ||
|
|
||
| ### How multiple providers are supported | |
|
|
||
| Because these checks use the Kubernetes API, they should work on most Kubernetes distributions by default. However, if we need to override a check for a given distribution, we can do so by implementing the following interface for that provider (specifically the `GetCheckOverride` function): | |
|
|
||
| ``` | ||
| type Provider interface { | |
| 	Name() string | |
| 	Client() kubernetes.Interface | |
| 	DynamicClient() dynamic.Interface | |
| 	Status(ctx context.Context) ([]Status, error) | |
| 	GetCheckOverride(ctx context.Context, check Check) CheckFn | |
| } | |
|
|
||
| ``` | ||
|
|
||
| An example that adjusts a check to support k3d is shown below: | |
|
|
||
| ``` | ||
| func (k *k3d) GetCheckOverride(ctx context.Context, check provider.Check) provider.CheckFn { | |
| 	switch check.Type { | |
| 	case checks.CheckBinaryInstalledOnNodes: | |
| 		return binaryVersionCheck | |
| 	} | |
|
|
||
| 	return nil | |
| } | |
|
|
||
| var binaryVersionCheck = func(ctx context.Context, k provider.Provider, check provider.Check) (provider.Status, error) { | |
| 	return checks.ExecOnEachNodeFn(ctx, k, check, []string{"/host/bin/containerd-shim-spin-v2"}, []string{"-v"}) | |
| } | |
|
|
||
| ``` | ||
|
|
||
| #### Examples | ||
|
|
||
| Here is what the output currently looks like for a few providers: | |

|
|
||
| ##### k3d | ||
| ``` | ||
| $ go run main.go doctor --context k3d-wasm-cluster | ||
|
|
||
|
|
||
| #------------------------------------- | ||
| # Running checks for SpinKube setup | ||
| #------------------------------------- | ||
|
|
||
| ✓ Containerd version is supported | ||
| ✓ Containerd shim is installed and configured | ||
|
Should we also check whether the Spin app itself is compatible with the shim? Say, the trigger type isn't supported.
That is possible, but at the moment these checks are not specific to any particular Spin app. It does seem like something that could be useful.
|
||
| ✗ Spin App CRD is installed | ||
| ✗ Spin App Executor CRD is installed | ||
| ✗ Cert Manager CRD is installed | ||
| ✗ Cert Manager is running | ||
| ✗ Runtime Class is installed | ||
| ✗ Spin Operator is running | ||
|
|
||
| Error: please fix above issues | ||
| exit status 1 | ||
|
|
||
| ``` | ||
|
|
||
| ##### Kind | ||
|
|
||
| ``` | ||
| $ go run main.go doctor --context kind-spin-kind | ||
|
|
||
|
|
||
| #------------------------------------- | ||
| # Running checks for SpinKube setup | ||
| #------------------------------------- | ||
|
|
||
| ✗ Containerd version is supported | ||
| -> actual version: "1.7.1" not one of the expected versions: [~1.6.26-0 ~1.7.7-0] | ||
|
|
||
| ✓ Containerd shim is installed and configured | ||
| ✗ Spin App CRD is installed | ||
| ✗ Spin App Executor CRD is installed | ||
| ✗ Cert Manager CRD is installed | ||
| ✗ Cert Manager is running | ||
| ✗ Runtime Class is installed | ||
| ✗ Spin Operator is running | ||
|
|
||
| Error: please fix above issues | ||
| exit status 1 | ||
|
|
||
|
|
||
| ``` | ||
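The version check in the Kind output above can be illustrated with a small range comparison. This is only a sketch: it assumes that a constraint such as `~1.7.7-0` means "at least 1.7.7 and below 1.8.0", it ignores prerelease suffixes, and the function names are hypothetical. The actual plugin may well use a semver library instead.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseVersion parses "MAJOR.MINOR.PATCH" into three integers. A leading "v"
// and any "-prerelease" suffix are ignored in this simplified sketch.
func parseVersion(s string) ([3]int, error) {
	var v [3]int
	core, _, _ := strings.Cut(strings.TrimPrefix(s, "v"), "-")
	parts := strings.Split(core, ".")
	if len(parts) != 3 {
		return v, fmt.Errorf("invalid version %q", s)
	}
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil {
			return v, fmt.Errorf("invalid version %q: %w", s, err)
		}
		v[i] = n
	}
	return v, nil
}

// matchesTilde reports whether actual satisfies "~X.Y.Z", read here as
// ">= X.Y.Z and < X.(Y+1).0" (an assumption about the constraint syntax).
func matchesTilde(actual, constraint string) (bool, error) {
	a, err := parseVersion(actual)
	if err != nil {
		return false, err
	}
	b, err := parseVersion(strings.TrimPrefix(constraint, "~"))
	if err != nil {
		return false, err
	}
	return a[0] == b[0] && a[1] == b[1] && a[2] >= b[2], nil
}

// isSupported reports whether actual satisfies at least one constraint.
func isSupported(actual string, constraints []string) (bool, error) {
	for _, c := range constraints {
		ok, err := matchesTilde(actual, c)
		if err != nil {
			return false, err
		}
		if ok {
			return true, nil
		}
	}
	return false, nil
}

func main() {
	expected := []string{"~1.6.26-0", "~1.7.7-0"}
	for _, actual := range []string{"1.7.1", "1.7.13", "1.6.28"} {
		ok, err := isSupported(actual, expected)
		fmt.Println(actual, ok, err)
	}
}
```

Under this reading, `1.7.1` is rejected because it is in the 1.7 series but older than 1.7.7, which matches the failure reported for the Kind cluster.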
|
|
||
| ##### Minikube | ||
|
|
||
| ``` | ||
| $ go run main.go doctor --context minikube | ||
|
|
||
|
|
||
| #------------------------------------- | ||
| # Running checks for SpinKube setup | ||
| #------------------------------------- | ||
|
|
||
| ✗ Containerd version is supported | ||
| -> found container runtime "docker://26.0.1" instead of containerd | ||
|
|
||
| ✗ Containerd shim is installed and configured | ||
| ✗ Spin App CRD is installed | ||
| ✗ Spin App Executor CRD is installed | ||
| ✗ Cert Manager CRD is installed | ||
| ✗ Cert Manager is running | ||
| ✗ Runtime Class is installed | ||
| ✗ Spin Operator is running | ||
|
|
||
| Error: please fix above issues | ||
| exit status 1 | ||
|
|
||
| ``` | ||
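The Minikube failure above comes from a runtime string of the form `<runtime>://<version>`. Kubernetes reports such a value per node in `status.nodeInfo.containerRuntimeVersion`; whether the plugin reads exactly that field is an assumption here. A minimal sketch of splitting the value and rejecting non-containerd runtimes, with hypothetical function names, could look like this:

```go
package main

import (
	"fmt"
	"strings"
)

// splitRuntime splits a value such as "containerd://1.7.7" into the runtime
// name and its version.
func splitRuntime(s string) (string, string, error) {
	name, version, ok := strings.Cut(s, "://")
	if !ok || name == "" || version == "" {
		return "", "", fmt.Errorf("unexpected container runtime string %q", s)
	}
	return name, version, nil
}

// containerdVersion returns the version when the runtime is containerd and an
// error describing the mismatch otherwise.
func containerdVersion(s string) (string, error) {
	name, version, err := splitRuntime(s)
	if err != nil {
		return "", err
	}
	if name != "containerd" {
		return "", fmt.Errorf("found container runtime %q instead of containerd", s)
	}
	return version, nil
}

func main() {
	for _, s := range []string{"containerd://1.7.7", "docker://26.0.1"} {
		v, err := containerdVersion(s)
		if err != nil {
			fmt.Println("✗", err)
			continue
		}
		fmt.Println("✓ containerd", v)
	}
}
```

The version returned for a containerd node can then be compared against the supported ranges.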
|
|
||
| ##### AKS | ||
|
|
||
| ``` | ||
| $ go run main.go doctor --kubeconfig ~/kubeconfig-mikkel | ||
|
|
||
|
|
||
| #------------------------------------- | ||
| # Running checks for SpinKube setup | ||
| #------------------------------------- | ||
|
|
||
| ✓ Containerd version is supported | ||
| ✓ Containerd shim is installed and configured | ||
| ✓ Spin App CRD is installed | ||
| ✓ Spin App Executor CRD is installed | ||
| ✓ Cert Manager CRD is installed | ||
| ✓ Cert Manager is running | ||
| ✓ Runtime Class is installed | ||
| ✓ Spin Operator is running | ||
|
|
||
|
|
||
| All looks good !! | ||
|
|
||
| ``` | ||
|
|
||
|
|
||
| ## Alternatives Considered | ||
|
|
||
| An alternative would be to document these steps and ask users to run them manually and provide the output. | |
|
|
||
A list of initial checks and how you'd envision they'd work would be helpful context here - especially for things like binary version checks where many developers might not have access to things like node debugging 😅
You are right that, depending on the user's permissions, these checks may fail. Maybe we can add a check that verifies permissions before actually running the checks?
I also added this info, along with the current list of checks, to the SKIP doc.