diff --git a/.gitignore b/.gitignore
index fa257cb80..d60c11838 100644
--- a/.gitignore
+++ b/.gitignore
@@ -83,3 +83,4 @@ watchall-output*
 
 /conditions*.txt
 /.config
+/.kube
diff --git a/docs/caph/04-developers/01-development-guide.md b/docs/caph/04-developers/01-development-guide.md
index f82f064fd..e4e2465b2 100644
--- a/docs/caph/04-developers/01-development-guide.md
+++ b/docs/caph/04-developers/01-development-guide.md
@@ -23,24 +23,32 @@ This ensures the following:
 - helmfile
 - kind (required)
 - kubectl (required)
-- tilt (required)
-- hcloud
+- tilt
+- hcloud cli-tool.
 
 ## Preparing Hetzner project
 
-For more information, please see [here](/docs/caph/01-getting-started/03-preparation.md).
+For more information, please see the [Hetzner project preparation](/docs/caph/01-getting-started/03-preparation.md) guide.
 
-## Setting Tilt up
+## Tilt vs pushing development container
 
-You need to create a `.envrc` file and specify the values you need. After the `.envrc` is loaded, invoke `direnv allow` to load the environment variables in your current shell session.
+You can use [Tilt](https://tilt.dev/) or a script like
+[update-operator-dev-deployment.sh](/hack/update-operator-dev-deployment.sh) to install your changed
+code in the management cluster.
 
-The complete reference can be found [here](/docs/caph/04-developers/02-tilt.md).
+We do not update the Tilt configuration regularly. The script may be an easier solution.
+
+We recommend creating a `.envrc` file and specifying the values you need. After the `.envrc` is loaded
+([direnv.net](https://direnv.net/)), invoke `direnv allow` to load the environment variables in your
+current shell session.
+
+The complete reference can be found in the [Reference of Tilt](/docs/caph/04-developers/02-tilt.md) documentation.
 
 ## Developing with Tilt
 
 ![tilt](https://syself.com/images/tilt.png)
 
-Provider Integration development requires a lot of iteration, and the “build, tag, push, update deployment” workflow can be very tedious. Tilt makes this process much simpler by watching for updates and automatically building and deploying them. To build a kind cluster and to start Tilt, run:
+To build a kind cluster and to start Tilt, run:
 
 ```shell
 make tilt-up
@@ -68,6 +76,25 @@ To delete the registry, use `make delete-registry`. Use `make delete-mgt-cluster
 
 If you have any trouble finding the right command, you can run the `make help` command to get a list of all available make targets.
 
+## Troubleshooting
+
+If you want to have a better overview about what is going on in your management cluster, then you can use the
+following tools.
+
+```console
+❯ watch ./hack/output-for-watch.sh
+```
+
+This script continuously shows the most important resources (capi machines, infra machines, ...)
+and logs of caph and capi. Run this with your management cluster kubeconfig active.
+
+```console
+go run github.com/guettli/check-conditions@latest all
+```
+
+[check-conditions](https://github.com/guettli/check-conditions) shows all unhealthy conditions of
+the current cluster. You can use it in both the management and workload clusters.
+
 ## Submitting PRs and testing
 
 Pull requests and issues are highly encouraged! For more information, please have a look at the [Contribution Guidelines](https://github.com/syself/cluster-api-provider-hetzner/blob/main/CONTRIBUTING.md)
diff --git a/docs/caph/04-developers/02-tilt.md b/docs/caph/04-developers/02-tilt.md
index 0e364c4ed..0186a20e1 100644
--- a/docs/caph/04-developers/02-tilt.md
+++ b/docs/caph/04-developers/02-tilt.md
@@ -5,7 +5,11 @@ sidebar: Reference of Tilt
 description: Full list of available Tilt configuration values and their description.
 ---
 
-```
+We do not update the Tilt configuration regularly. The script
+([update-operator-dev-deployment.sh](/hack/update-operator-dev-deployment.sh)) may be an easier
+solution.
+
+```json
 "allowed_contexts": [
   "kind-caph",
 ],
diff --git a/hack/get-leading-pod.sh b/hack/get-leading-pod.sh
new file mode 100755
index 000000000..e25f26475
--- /dev/null
+++ b/hack/get-leading-pod.sh
@@ -0,0 +1,49 @@
+#!/usr/bin/env bash
+
+# Copyright 2023 The Kubernetes Authors.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+# Bash Strict Mode: https://github.com/guettli/bash-strict-mode
+trap 'echo -e "\n🤷 🚨 🔥 Warning: A command has failed. Exiting the script. Line was ($0:$LINENO): $(sed -n "${LINENO}p" "$0" 2>/dev/null || true) 🔥 🚨 🤷 "; exit 3' ERR
+set -Eeuo pipefail
+
+if [[ $# -eq 0 ]] || [[ $# -gt 2 ]] || [[ "$1" == -* ]]; then
+    echo "Usage: $0 []" >&2
+    exit 1
+fi
+
+dep="$1"
+
+ns="${2:-}"
+
+hack_dir=$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)
+
+if [[ -z $ns ]]; then
+    ns=$("$hack_dir/get-namespace-of-deployment.sh" "$dep")
+fi
+
+leases=$(kubectl get leases -n "$ns" -o yaml |
+    yq ".items[] | .spec.holderIdentity" | { grep -P "^${dep}-[^-]+-[^-]+_" || true; })
+
+if [[ -z $leases ]]; then
+    echo "Error: failed to find a lease for deployment $dep in namespace $ns"
+    exit 1
+fi
+
+if [ "$(echo "$leases" | wc -l)" -gt 1 ]; then
+    echo "Error: Multiple leases found for deployment '$dep'" >&2
+    exit 1
+fi
+
+echo "$leases" | cut -d_ -f1
diff --git a/hack/get-namespace-of-deployment.sh b/hack/get-namespace-of-deployment.sh
new file mode 100755
index 000000000..2726bbc08
--- /dev/null
+++ b/hack/get-namespace-of-deployment.sh
@@ -0,0 +1,41 @@
+#!/usr/bin/env bash
+
+# Copyright 2023 The Kubernetes Authors.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+# Bash Strict Mode: https://github.com/guettli/bash-strict-mode
+trap 'echo -e "\n🤷 🚨 🔥 Warning: A command has failed. Exiting the script. Line was ($0:$LINENO): $(sed -n "${LINENO}p" "$0" 2>/dev/null || true) 🔥 🚨 🤷 "; exit 3' ERR
+set -Eeuo pipefail
+
+if [[ $# -ne 1 ]] || [[ "$1" == -* ]]; then
+    echo "Usage: $0 " >&2
+    exit 1
+fi
+
+dep="$1"
+
+# Find the namespace (must be exactly one)
+ns_candidates="$(kubectl get deploy -A -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.metadata.namespace}{"\n"}{end}' |
+    awk -v d="$dep" '$1==d{print $2}')"
+
+ns_count="$(printf '%s\n' "$ns_candidates" | sed '/^$/d' | wc -l | tr -d ' ')"
+if [ "$ns_count" -eq 0 ]; then
+    echo "ERROR: Deployment '$dep' not found in any namespace." >&2
+    exit 1
+elif [ "$ns_count" -gt 1 ]; then
+    echo "ERROR: Deployment '$dep' found in multiple namespaces:" >&2
+    printf '%s\n' "$ns_candidates" >&2
+    exit 1
+fi
+printf '%s\n' "$ns_candidates" | head -n1
diff --git a/hack/output-for-watch.sh b/hack/output-for-watch.sh
index 2be00f62e..972afe39b 100755
--- a/hack/output-for-watch.sh
+++ b/hack/output-for-watch.sh
@@ -14,6 +14,17 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
+#############################################################
+# This script creates an overview of the management cluster.
+# You can call it once, or continuously like this:
+# watch ./hack/output-for-watch.sh
+#
+# You can call it from a different directory, too:
+# ../cluster-api-provider-hetzner/hack/output-for-watch.sh
+#############################################################
+
+hack_dir=$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)
+
 function print_heading() {
     blue='\033[0;34m'
     nc='\033[0m' # No Color
@@ -47,7 +58,7 @@ kubectl get events -A --sort-by=lastTimestamp | grep -vP 'LeaderElection' | tail
 
 print_heading caph:
 
-./hack/tail-controller-logs.sh
+"$hack_dir"/tail-controller-logs.sh
 
 regex='^I\d\d\d\d|\
 .*it may have already been deleted|\
@@ -55,8 +66,10 @@ regex='^I\d\d\d\d|\
 .*failed to retrieve Spec.ProviderID|\
 .*failed to patch Machine default
 '
-capi_ns=$(kubectl get deployments -A | grep capi-con | cut -d' ' -f1)
-capi_logs=$(kubectl logs -n "$capi_ns" deployments/capi-controller-manager --since 10m | grep -vP "$(echo "$regex" | tr -d '\n')" | tail -5)
+capi_ns=$("$hack_dir"/get-namespace-of-deployment.sh capi-controller-manager)
+capi_pod=$("$hack_dir"/get-leading-pod.sh capi-controller-manager "$capi_ns")
+
+capi_logs=$(kubectl logs -n "$capi_ns" "$capi_pod" --since 10m | grep -vP "$(echo "$regex" | tr -d '\n')" | tail -5)
 if [ -n "$capi_logs" ]; then
     print_heading capi
     echo "$capi_logs"
@@ -89,7 +102,7 @@ fi
 
 echo
 
-./hack/get-kubeconfig-of-workload-cluster.sh
+"$hack_dir"/get-kubeconfig-of-workload-cluster.sh
 
 kubeconfig_wl=".workload-cluster-kubeconfig.yaml"
 
@@ -120,9 +133,9 @@ print_heading "workload-cluster nodes"
 KUBECONFIG=$kubeconfig_wl kubectl get nodes -o 'custom-columns=NAME:.metadata.name,STATUS:.status.phase,ROLES:.metadata.labels.kubernetes\.io/role,creationTimestamp:.metadata.creationTimestamp,VERSION:.status.nodeInfo.kubeletVersion,IP:.status.addresses[?(@.type=="ExternalIP")].address'
 
 if [ "$(kubectl get machine | wc -l)" -ne "$(KUBECONFIG="$kubeconfig_wl" kubectl get nodes | wc -l)" ]; then
-    echo "❌ Number of nodes in wl-cluster does not match number of machines in mgt-cluster"
+    echo "❌ Number of nodes in workload cluster does not match number of machines in management cluster"
 else
-    echo "👌 number of nodes in wl-cluster is equal to number of machines in mgt-cluster"
+    echo "👌 number of nodes in workload cluster is equal to number of machines in management cluster"
 fi
 
 rows=$(kubectl get hcloudremediation -A 2>/dev/null)
diff --git a/hack/tail-controller-logs.sh b/hack/tail-controller-logs.sh
index fda118cc3..d89ecb769 100755
--- a/hack/tail-controller-logs.sh
+++ b/hack/tail-controller-logs.sh
@@ -1,4 +1,4 @@
-#!/bin/bash
+#!/usr/bin/env bash
 
 # Copyright 2023 The Kubernetes Authors.
 #
@@ -14,14 +14,15 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-ns=$(kubectl get deployments.apps -A | grep caph-controller-manager | cut -d' ' -f1)
-pod=$(kubectl -n "$ns" get pods | grep caph-controller-manager | cut -d' ' -f1)
+# Bash Strict Mode: https://github.com/guettli/bash-strict-mode
+trap 'echo -e "\n🤷 🚨 🔥 Warning: A command has failed. Exiting the script. Line was ($0:$LINENO): $(sed -n "${LINENO}p" "$0" 2>/dev/null || true) 🔥 🚨 🤷 "; exit 3' ERR
+set -Eeuo pipefail
 
-if [ -z "$pod" ]; then
-    echo "failed to find caph-controller-manager pod"
-    exit 1
-fi
+dep="caph-controller-manager"
+hack_dir=$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)
+ns=$("$hack_dir"/get-namespace-of-deployment.sh $dep)
+pod=$("$hack_dir"/get-leading-pod.sh $dep "$ns")
 
 kubectl -n "$ns" logs "$pod" --tail 200 |
-    ./hack/filter-caph-controller-manager-logs.py - |
+    "$hack_dir"/filter-caph-controller-manager-logs.py - |
     tail -n 10