# Troubleshooting

## Troubleshooting Quick Start with Docker (CAPD)

<aside class="note warning">

<h1>Warning</h1>

If you've run the Quick Start before, ensure that you've [cleaned up](./quick-start.md#clean-up) all resources before trying it again. Check `docker ps` to confirm there are no running containers left before beginning the Quick Start.

</aside>
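
As a sketch, that pre-flight check might look like this (assuming the management cluster from the previous run was created with kind's default cluster name):

```shell
# Tear down any leftover management cluster from a previous run
# (assumes kind's default cluster name; pass --name otherwise)
kind delete cluster

# Verify nothing is left running before starting over
docker ps
```
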
This guide assumes you've completed the [apply the workload cluster](./quick-start.md#apply-the-workload-cluster) section of the Quick Start using Docker.

When running `clusterctl describe cluster capi-quickstart` to verify the created resources, we expect output similar to this (**note: this is before installing the Calico CNI**):

```shell
NAME                                                           READY  SEVERITY  REASON                       SINCE  MESSAGE
Cluster/capi-quickstart                                        True                                          46m
├─ClusterInfrastructure - DockerCluster/capi-quickstart-94r9d  True                                          48m
├─ControlPlane - KubeadmControlPlane/capi-quickstart-6487w     True                                          46m
│ └─3 Machines...                                              True                                          47m    See capi-quickstart-6487w-d5lkp, capi-quickstart-6487w-mpmkq, ...
└─Workers
  └─MachineDeployment/capi-quickstart-md-0-d6dn6               False  Warning   WaitingForAvailableMachines  48m    Minimum availability requires 3 replicas, current 0 available
    └─3 Machines...                                            True                                          47m    See capi-quickstart-md-0-d6dn6-584ff97cb7-kr7bj, capi-quickstart-md-0-d6dn6-584ff97cb7-s6cbf, ...
```

The Machines should be started, but the Workers are not available yet because Calico isn't installed. You should be able to see the containers running with `docker ps --all`, and they should not be restarting.

If you notice Machines failing to start or restarting, your output might look similar to this:

```shell
clusterctl describe cluster capi-quickstart
NAME                                                           READY  SEVERITY  REASON                       SINCE  MESSAGE
Cluster/capi-quickstart                                        False  Warning   ScalingUp                    57s    Scaling up control plane to 3 replicas (actual 2)
├─ClusterInfrastructure - DockerCluster/capi-quickstart-n5w87  True                                          110s
├─ControlPlane - KubeadmControlPlane/capi-quickstart-6587k     False  Warning   ScalingUp                    57s    Scaling up control plane to 3 replicas (actual 2)
│ ├─Machine/capi-quickstart-6587k-fgc6m                        True                                          81s
│ └─Machine/capi-quickstart-6587k-xtvnz                        False  Warning   BootstrapFailed              52s    1 of 2 completed
└─Workers
  └─MachineDeployment/capi-quickstart-md-0-5whtj               False  Warning   WaitingForAvailableMachines  110s   Minimum availability requires 3 replicas, current 0 available
    └─3 Machines...                                            False  Info      Bootstrapping                77s    See capi-quickstart-md-0-5whtj-5d8c9746c9-f8sw8, capi-quickstart-md-0-5whtj-5d8c9746c9-hzxc2, ...
```

In the example above, we can see that the Machine `capi-quickstart-6587k-xtvnz` has failed to start. The reason provided is `BootstrapFailed`.

To investigate why a Machine fails to start, you can inspect the conditions of the objects using `clusterctl describe --show-conditions all cluster capi-quickstart`. You can get more detailed information about the status of the Machines using `kubectl describe machines`.

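For example, to drill down from the cluster-wide view to a single failing Machine (the Machine name below is taken from the sample output above; substitute your own):

```shell
# Show the conditions of every object in the cluster topology
clusterctl describe --show-conditions all cluster capi-quickstart

# Then inspect one Machine in detail (conditions, events, provider status)
kubectl describe machine capi-quickstart-6587k-xtvnz
```
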
To inspect the underlying infrastructure - in this case Docker containers acting as Machines - you can access the logs using `docker logs <MACHINE-NAME>`. For example:

```shell
docker logs capi-quickstart-6587k-xtvnz
(...)
Failed to create control group inotify object: Too many open files
Failed to allocate manager object: Too many open files
[!!!!!!] Failed to allocate manager object.
Exiting PID 1...
```

To resolve this specific error, please read [Cluster API with Docker - "too many open files"](#cluster-api-with-docker----too-many-open-files).

## Node bootstrap failures when using CABPK with cloud-init

Failures during Node bootstrapping can have a lot of different causes. For example, Cluster API resources might be

* Run [docker system df](https://docs.docker.com/engine/reference/commandline/system_df/) to inspect the disk space consumed by Docker resources.
* Run [docker system prune --volumes](https://docs.docker.com/engine/reference/commandline/system_prune/) to prune dangling images, containers, volumes, and networks.
|
55 |
| - |
| 116 | + |
## Cluster API with Docker - "too many open files"

When creating many nodes using Cluster API and Docker infrastructure, either by creating large Clusters or a number of small Clusters, the OS may run into inotify limits which prevent new nodes from being provisioned.
If the error `Failed to create inotify object: Too many open files` is present in the logs of the Docker Infrastructure provider, this limit is being hit.
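
On Linux you can check the current limits and, following kind's known-issues guidance, raise them with `sysctl`; the values below are a common starting point rather than a hard requirement:

```shell
# Inspect the current inotify limits
sysctl fs.inotify.max_user_watches fs.inotify.max_user_instances

# Raise them (takes effect immediately; add the settings to
# /etc/sysctl.conf or a file under /etc/sysctl.d/ to persist)
sudo sysctl fs.inotify.max_user_watches=1048576
sudo sysctl fs.inotify.max_user_instances=8192
```
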