As part of the VMware Event Broker Appliance (VEBA) project, I was recently evaluating a newer version of Kubernetes (v1.21.3) and also switching the container runtime from Docker to Containerd. I figured this probably should not be that difficult, especially since we are already use Containerd within Tanzu Kubernetes Grid (TKG) which is our commercial Kubernetes (k8s) offering and that base OS is VMware Photon OS. How hard could this be, right!? (famous last words) 😂
We use kubeadm to setup K8s and read in a very basic configuration file and after following the official K8s instructions for prepping the environment to use containerd, I was surprised when I ran into the following error:
Unfortunately, an error has occurred:
timed out waiting for the condition
This error is likely caused by:
- The kubelet is not running
- The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)
If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
- 'systemctl status kubelet'
- 'journalctl -xeu kubelet'
Additionally, a control plane component may have crashed or exited when started by the container runtime.
To troubleshoot, list all containers using your preferred container runtimes CLI.
Here is one example how you may list all Kubernetes containers running in cri-o/containerd using crictl:
- 'crictl --runtime-endpoint /run/containerd/containerd.sock ps -a | grep kube | grep -v pause'
Once you have found the failing container, you can inspect its logs with:
- 'crictl --runtime-endpoint /run/containerd/containerd.sock logs CONTAINERID'
error execution phase wait-control-plane: couldn't initialize a Kubernetes cluster
To see the stack trace of this error execute with --v=5 or higher
Unfortunately, this lead me down a huge rat hole of troubleshooting and trying various configurations and suggestions from the Internet. Ultimately, none of the suggested solutions solved my problem. After exhausting all my options and spending more time that I would like to admit, I decided to ask in the Kubernetes Slack community to see if anyone might have an idea. There were not any specific suggestions that helped me understand the issue further but there was a question about how Containerd came to be on the system and that gave me one more thing to try.
Both Photon OS 3.0 and 4.0 ships with Containerd and after installing the desired kubeadm, kubectl and kubelet, I had wrongfully assumed that the version of Containerd would simply work.