Tasks

Edit This Page

Troubleshoot Clusters

This doc is about cluster troubleshooting; we assume you have already ruled out your application as the root cause of the problem you are experiencing. See the application troubleshooting guide for tips on application debugging. You may also visit troubleshooting document for more information.

Listing your cluster

The first thing to debug in your cluster is if your nodes are all registered correctly.

Run

kubectl get nodes

And verify that all of the nodes you expect to see are present and that they are all in the Ready state.

Looking at logs

For now, digging deeper into the cluster requires logging into the relevant machines. Here are the locations of the relevant log files. (note that on systemd-based systems, you may need to use journalctl instead)

Master

Worker Nodes

A general overview of cluster failure modes

This is an incomplete list of things that could go wrong, and how to adjust your cluster setup to mitigate the problems.

Root causes:

Specific scenarios:

Mitigations:

Feedback