Kubernetes troubleshooting

Cluster components

kubectl get componentstatus
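
Note that componentstatus has been deprecated since Kubernetes v1.19 and can report misleading results on newer clusters. As a hedged alternative, the API server's health endpoints can be queried directly. The helper name `cluster_health` is made up for this sketch, and the output only covers the API server and its own checks (etcd among them), so it is not a drop-in replacement:

```shell
# Hypothetical helper: query the API server health endpoints.
# Assumes kubectl is already configured for the target cluster.
cluster_health() {
  for endpoint in readyz livez; do
    echo "== /${endpoint} =="
    # ?verbose lists every individual check with its result
    kubectl get --raw="/${endpoint}?verbose"
  done
}
```

Call it as `cluster_health`; any check that is not reported as ok is the place to start digging.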

Network

netshoot

A Swiss Army knife container for network troubleshooting

Usage

Run an ephemeral container in an existing pod:

kubectl -n headscale debug headscale-6646bf6ffd-tpzl7 -it --image=nicolaka/netshoot

Run an ephemeral container in an existing pod and share the process namespace of a running container, for example to see its process list:

kubectl -n headscale debug headscale-6646bf6ffd-tpzl7 -it --image=nicolaka/netshoot --target=headscale
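
Pod names such as headscale-6646bf6ffd-tpzl7 change with every rollout. The sketch below looks up the first pod matching a label before attaching the debug container. The function name `debug_first_pod` and the label `app=headscale` used in the usage line are assumptions, not taken from the deployment above:

```shell
# Hypothetical helper: attach netshoot to the first pod matching a label.
# Usage: debug_first_pod <namespace> <label-selector> <target-container>
debug_first_pod() {
  ns="$1"; selector="$2"; target="$3"
  # Look up the name of the first matching pod (empty if none matches).
  pod=$(kubectl -n "$ns" get pods -l "$selector" \
    -o jsonpath='{.items[0].metadata.name}')
  if [ -z "$pod" ]; then
    echo "no pod matches $selector in namespace $ns" >&2
    return 1
  fi
  kubectl -n "$ns" debug "$pod" -it --image=nicolaka/netshoot --target="$target"
}
```

For example: `debug_first_pod headscale app=headscale headscale`, provided the pods actually carry that label.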

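Create a dedicated namespace and run netshoot in it. The two run commands differ only in the pod name; --attach=true is already the default when -i is set: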

kubectl create namespace tmp
kubectl -n tmp run tmp-shell --rm -i --tty --image nicolaka/netshoot
kubectl -n tmp run netshoot-tmp --attach=true --rm -i --tty --image nicolaka/netshoot

Exec into an already running netshoot pod (it lives in the tmp namespace created above):

kubectl -n tmp exec -it netshoot-tmp -- sh
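
For quick one-off checks, an interactive shell is not always needed. A minimal sketch, assuming the pod netshoot-tmp is still running in the tmp namespace; `netshoot_exec` is a hypothetical wrapper name, and cluster.local is only the default cluster domain, which may differ:

```shell
# Hypothetical helper: run a single command inside the netshoot pod.
# Assumes pod "netshoot-tmp" exists in namespace "tmp".
netshoot_exec() {
  kubectl -n tmp exec netshoot-tmp -- "$@"
}

# Example checks (adjust names to the service under test):
# netshoot_exec nslookup kubernetes.default.svc.cluster.local
# netshoot_exec nc -zv kubernetes.default.svc.cluster.local 443
```

The first check exercises cluster DNS, the second a plain TCP connection to the API server service.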

Delete evicted pods

After a node hit DiskPressure because of a full disk, hundreds of pods were evicted and stayed in the pod list even after the node recovered. Delete all evicted pods with:

kubectl get pods --all-namespaces -o json | jq -r '.items[]
  | select(.status.reason != null)
  | select(.status.reason | contains("Evicted"))
  | .metadata.name + " " + .metadata.namespace' |
  xargs -L1 bash -c 'kubectl delete pod "$0" --namespace="$1"'
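
Because this pipeline deletes across all namespaces, it can help to preview what would be removed first. A minimal sketch, where `preview_deletes` is a hypothetical helper that reads the "name namespace" pairs produced by the jq stage and prints the delete command for each one instead of running it:

```shell
# Hypothetical dry-run helper: read "name namespace" pairs from stdin
# and print the kubectl command that would delete each pod.
preview_deletes() {
  while read -r name namespace; do
    # Skip blank lines
    [ -n "$name" ] || continue
    printf 'kubectl delete pod %s --namespace=%s\n' "$name" "$namespace"
  done
}
```

To use it, pipe the output of the jq stage into `preview_deletes` in place of the xargs stage, review the list, then run the real pipeline.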