Kubernetes troubleshooting
Cluster components
kubectl get componentstatus
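ComponentStatus has been deprecated since Kubernetes v1.19, so the command above may print a deprecation warning or incomplete results on newer clusters. A minimal sketch of an alternative, assuming the API server's /readyz and /livez health endpoints are reachable with your credentials; the function name apiserver_health is hypothetical:

```shell
# Hypothetical helper: query an API server health endpoint in verbose mode.
# Usage: apiserver_health [readyz|livez]   (defaults to readyz)
apiserver_health() {
  endpoint="${1:-readyz}"
  kubectl get --raw="/${endpoint}?verbose"
}
```

Calling apiserver_health livez lists the individual liveness checks instead of the readiness checks.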
Network
netshoot
A Swiss-army container for network troubleshooting
Usage
Run an ephemeral container in an existing pod:
kubectl -n headscale debug headscale-6646bf6ffd-tpzl7 -it --image=nicolaka/netshoot
Run an ephemeral container in an existing pod and attach it to the process namespace of a running container (here headscale), e.g. to see that container's process list:
kubectl -n headscale debug headscale-6646bf6ffd-tpzl7 -it --image=nicolaka/netshoot --target=headscale
Create a dedicated namespace and run netshoot in it (either variant works):
kubectl create namespace tmp
kubectl -n tmp run tmp-shell --rm -i --tty --image nicolaka/netshoot
kubectl -n tmp run netshoot-tmp --attach=true --rm -i --tty --image nicolaka/netshoot
Exec into an already running netshoot container:
kubectl exec -it netshoot-tmp -- sh
Delete evicted pods
After DiskPressure happened due to a full disk, hundreds of pods got evicted but still showed up after DiskPressure recovered.
Delete all evicted pods with:
kubectl get pods --all-namespaces -o json | jq -r '.items[]
  | select(.status.reason != null)
  | select(.status.reason | contains("Evicted"))
  | .metadata.name + " " + .metadata.namespace' |
  xargs -n2 bash -c 'kubectl delete pods "$0" --namespace="$1"'
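Before deleting anything, it can help to list what would be removed. A minimal sketch, assuming the default column layout of kubectl get pods --all-namespaces --no-headers (NAMESPACE, NAME, READY, STATUS, ...) and that evicted pods show Evicted in the STATUS column; the function name list_evicted is hypothetical:

```shell
# Hypothetical helper: read `kubectl get pods --all-namespaces --no-headers`
# output on stdin and print "namespace name" for every evicted pod.
# Assumes column 4 is STATUS, as in the default table output.
list_evicted() {
  awk '$4 == "Evicted" { print $1, $2 }'
}
```

Pipe kubectl get pods --all-namespaces --no-headers into list_evicted to review the affected pods first.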