User Alerts

| Alert name | Severity | Type | Description |
|---|---|---|---|
| ApiServerUnreachableViaKubernetesService | critical | shoot | The API server has been unreachable for 15 minutes via the kubernetes service in the shoot. |
| KubeKubeletNodeDown | warning | shoot | The kubelet {{ $labels.instance }} has been unavailable/unreachable for more than 1 hour. Workloads on the affected node may not be schedulable. |
| KubeletTooManyOpenFileDescriptorsShoot | warning | shoot | Shoot-kubelet ({{ $labels.kubernetes_io_hostname }}) is using {{ $value }}% of the available file/socket descriptors. Kubelet could be under heavy load. |
| KubeletTooManyOpenFileDescriptorsShoot | critical | shoot | Shoot-kubelet ({{ $labels.kubernetes_io_hostname }}) is using {{ $value }}% of the available file/socket descriptors. Kubelet could be under heavy load. |
| KubePodPendingShoot | warning | shoot | Pod {{ $labels.pod }} is stuck in "Pending" state for more than 1 hour. |
| KubePodNotReadyShoot | warning | shoot | Pod {{ $labels.pod }} is not ready for more than 1 hour. |
| NodeExporterDown | warning | shoot | The NodeExporter has been down or unreachable from Prometheus for more than 1 hour. |
| K8SNodeOutOfDisk | critical | shoot | Node {{ $labels.node }} has run out of disk space. |
| K8SNodeMemoryPressure | warning | shoot | Node {{ $labels.node }} is under memory pressure. |
| K8SNodeDiskPressure | warning | shoot | Node {{ $labels.node }} is under disk pressure. |
| VMRootfsFull | critical | shoot | Root filesystem device on instance {{ $labels.instance }} is almost full. |
| VMConntrackTableFull | critical | shoot | The nf_conntrack table is {{ $value }}% full. |
| VPNProbeAPIServerProxyFailed | critical | shoot | The API Server proxy functionality is not working. Probably the vpn connection from an API Server pod to the vpn-shoot endpoint on the Shoot workers does not work. |
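
The `{{ $labels.… }}` and `{{ $value }}` placeholders in the descriptions are Go template expressions. In Prometheus alerting rules they are filled in per firing alert with that alert's label set and the value of the alert expression. The sketch below is not how Prometheus does this internally. It only mimics the substitution with the standard `text/template` package so the final message is easier to picture. The `alertData` type, the `render` helper, and the sample values (`worker-1`, `93`) are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// alertData is a hypothetical stand-in for what one firing alert carries.
type alertData struct {
	Labels map[string]string
	Value  float64
}

// render expands one description from the table for a single alert instance.
func render(description string, labels map[string]string, value float64) (string, error) {
	// Bind $labels and $value first so the descriptions can be used verbatim.
	const prelude = "{{ $labels := .Labels }}{{ $value := .Value }}"

	tmpl, err := template.New("description").Parse(prelude + description)
	if err != nil {
		return "", err
	}

	var sb strings.Builder
	if err := tmpl.Execute(&sb, alertData{Labels: labels, Value: value}); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	// Sample label and value are made up for illustration.
	msg, err := render(
		"Node {{ $labels.node }} is under memory pressure.",
		map[string]string{"node": "worker-1"},
		0,
	)
	if err != nil {
		panic(err)
	}
	fmt.Println(msg) // prints "Node worker-1 is under memory pressure."

	msg, err = render("The nf_conntrack table is {{ $value }}% full.", nil, 93)
	if err != nil {
		panic(err)
	}
	fmt.Println(msg) // prints "The nf_conntrack table is 93% full."
}
```

The label names themselves (`instance`, `node`, `pod`, `kubernetes_io_hostname`) come from the metrics behind each alert, so which placeholders are available differs from row to row.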