Monitor and Troubleshoot

1 - Analyzing Node Removal and Failures

Utilize Gardener’s Monitoring and Logging to analyze removal and failures of nodes

Overview

Sometimes operators want to find out why a certain node got removed. This guide helps to identify possible causes. There are a few potential reasons why nodes can be removed:

- broken node: a node becomes unhealthy and machine-controller-manager terminates it in an attempt to replace the unhealthy node with a new one
- scale-down: cluster-autoscaler sees that a node is under-utilized and therefore scales down a worker pool
- node rolling: configuration changes to a worker pool (or cluster) require all nodes of one or all worker pools to be rolled and thus all nodes to be replaced. Some possible changes are:
  - the K8s/OS version
  - changing machine types

Helpful information can be obtained by using the logging stack. See Logging Stack for how to utilize the logging information in Gardener.

Find Out Whether the Node Was `unhealthy`

Check the Node Events

The node’s events are a good first indication of what happened to it. Events are scraped and ingested into the logging system, so they can be found in the Explore tab of Grafana (make sure to select loki as the data source) with a query like {job="event-logging"} | unpack | object="Node/<node-name>". To find any event that mentions the node in question, use a broader query like {job="event-logging"}|="<node-name>".

A potential result might reveal:

{"_entry":"Node ip-10-55-138-185.eu-central-1.compute.internal status is now: NodeNotReady","count":1,"firstTimestamp":"2023-04-05T12:02:08Z","lastTimestamp":"2023-04-05T12:02:08Z","namespace":"default","object":"Node/ip-10-55-138-185.eu-central-1.compute.internal","origin":"shoot","reason":"NodeNotReady","source":"node-controller","type":"Normal"}
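When several nodes have to be checked, it can help to generate both queries from the node name. The following is a minimal sketch: node_event_query and node_mention_query are hypothetical helper names, not part of Gardener, and the printed queries are meant to be pasted into the Grafana Explore tab.

```shell
#!/bin/sh
# Print the narrow LogQL query that matches events of one Node object.
node_event_query() {
  printf '{job="event-logging"} | unpack | object="Node/%s"\n' "$1"
}

# Print the broader LogQL query that matches any event mentioning the node.
node_mention_query() {
  printf '{job="event-logging"}|="%s"\n' "$1"
}

# Example: print both queries for the node from the sample result above.
node_event_query "ip-10-55-138-185.eu-central-1.compute.internal"
node_mention_query "ip-10-55-138-185.eu-central-1.compute.internal"
```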

Check machine-controller-manager Logs

If a node became unhealthy, its last conditions can be found in the logs of the machine-controller-manager by using a query like {pod_name=~"machine-controller-manager.*"}|="<node-name>".

Caveat: every node resource is backed by a corresponding machine resource managed by machine-controller-manager. Usually, corresponding node and machine resources have the same name; AWS is the exception. On AWS, first use the above query to find the corresponding machine name, typically via a log entry like this:

2023-04-05 12:02:08 {"log":"Conditions of Machine \"shoot--demo--cluster-pool-z1-6dffc-jh4z4\" with providerID \"aws:///eu-central-1/i-0a6ad1ca4c2e615dc\" and backing node \"ip-10-55-138-185.eu-central-1.compute.internal\" are changing","pid":"1","severity":"INFO","source":"machine_util.go:629"}

This reveals that node ip-10-55-138-185.eu-central-1.compute.internal is backed by machine shoot--demo--cluster-pool-z1-6dffc-jh4z4. On infrastructures other than AWS you can omit this step.
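On AWS, this lookup can be scripted once the log lines have been exported from Grafana. The following sketch assumes the log format shown above; machine_for_node is a hypothetical helper name, and the export step itself is not shown.

```shell
#!/bin/sh
# machine_for_node NODE
# Reads machine-controller-manager log lines on stdin and prints the name of
# every machine that is logged as backing NODE (one per line, de-duplicated).
# Steps: drop the backslashes of the JSON-escaped quotes, keep only lines that
# name NODE as the backing node, then extract the quoted machine name.
machine_for_node() {
  node=$1
  tr -d '\\' |
    grep -F "backing node \"$node\"" |
    sed -n 's/.*Conditions of Machine "\([^"]*\)".*/\1/p' |
    sort -u
}

# Usage (mcm.log is a hypothetical file holding the exported log lines):
#   machine_for_node ip-10-55-138-185.eu-central-1.compute.internal < mcm.log
```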

With the machine name at hand, now search for log entries with {pod_name=~"machine-controller-manager.*"}|="<machine-name>". In case the node had failing conditions, you’d find logs like this:

2023-04-05 12:02:08 {"log":"Machine shoot--demo--cluster-pool-z1-6dffc-jh4z4 is unhealthy - changing MachineState to Unknown. Node conditions: [{Type:ClusterNetworkProblem Status:False LastHeartbeatTime:2023-04-05 11:58:39 +0000 UTC LastTransitionTime:2023-03-23 11:59:29 +0000 UTC Reason:NoNetworkProblems Message:no cluster network problems} ... {Type:Ready Status:Unknown LastHeartbeatTime:2023-04-05 11:55:27 +0000 UTC LastTransitionTime:2023-04-05 12:02:07 +0000 UTC Reason:NodeStatusUnknown Message:Kubelet stopped posting node status.}]","pid":"1","severity":"WARN","source":"machine_util.go:637"}

In the example above, the reason for an unhealthy node was that kubelet failed to renew its heartbeat. Typical reasons would be either a broken VM (that couldn’t execute kubelet anymore) or a broken network. Note that some VM terminations performed by the infrastructure provider are actually expected (e.g., scheduled events on AWS).

In both cases, the infrastructure provider might be able to provide more information on particular VM or network failures.

Whatever the failure condition might have been, if a node gets unhealthy, it will be terminated by machine-controller-manager after the machineHealthTimeout has elapsed (this parameter can be configured in your shoot spec).

Check the Node Logs

For each node, the kernel and kubelet logs, as well as a few others, are scraped and can be queried with {nodename="<node-name>"}. This might reveal OS-specific issues or, in the absence of any logs (e.g., after the node went unhealthy), might indicate a network disruption or sudden VM termination. Note that some VM terminations performed by the infrastructure provider are actually expected (e.g., scheduled events on AWS).

Infrastructure providers might be able to provide more information on particular VM failures in such cases.

Check the Network Problem Detector Dashboard

If your Gardener installation utilizes gardener-extension-shoot-networking-problemdetector, you can check the dashboard named “Network Problem Detector” in Grafana for hints on network issues on the node of interest.

Scale-Down

In general, scale-downs are managed by the cluster-autoscaler. Its logs can be found with the query {container_name="cluster-autoscaler"}. Attempts to remove a node can be found with the query {container_name="cluster-autoscaler"}|="Scale-down: removing empty node".

If a scale-down has caused disruptions in your workload, consider protecting your workload by adding PodDisruptionBudgets (see the autoscaler FAQ for more options).

Node Rolling

Node rolling can be caused by, e.g.:

change of the K8s minor version of the cluster or a worker pool
change of the OS version of the cluster or a worker pool
change of the disk size/type or machine size/type of a worker pool
change of node labels

Changes like the above are done by altering the shoot specification and thus are recorded in the external auditlog system that is configured for the garden cluster.

2 - Get a Shell to a Gardener Shoot Worker Node

Describes the methods for getting shell access to worker nodes

Overview

To troubleshoot certain problems in a Kubernetes cluster, operators need access to the host of the Kubernetes node. This can be required if a node misbehaves or fails to join the cluster in the first place.

With access to the host, it is for instance possible to check the kubelet logs and interact with common tools such as systemctl and journalctl.

The first section of this guide explores options to get a shell to the node of a Gardener Kubernetes cluster. The options described in the second section do not rely on Kubernetes capabilities to get shell access to a node and thus can also be used if an instance failed to join the cluster.

This guide only covers how to get access to the host, but does not cover troubleshooting methods.

Get a Shell to an Operational Cluster Node

The following describes three different approaches to get a shell to an operational Shoot worker node. As a prerequisite to troubleshooting a Kubernetes node, the node must have joined the cluster successfully and be able to run a pod. All of the described approaches involve scheduling a pod with root permissions and mounting the root filesystem.

Gardener Dashboard

Prerequisite: the terminal feature is configured for the Gardener dashboard.

Navigate to the cluster overview page and find the Terminal in the Access tile.

Select the target Cluster (Garden, Seed / Control Plane, Shoot cluster) depending on the requirements and access rights (only certain users have access to the Seed Control Plane).

To open the terminal configuration, interact with the top right-hand corner of the screen.

Set the Terminal Runtime to “Privileged”. Also, specify the target node from the drop-down menu.

Result

The Dashboard then schedules a pod and opens a shell session to the node.

To get access to the common binaries installed on the host, prefix the command with chroot /hostroot. Note that the path depends on where the root path is mounted in the container. In the default image used by the Dashboard, it is under /hostroot.

Gardener Ops Toolbelt

Prerequisite: kubectl is available.

The Gardener ops-toolbelt can be used as a convenient way to deploy a root pod to a node. The pod uses an image that is bundled with a bunch of useful troubleshooting tools. This is also the same image that is used by default when using the Gardener Dashboard terminal feature as described in the previous section.

The easiest way to use the Gardener ops-toolbelt is to execute the ops-pod script in the hacks folder. To get root shell access to a node, execute the aforementioned script by supplying the target node name as an argument:

<path-to-ops-toolbelt-repo>/hacks/ops-pod <target-node>

Custom Root Pod

Alternatively, a pod can be assigned to a target node and a shell can be opened via standard Kubernetes means. To enable root access to the node, the pod specification requires proper securityContext and volume properties.

For instance, you can use the following pod manifest, after replacing <target-node-name> with the name of the node you want this pod attached to:

apiVersion: v1
kind: Pod
metadata:
  name: privileged-pod
  namespace: default
spec:
  nodeSelector:
    kubernetes.io/hostname: <target-node-name>
  containers:
  - name: busybox
    image: busybox
    stdin: true
    securityContext:
      privileged: true
    volumeMounts:
    - name: host-root-volume
      mountPath: /host
      readOnly: true
  volumes:
  - name: host-root-volume
    hostPath:
      path: /
  hostNetwork: true
  hostPID: true
  restartPolicy: Never
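To avoid editing the manifest by hand, the node name can be filled in by a small script. The following is a minimal sketch: render_root_pod is a hypothetical helper name, and the manifest it prints is the one shown above.

```shell
#!/bin/sh
# render_root_pod NODE
# Prints the privileged pod manifest with NODE set as the nodeSelector target.
render_root_pod() {
  node=$1
  # The heredoc delimiter is unquoted so that $node is expanded.
  cat <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: privileged-pod
  namespace: default
spec:
  nodeSelector:
    kubernetes.io/hostname: $node
  containers:
  - name: busybox
    image: busybox
    stdin: true
    securityContext:
      privileged: true
    volumeMounts:
    - name: host-root-volume
      mountPath: /host
      readOnly: true
  volumes:
  - name: host-root-volume
    hostPath:
      path: /
  hostNetwork: true
  hostPID: true
  restartPolicy: Never
EOF
}
```

Assuming kubectl targets the Shoot cluster, the output could be applied with render_root_pod <target-node-name> | kubectl apply -f -, and a shell could then be opened with kubectl exec -ti privileged-pod -- sh. The host filesystem is available under /host.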

SSH Access to a Node That Failed to Join the Cluster

This section explores two options that can be used to get SSH access to a node that failed to join the cluster. As it is not possible to schedule a pod on the node, the Kubernetes-based methods explored so far cannot be used in this scenario.

Additionally, Gardener typically provisions worker instances in a private subnet of the VPC, hence there is no public IP address that could be used for direct SSH access.

For this scenario, cloud providers typically have extensive documentation (e.g., AWS and GCP) and in some cases tooling support. However, these approaches are mostly cloud provider specific, require interaction via their CLI and API, or sometimes the installation of a cloud provider specific agent on the node.

Alternatively, gardenctl can be used. It provides cloud provider agnostic, out-of-the-box support for getting ssh access to an instance in a private subnet. Currently gardenctl supports AWS, GCP, OpenStack, Azure and Alibaba Cloud.

Identifying the Problematic Instance

First, the problematic instance has to be identified. In Gardener, worker pools can be created in different cloud provider regions, zones, and accounts.

The instance would typically show up as successfully started / running in the cloud provider dashboard or API and it is not immediately obvious which one has a problem. Instead, we can use the Gardener API / CRDs to obtain the faulty instance identifier in a cloud-agnostic way.

Gardener uses the Machine Controller Manager to create the Shoot worker nodes. For each worker node, the Machine Controller Manager creates a Machine CRD in the Shoot namespace in the respective Seed cluster. Usually the problematic instance can be identified, as the respective Machine CRD has status pending.

The instance / node name can be obtained from the Machine .status field:

kubectl get machine <machine-name> -o json | jq -r .status.node

This is all the information needed to go ahead and use gardenctl ssh to get a shell to the node. In addition, the used cloud provider, the specific identifier of the instance, and the instance region can be identified from the Machine CRD.

Get the identifier of the instance via:

kubectl get machine <machine-name> -o json | jq -r .spec.providerID # e.g., aws:///eu-north-1/i-069733c435bdb4640

The identifier shows that the instance belongs to the cloud provider aws with the ec2 instance-id i-069733c435bdb4640 in region eu-north-1.
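The identifier can also be split into its parts in a script. The following sketch assumes the AWS-style format shown above (<provider>:///<region>/<instance-id>); other providers may use a different layout, and parse_provider_id is a hypothetical helper name.

```shell
#!/bin/sh
# parse_provider_id ID
# Splits an AWS-style providerID into provider, region and instance id using
# POSIX parameter expansion only.
parse_provider_id() {
  id=$1
  provider=${id%%:*}     # text before the first ':'
  rest=${id#*:///}       # text after ':///', i.e. <region>/<instance-id>
  region=${rest%%/*}     # text before the first '/'
  instance=${rest##*/}   # text after the last '/'
  printf 'provider=%s region=%s instance=%s\n' "$provider" "$region" "$instance"
}

# Example with the identifier from above.
parse_provider_id "aws:///eu-north-1/i-069733c435bdb4640"
```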

To get more information about the instance, check out the MachineClass (e.g., AWSMachineClass) that is associated with each Machine CRD in the Shoot namespace of the Seed cluster.

The AWSMachineClass contains the machine image (ami), machine-type, iam information, network-interfaces, subnets, security groups and attached volumes.

Of course, the information can also be used to get the instance with the cloud provider CLI / API.

gardenctl ssh

Using the node name of the problematic instance, we can use the gardenctl ssh command to get SSH access to the cloud provider instance via an automatically set up bastion host. gardenctl takes care of spinning up the bastion instance, setting up the SSH keys, ports and security groups and opens a root shell on the target instance. After the SSH session has ended, gardenctl deletes the created cloud provider resources.

Use the following commands:

First, target a Garden cluster containing all the Shoot definitions.

gardenctl target garden <target-garden>

Target an available Shoot by name. This sets up the context, configures the kubeconfig file of the Shoot cluster and downloads the cloud provider credentials. Subsequent commands will execute in this context.

gardenctl target shoot <target-shoot>

This uses the cloud provider credentials to spin up the bastion and to open a shell on the target instance.

gardenctl ssh <target-node>

SSH with a Manually Created Bastion on AWS

In case you are not using gardenctl or want to control the bastion instance yourself, you can also manually set it up. The steps described here are generally the same as those used by gardenctl internally. Despite some cloud provider specifics, they can be generalized to the following list:

Open port 22 on the target instance.
Create an instance / VM in a public subnet (the bastion instance needs to have a public IP address).
Set up security groups and roles, and open port 22 for the bastion instance.

The following diagram shows an overview of how the SSH access to the target instance works:

This guide demonstrates the setup of a bastion on AWS.

Prerequisites:

The AWS CLI is set up.
Obtain target instance-id (see Identifying the Problematic Instance).
Obtain the VPC ID the Shoot resources are created in. This can be found in the Infrastructure CRD in the Shoot namespace in the Seed.
Make sure that port 22 on the target instance is open (default for Gardener deployed instances).
- Extract security group via:
```
aws ec2 describe-instances --instance-ids <instance-id>
```
- Check for rule that allows inbound connections on port 22:
```
aws ec2 describe-security-groups --group-ids=<security-group-id>
```
- If not available, create the rule with the following command:
```
aws ec2 authorize-security-group-ingress --group-id <security-group-id>  --protocol tcp --port 22 --cidr 0.0.0.0/0
```

Create the Bastion Security Group

The common name of the security group is <shoot-name>-bsg. Create the security group:

aws ec2 create-security-group --group-name <bastion-security-group-name>  --description ssh-access --vpc-id <VPC-ID>

Optionally, create identifying tags for the security group:

aws ec2 create-tags --resources <bastion-security-group-id> --tags Key=component,Value=<tag>

Create a permission in the bastion security group that allows ssh access on port 22:

aws ec2 authorize-security-group-ingress --group-id <bastion-security-group-id>  --protocol tcp --port 22 --cidr 0.0.0.0/0

Create an IAM role for the bastion instance with the name <shoot-name>-bastions:

aws iam create-role --role-name <shoot-name>-bastions

The permissions policy of the role should have the following content:

{
"Version": "2012-10-17",
"Statement": [
    {
        "Effect": "Allow",
        "Action": [
            "ec2:DescribeRegions"
        ],
        "Resource": [
            "*"
        ]
    }
]
}

Create the instance profile and name it <shoot-name>-bastions:

aws iam create-instance-profile --instance-profile-name <shoot-name>-bastions

Add the created role to the instance profile:

aws iam add-role-to-instance-profile --instance-profile-name <instance-profile-name> --role-name <role-name>

Create the Bastion Instance

Next, in order to be able to ssh into the bastion instance, the instance has to be set up with a user with a public ssh key. Create a user gardener that has the same Gardener-generated public ssh key as the target instance.

First, we need to get the public part of the Shoot ssh-key. The ssh-key is stored in a secret in the project namespace in the Garden cluster. The name is: <shoot-name>.ssh-keypair. Secret data is base64-encoded, so decode it when reading the key:

kubectl get secret <shoot-name>.ssh-keypair -o json | jq -r .data.\"id_rsa.pub\" | base64 -d

A script handed over as user-data to the bastion ec2 instance can be used to create the gardener user and add the ssh-key. For your convenience, you can use the following script to generate the user-data.

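#!/bin/bash -eu
# Writes gardener-bastion-userdata.sh, a user-data script that creates the
# gardener user on the bastion and authorizes the given public ssh key.
saveUserDataFile () {
  ssh_key=$1

# The heredoc delimiter is unquoted on purpose so that $ssh_key is expanded now.
cat > gardener-bastion-userdata.sh <<EOF
#!/bin/bash -eu
id gardener || useradd gardener -mU
mkdir -p /home/gardener/.ssh
echo "$ssh_key" > /home/gardener/.ssh/authorized_keys
chown gardener:gardener /home/gardener/.ssh/authorized_keys
echo "gardener ALL=(ALL) NOPASSWD:ALL" >/etc/sudoers.d/99-gardener-user
EOF
}

if [ -p /dev/stdin ]; then
    # The key was piped in: read it from stdin.
    input=$(cat)
else
    # No pipe: fall back to the clipboard (pbpaste is available on macOS only).
    input=$(pbpaste)
fi
saveUserDataFile "$input"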

Use the script by handing over the public ssh-key of the Shoot cluster:

kubectl get secret <shoot-name>.ssh-keypair -o json | jq -r .data.\"id_rsa.pub\" | base64 -d | ./generate-userdata.sh

This generates a file called gardener-bastion-userdata.sh in the same directory containing the user-data.

The following information is needed to create the bastion instance:

bastion-IAM-instance-profile-name - Use the created instance profile with the name <shoot-name>-bastions

image-id - It is possible to use the same image-id as the one used for the target instance (or any other image). Has cloud provider specific format (AWS: ami).

ssh-public-key-name

- This is the ssh key pair already created in the Shoot's cloud provider account by Gardener during the `Infrastructure` CRD reconciliation.
- The name is usually: `<shoot-name>-ssh-publickey`

subnet-id - Choose a subnet that is attached to an Internet Gateway and NAT Gateway (bastion instance must have a public IP).
- The Gardener created public subnet with the name <shoot-name>-public-utility-<xy> can be used. Please check the created subnets with the cloud provider.

bastion-security-group-id - Use the id of the created bastion security group.

file-path-to-userdata - Use the filepath to the user-data file generated in the previous step.

bastion-instance-name
- Optionally, you can tag the instance.
- Usually <shoot-name>-bastions

Create the bastion instance via:

aws ec2 run-instances --iam-instance-profile Name=<bastion-IAM-instance-profile-name> --image-id <image-id> --count 1 --instance-type t3.nano --key-name <ssh-public-key-name> --security-group-ids <bastion-security-group-id> --subnet-id <subnet-id> --associate-public-ip-address --user-data file://<file-path-to-userdata> --tag-specifications "ResourceType=instance,Tags=[{Key=Name,Value=<bastion-instance-name>},{Key=component,Value=<mytag>}]" "ResourceType=volume,Tags=[{Key=component,Value=<mytag>}]"

Capture the instance-id from the response and wait until the ec2 instance is running and has a public IP address.

Connecting to the Target Instance

Save the private key of the ssh-key-pair in a temporary local file for later use:

umask 077

kubectl get secret <shoot-name>.ssh-keypair -o json | jq -r .data.\"id_rsa\" | base64 -d > id_rsa.key

Use the private ssh key to ssh into the bastion instance:

ssh -i <path-to-private-key> gardener@<public-bastion-instance-ip>

If that works, connect from your local terminal to the target instance via the bastion:

ssh -i <path-to-private-key> -o IdentitiesOnly=yes -o StrictHostKeyChecking=no -o ProxyCommand="ssh -W %h:%p -i <path-to-private-key> -o IdentitiesOnly=yes -o StrictHostKeyChecking=no gardener@<public-bastion-instance-ip>" gardener@<private-ip-target-instance>

Cleanup

Do not forget to clean up the created resources. Otherwise, Gardener will eventually fail to delete the Shoot.

3 - How to Debug a Pod

Your pod doesn’t run as expected. Are there any log files? Where? How could I debug a pod?

Introduction

Kubernetes offers powerful options to get more details about startup or runtime failures of pods, as described, e.g., in Application Introspection and Debugging or Debug Pods and Replication Controllers.

In order to identify pods with potential issues, you could, e.g., run kubectl get pods --all-namespaces | grep -iv Running to list only the pods that are not in the state Running. A frequent error state is CrashLoopBackOff, which indicates that a pod crashes right after it starts. Kubernetes then tries to restart the pod, but often the startup fails again.
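A plain grep hides pods that are Running but restart frequently. The following sketch filters the output of kubectl get pods --all-namespaces on both criteria. It assumes the default column layout (NAMESPACE, NAME, READY, STATUS, RESTARTS, AGE); flag_suspicious_pods is a hypothetical helper name, and the default threshold of 5 restarts is an arbitrary choice.

```shell
#!/bin/sh
# flag_suspicious_pods [MAX_RESTARTS]
# Reads `kubectl get pods --all-namespaces` output on stdin and prints every
# pod that is not in state Running or has at least MAX_RESTARTS restarts.
# The header line is skipped. Pods in state Completed are listed as well.
flag_suspicious_pods() {
  max_restarts=${1:-5}
  awk -v max="$max_restarts" '
    NR > 1 && ($4 != "Running" || $5 + 0 >= max) {
      printf "%s/%s %s restarts=%s\n", $1, $2, $4, $5
    }'
}

# Usage:
#   kubectl get pods --all-namespaces | flag_suspicious_pods 5
```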

Here is a short list of possible reasons which might lead to a pod crash:

Error during image pull caused by e.g. wrong/missing secrets or wrong/missing image
The app runs in an error state caused e.g. by missing environmental variables (ConfigMaps) or secrets
Liveness probe failed
Too high resource consumption (memory and/or CPU) or too strict quota settings
Persistent volumes can’t be created/mounted
The container image is not updated

Basically, the commands kubectl logs ... and kubectl describe ... with different parameters are used to get more detailed information. By calling e.g. kubectl logs --help you can get more detailed information about the command and its parameters.

In the next sections, you’ll find some basic approaches for finding out what went wrong.

Remarks:

Even if the pods seem to be running, as the status Running indicates, a high counter of the Restarts shows potential problems
You can get a good overview of the troubleshooting process with the interactive tutorial Troubleshooting with Kubectl, which explains basic debugging activities
The examples below are deployed into the namespace default. In case you want to change it, use the optional parameter --namespace <your-namespace> to select the target namespace. The examples require a Kubernetes release ≥ 1.9.

Prerequisites

Your deployment was successful (no logical/syntactical errors in the manifest files), but the pod(s) aren’t running.

Error Caused by Wrong Image Name

Start by running kubectl describe pod <your-pod> --namespace <your-namespace> to get detailed information about the pod startup.

In the Events section, you should get an error message like Failed to pull image ... and Reason: Failed. The pod is in state ImagePullBackOff.

The example below is based on a demo in the Kubernetes documentation. In all examples, the default namespace is used.

First, perform a cleanup with:

kubectl delete pod termination-demo

Next, create a resource based on the yaml content below:

apiVersion: v1
kind: Pod 
metadata:
  name: termination-demo
spec:
  containers:
  - name: termination-demo-container
    image: debiann
    command: ["/bin/sh"]
    args: ["-c", "sleep 10 && echo Sleep expired > /dev/termination-log"]

kubectl describe pod termination-demo lists the following content in the Events section:

Events:
  FirstSeen	LastSeen	Count	From							SubObjectPath					Type		Reason			Message
  ---------	--------	-----	----							-------------					--------	------			-------
  2m		2m		1	default-scheduler											Normal		Scheduled		Successfully assigned termination-demo to ip-10-250-17-112.eu-west-1.compute.internal
  2m		2m		1	kubelet, ip-10-250-17-112.eu-west-1.compute.internal							Normal		SuccessfulMountVolume	MountVolume.SetUp succeeded for volume "default-token-sgccm" 
  2m		1m		4	kubelet, ip-10-250-17-112.eu-west-1.compute.internal	spec.containers{termination-demo-container}	Normal		Pulling			pulling image "debiann"
  2m		1m		4	kubelet, ip-10-250-17-112.eu-west-1.compute.internal	spec.containers{termination-demo-container}	Warning		Failed			Failed to pull image "debiann": rpc error: code = Unknown desc = Error: image library/debiann:latest not found
  2m		54s		10	kubelet, ip-10-250-17-112.eu-west-1.compute.internal							Warning		FailedSync		Error syncing pod
  2m		54s		6	kubelet, ip-10-250-17-112.eu-west-1.compute.internal	spec.containers{termination-demo-container}	Normal		BackOff			Back-off pulling image "debiann"

The error message with Reason: Failed tells you that there is an error during pulling the image. A closer look at the image name indicates a misspelling.

The App Runs in an Error State Caused, e.g., by Missing Environmental Variables (ConfigMaps) or Secrets

This example illustrates the behavior in the case when the app expects environment variables but the corresponding Kubernetes artifacts are missing.

First, perform a cleanup with:

kubectl delete deployment termination-demo
kubectl delete configmaps app-env

Next, deploy the following manifest:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: termination-demo
  labels:
     app: termination-demo
spec:
  replicas: 1
  selector:
    matchLabels:
      app: termination-demo
  template:
    metadata:
      labels:
        app: termination-demo
    spec:
      containers:
      - name: termination-demo-container
        image: debian
        command: ["/bin/sh"]
        args: ["-c", "sed \"s/foo/bar/\" < $MYFILE"]

Now, the command kubectl get pods lists the pod termination-demo-xxx in the state Error or CrashLoopBackOff. The command kubectl describe pod termination-demo-xxx tells you that there is no error during startup but gives no clue about what caused the crash.

Events:
  FirstSeen	LastSeen	Count	From							SubObjectPath					Type		Reason		Message
  ---------	--------	-----	----							-------------					--------	------		-------
  19m		19m		1	default-scheduler											Normal		Scheduled	Successfully assigned termination-demo-5fb484867d-xz2x9 to ip-10-250-17-112.eu-west-1.compute.internal
  19m		19m		1	kubelet, ip-10-250-17-112.eu-west-1.compute.internal							Normal		SuccessfulMountVolume	MountVolume.SetUp succeeded for volume "default-token-sgccm" 
  19m		19m		4	kubelet, ip-10-250-17-112.eu-west-1.compute.internal	spec.containers{termination-demo-container}	Normal		Pulling		pulling image "debian"
  19m		19m		4	kubelet, ip-10-250-17-112.eu-west-1.compute.internal	spec.containers{termination-demo-container}	Normal		Pulled		Successfully pulled image "debian"
  19m		19m		4	kubelet, ip-10-250-17-112.eu-west-1.compute.internal	spec.containers{termination-demo-container}	Normal		Created		Created container
  19m		19m		4	kubelet, ip-10-250-17-112.eu-west-1.compute.internal	spec.containers{termination-demo-container}	Normal		Started		Started container
  19m		14m		24	kubelet, ip-10-250-17-112.eu-west-1.compute.internal	spec.containers{termination-demo-container}	Warning		BackOff		Back-off restarting failed container
  19m		4m		69	kubelet, ip-10-250-17-112.eu-west-1.compute.internal							Warning		FailedSync	Error syncing pod

The command kubectl logs termination-demo-xxx gives access to the output that the application writes to stderr and stdout. In this case, you should get an output similar to:

/bin/sh: 1: cannot open : No such file

So you need to have a closer look at the application. In this case, the environment variable MYFILE is missing. To fix this issue, you could, e.g., add a ConfigMap to your deployment as shown in the manifest listed below:

apiVersion: v1
kind: ConfigMap
metadata:
  name: app-env
data:
  MYFILE: "/etc/profile"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: termination-demo
  labels:
     app: termination-demo
spec:
  replicas: 1
  selector:
    matchLabels:
      app: termination-demo
  template:
    metadata:
      labels:
        app: termination-demo
    spec:
      containers:
      - name: termination-demo-container
        image: debian
        command: ["/bin/sh"]
        args: ["-c", "sed \"s/foo/bar/\" < $MYFILE"]
        envFrom:
        - configMapRef:
            name: app-env

Note that once you fix the error and re-run the scenario, you might still see the pod in a CrashLoopBackOff status. It is because the container finishes the command sed ... and runs to completion. In order to keep the container in a Running status, a long running task is required, e.g.:

apiVersion: v1
kind: ConfigMap
metadata:
  name: app-env
data:
  MYFILE: "/etc/profile"
  SLEEP: "5"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: termination-demo
  labels:
     app: termination-demo
spec:
  replicas: 1
  selector:
    matchLabels:
      app: termination-demo
  template:
    metadata:
      labels:
        app: termination-demo
    spec:
      containers:
      - name: termination-demo-container
        image: debian
        command: ["/bin/sh"]
        # args: ["-c", "sed \"s/foo/bar/\" < $MYFILE"]
        args: ["-c", "while true; do sleep $SLEEP; echo sleeping; done;"]
        envFrom:
        - configMapRef:
            name: app-env

Too High Resource Consumption (Memory and/or CPU) or Too Strict Quota Settings

You can optionally specify the amount of memory and/or CPU your container gets during runtime. In case these settings are missing, the default requests settings are taken: CPU: 0m (in Milli CPU) and RAM: 0Gi, which means there are no limits other than those of the node(s) itself. For more details, e.g. about how to configure limits, see Configure Default Memory Requests and Limits for a Namespace.

In case your application needs more resources, Kubernetes distinguishes between requests and limit settings: requests specify the guaranteed amount of resource, whereas limit tells Kubernetes the maximum amount of resource the container might need. Mathematically, both settings could be described by the relation 0 <= requests <= limit. For both settings you need to consider the total amount of resources your nodes provide. For a detailed description of the concept, see Resource Quality of Service in Kubernetes.
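Whether a CPU request can still be scheduled on a node comes down to simple arithmetic on the figures reported by kubectl describe nodes. The following sketch illustrates it with hypothetical helpers (cpu_to_milli, cpu_fits). It only handles whole cores (e.g., 2) and milli CPUs (e.g., 600m), not decimal values such as 0.5, and the allocatable CPU of a real node is usually lower than its core count.

```shell
#!/bin/sh
# cpu_to_milli QUANTITY
# Converts a CPU quantity given as whole cores ("2") or milli CPUs ("600m")
# into milli CPUs.
cpu_to_milli() {
  case $1 in
    *m) printf '%s\n' "${1%m}" ;;
    *)  printf '%s\n' "$(( $1 * 1000 ))" ;;
  esac
}

# cpu_fits REQUEST ALLOCATABLE ALREADY_REQUESTED
# Prints "yes" if REQUEST still fits on a node with ALLOCATABLE CPU of which
# ALREADY_REQUESTED is taken by other pods, otherwise "no".
cpu_fits() {
  req=$(cpu_to_milli "$1")
  alloc=$(cpu_to_milli "$2")
  used=$(cpu_to_milli "$3")
  if [ $(( used + req )) -le "$alloc" ]; then
    echo yes
  else
    echo no
  fi
}

# Example: 600m requested on a 2 core node with 1500m already requested.
cpu_fits 600m 2 1500m
```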

Use kubectl describe nodes to get a first overview of the resource consumption in your cluster. Of special interest are the figures indicating the amount of CPU and Memory Requests at the bottom of the output.

The next example demonstrates what happens when the CPU request is too high for your cluster to satisfy.

First, perform a cleanup with:

kubectl delete deployment termination-demo
kubectl delete configmaps app-env

Next, adapt the cpu value in the yaml below to be slightly higher than the remaining CPU resources in your cluster and deploy this manifest. In this example, 600m (milli CPUs) are requested in a Kubernetes system with a single 2 core worker node, which results in an error message.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: termination-demo
  labels:
     app: termination-demo
spec:
  replicas: 1
  selector:
    matchLabels:
      app: termination-demo
  template:
    metadata:
      labels:
        app: termination-demo
    spec:
      containers:
      - name: termination-demo-container
        image: debian
        command: ["/bin/sh"]
        args: ["-c", "sleep 10 && echo Sleep expired > /dev/termination-log"]
        resources:
          requests:
            cpu: "600m"

The command kubectl get pods lists the pod termination-demo-xxx in the state Pending. More details on why this happens can be found by using the command kubectl describe pod termination-demo-xxx:

$ kubectl describe po termination-demo-fdb7bb7d9-mzvfw
Name:           termination-demo-fdb7bb7d9-mzvfw
Namespace:      default
...
Containers:
  termination-demo-container:
    Image:      debian
    Port:       <none>
    Host Port:  <none>
    Command:
      /bin/sh
    Args:
      -c
      sleep 10 && echo Sleep expired > /dev/termination-log
    Requests:
      cpu:        6
    Environment:  <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-t549m (ro)
Conditions:
  Type           Status
  PodScheduled   False
Events:
  Type     Reason            Age               From               Message
  ----     ------            ----              ----               -------
  Warning  FailedScheduling  9s (x7 over 40s)  default-scheduler  0/2 nodes are available: 2 Insufficient cpu.

Remarks:

This example works similarly when specifying a too high request for memory
In case you configured an autoscaler range when creating your Kubernetes cluster, another worker node will be spun up automatically if you didn’t reach the maximum number of worker nodes
In case your app is running out of memory (the memory settings are too small), you will typically find an OOMKilled (Out Of Memory) message in the Events section of the kubectl describe pod ... output

The Container Image Is Not Updated

You applied a fix in your app, created a new container image and pushed it into your container repository. After redeploying your Kubernetes manifests, you expected to get the updated app, but the same bug is still present in the new deployment.

This behavior is related to how Kubernetes decides whether to pull a new docker image or to use the cached one.

In case you didn’t change the image tag, the default image policy IfNotPresent tells Kubernetes to use the cached image (see Images).

As a best practice, you should not use the tag latest, and you should change the image tag whenever you change anything in your image (see Configuration Best Practices).

For more information, see Container Image Not Updating.

4 - tail -f /var/log/my-application.log

Aggregate log files from different pods

Problem

One thing that always bothered me was that I couldn’t get logs of several pods at once with kubectl. A simple tail -f <path-to-logfile> isn’t possible at all. Certainly, you can use kubectl logs -f <pod-id>, but it doesn’t help if you want to monitor more than one pod at a time.

This is something you really need a lot, at least if you run several instances of a pod behind a deployment. This is even more so if you don’t have a Kibana or a similar setup.

Solution

Luckily, there are smart developers out there who always come up with solutions. The finding of the week is a small bash script that allows you to aggregate log files of several pods at the same time in a simple way. The script is called kubetail and is available on GitHub.
