7 minute read  

Shoot Cluster Maintenance

Day two operations for shoot clusters are related to:

  • The Kubernetes version of the control plane and the worker nodes
  • the operating system version of the worker nodes

The following table summarizes what options Gardener offers to maintain these versions:

Auto-UpdateForceful UpdatesManual updates
Kubernetes versionPatches onlyPatches and consecutive minor updates onlyyes
Operating system versionyesyesyes

Allowed Target Versions in the CloudProfile

Administrators maintain the allowed target versions that you can update to in the CloudProfile for each IaaS-Provider. Users with access to a Gardener project can check supported target versions with:

kubectl get cloudprofile [IAAS-SPECIFIC-PROFILE] -o yaml
PathDescriptionMore information
spec.kubernetes.versionsThe supported Kubernetes version major.minor.patch.Patch releases
spec.machineImagesThe supported operating system versions for worker nodes.

Both the Kubernetes version, and the operating system version follow semantic versioning that allows Gardener to handle updates automatically.

More information: Semantic Versioning.

Impact of Version Classifications on Updates

Gardener allows to classify versions in the CloudProfile as preview, supported, deprecated, or expired. During maintenance operations, preview versions are excluded from updates, because they’re often recently released versions that haven’t yet undergone thorough testing and may contain bugs or security issues.

More information: Version Classifications.

Let Gardener manage your updates

The Maintenance Window

Gardener can manage updates for you automatically. It offers users to specify a maintenance window during which updates are scheduled:

  • The time interval of the maintenance window can’t be less than 30 minutes or more than 6 hours.
  • If there’s no maintenance window specified during the creation of a shoot cluster, Gardener chooses a maintenance window randomly to spread the load.

You can either specify the maintenance window in the shoot cluster specification (.spec.maintenance.timeWindow) or the start time of the maintenance window using the Gardener dashboard (CLUSTERS > [YOUR-CLUSTER] > OVERVIEW > Lifecycle > Maintenance).

Auto-Update and Forceful Updates

To trigger updates during the maintenance window automatically, Gardener offers the following methods:

  • Auto-update:
    Gardener starts an update during the next maintenance window whenever there’s a version available in the CloudProfile that is higher than the one of your shoot cluster specification, and that isn’t classified as preview version. For Kubernetes versions, auto-update only updates to higher patch levels.

    You can either activate auto-update on the Gardener dashboard (CLUSTERS > [YOUR-CLUSTER] > OVERVIEW > Lifecycle > Maintenance) or in the shoot cluster specification:

    • .spec.maintenance.autoUpdate.kubernetesVersion: true
    • .spec.maintenance.autoUpdate.machineImageVersion: true
  • Forceful updates:
    In the maintenance window, Gardener compares the current version given in the shoot cluster specification with the version list in the CloudProfile. If the version has an expiration date and if the date is before the start of the maintenance window, Gardener starts an update to the highest version available in the CloudProfile that isn’t classified as preview version. The highest version in CloudProfile can’t have an expiration date. For Kubernetes versions, Gardener only updates to higher patch levels or consecutive minor versions.

If you don’t want to wait for the next maintenance window, you can annotate the shoot cluster specification with shoot.gardener.cloud/operation: maintain. Gardener then checks immediately if there’s an auto-update or a forceful update needed.

With expiration dates, administrators can give shoot cluster owners more time for testing before the actual version update happens, which allows smoother transitions to new versions.

Kubernetes Update Paths

The bigger the delta of the Kubernetes source version and the Kubernetes target version, the better it must be planned and executed by operators. Gardener only provides automatic support for updates that can be applied safely to the cluster workload:

Update TypeExampleUpdate method
Patches1.10.12 to 1.10.13auto-update or Forceful update
Update to consecutive minor version1.10.12 to 1.11.10Forceful update
Other1.10.12 to 1.12.0manual update

Gardener doesn’t support automatic updates of nonconsecutive minor versions, because Kubernetes doesn’t guarantee updateability in this case. However, multiple minor version updates are possible if not only the minor source version is expired, but also the minor target version is expired. Gardener then updates the Kubernetes version first to the expired target version, and waits for the next maintenance window to update this version to the next minor target version.

Manual Updates

To update the Kubernetes version or the node operating system manually, change the .spec.kubernetes.version field or the .spec.provider.workers.machine.image.version field correspondingly.

Manual updates are required if you would like to do a minor update of the Kubernetes version. Gardener doesn’t do such updates automatically as they can have breaking changes that could impact the cluster workload.

Manual updates are either executed immediately (default) or can be confined to the maintenance time window.
Choosing the latter option, causes changes to the cluster (for example, node pool rolling-updates) and the subsequent reconciliation, to only predictably happen during a defined time window (available since Gardener version 1.4).

More information: Confine Specification Changes/Update Roll Out.

Examples

In the examples for the CloudProfile and the shoot cluster specification, only the fields relevant for the example are shown.

Auto-Update of Kubernetes Version

Let’s assume Kubernetes version 1.10.5 and 1.11.0 were added in the following CloudProfile:

spec:
  kubernetes:
    versions:
    - version: 1.11.0
    - version: 1.10.5
    - version: 1.10.0

Before this change, the shoot cluster specification looked like this:

spec:
  kubernetes:
    version: 1.10.0
  maintenance:
    timeWindow:
      begin: 220000+0000
      end: 230000+0000
    autoUpdate:
      kubernetesVersion: true

As a consequence, the shoot cluster is updated to Kubernetes version 1.10.5 between 22:00-23:00 UTC. Your shoot cluster isn’t updated automatically to 1.11.0 even though it’s the highest Kubernetes version in the CloudProfile, because Gardener does only do automatic updates of the Kubernetes patch level.

Forceful Update Due to Expired Kubernetes Version

Let’s assume the following CloudProfile:

spec:
  kubernetes:
    versions:
    - version: 1.12.8
    - version: 1.11.10
    - version: 1.10.13
    - version: 1.10.12
      expirationDate: "2019-04-13T08:00:00Z"

Let’s assume the shoot cluster has the following specification:

spec:
  kubernetes:
    version: 1.10.12
  maintenance:
    timeWindow:
      begin: 220000+0100
      end: 230000+0100
    autoUpdate:
      kubernetesVersion: false

The shoot cluster specification refers a Kubernetes version that has an expirationDate. In the maintenance window on 2019-04-12, the Kubernetes version stays the same as it’s still not expired. But in the maintenance window on 2019-04-14 the Kubernetes version of the shoot cluster is updated to 1.10.13 (independently of the value of .spec.maintenance.autoUpdate.kubernetesVersion).

Forceful Update to New Minor Kubernetes Version

Let’s assume the following CloudProfile:

spec:
  kubernetes:
    versions:
    - version: 1.12.8
    - version: 1.11.10
    - version: 1.11.09
    - version: 1.10.12
      expirationDate: "2019-04-13T08:00:00Z"

Let’s assume the shoot cluster has the following specification:

spec:
  kubernetes:
    version: 1.10.12
  maintenance:
    timeWindow:
      begin: 220000+0100
      end: 230000+0100
    autoUpdate:
      kubernetesVersion: false

The shoot cluster specification refers a Kubernetes version that has an expirationDate. In the maintenance window on 2019-04-14, the Kubernetes version of the shoot cluster is updated to 1.11.10, which is the highest patch version of minor target version 1.11 that follows source version 1.10.

Automatic Update from Expired Machine Image Version

Let’s assume the following CloudProfile:

spec:
  machineImages:
  - name: coreos
    versions:
    - version: 2191.5.0
    - version: 2191.4.1
    - version: 2135.6.0
      expirationDate: "2019-04-13T08:00:00Z"

Let’s assume the shoot cluster has the following specification:

spec:
  provider:
    type: aws
    workers:
    - name: name
      maximum: 1
      minimum: 1
      maxSurge: 1
      maxUnavailable: 0
      image:
        name: coreos
        version: 2135.6.0
        type: m5.large
      volume:
        type: gp2
        size: 20Gi
  maintenance:
    timeWindow:
      begin: 220000+0100
      end: 230000+0100
    autoUpdate:
      machineImageVersion: false

The shoot cluster specification refers a machine image version that has an expirationDate. In the maintenance window on 2019-04-12, the machine image version stays the same as it’s still not expired. But in the maintenance window on 2019-04-14 the machine image version of the shoot cluster is updated to 2191.5.0 (independently of the value of .spec.maintenance.autoUpdate.machineImageVersion) as version 2135.6.0 is expired.