Adding Support For a New Kubernetes Version
This document describes the steps required to confidently add support for a new Kubernetes minor version.
⚠️ Typically, once a minor Kubernetes version vX.Y is supported by Gardener, all patch versions vX.Y.Z are automatically supported as well, without any required action. This is because patch versions do not introduce any new features or API changes, so nothing needs to be adapted in the gardener/gardener code.
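The rule above (support is decided per minor version, patch versions follow automatically) can be illustrated with a minimal sketch. It is not taken from the Gardener codebase: the names SUPPORTED_MINORS, minor_of, and is_supported as well as the listed versions are hypothetical and only serve the illustration.

```python
import re

# Hypothetical set of minor versions for which support has been added.
SUPPORTED_MINORS = {"1.26", "1.27"}

# Accepts "1.27", "v1.27", "1.27.3", "v1.27.3".
_VERSION_RE = re.compile(r"^v?(\d+)\.(\d+)(?:\.(\d+))?$")


def minor_of(version: str) -> str:
    """Return the 'X.Y' minor line of a version such as 'v1.27.3'."""
    match = _VERSION_RE.match(version)
    if match is None:
        raise ValueError(f"not a Kubernetes version: {version!r}")
    major, minor, _patch = match.groups()
    return f"{major}.{minor}"


def is_supported(version: str, supported_minors=SUPPORTED_MINORS) -> bool:
    """A patch version counts as supported as soon as its minor line is."""
    return minor_of(version) in supported_minors
```

With this model, adding "1.28" to the set is the only change needed for every v1.28.Z patch version to be accepted.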
The Kubernetes community releases a new minor version roughly every 4 months. Please refer to the official documentation about their release cycles for any additional information.
Shortly before a new release, an “umbrella” issue should be opened which is used to collect the required adaptations and to track the work items.
For example, #5102 can be used as a template for the issue description.
As you can see, the task of supporting a new Kubernetes version also includes the provider extensions maintained in the gardener GitHub organization and is not restricted to gardener/gardener only.
Generally, the work items can be split into two groups: The first group contains tasks specific to the changes in the given Kubernetes release, the second group contains Kubernetes release-independent tasks.
ℹ️ Upgrading the k8s.io/* and sigs.k8s.io/controller-runtime Golang dependencies is typically tracked and worked on separately (see e.g. #4772 or #5282).
Deriving Release-Specific Tasks
Most new minor Kubernetes releases incorporate API changes, deprecations, or new features.
The community announces them via their change logs.
In order to derive the release-specific tasks, the respective change log for the new version vX.Y has to be read and understood (for example, the changelog for v1.24).
As already mentioned, typical changes to watch out for are:
- API version promotions or deprecations
- Feature gate promotions or deprecations
- CLI flag changes for Kubernetes components
- New default values in resources
- New available fields in resources
- New features potentially relevant for the Gardener system
- Changes of labels or annotations Gardener relies on
- …
Obviously, this requires a certain experience and understanding of the Gardener project so that all “relevant changes” can be identified.
While reading the change log, add the tasks (along with the respective PR in kubernetes/kubernetes) to the umbrella issue.
ℹ️ Some of the changes might be specific to certain cloud providers. Pay attention to those as well and add related tasks to the issue.
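As a rough aid for the first pass over a change log, a keyword filter can pre-sort entries into the categories listed above. The following is only a sketch under stated assumptions: the KEYWORDS table and the triage function are hypothetical, the keywords are guesses at common wording, and such a filter cannot replace reading the change log with an understanding of Gardener.

```python
# Hypothetical keyword table; the categories follow the list of typical changes above.
KEYWORDS = {
    "feature gate": ("feature gate",),
    "cli flag": ("flag", "--"),
    "api change": ("apiversion", "api version", "v1beta", "v1alpha"),
    "deprecation": ("deprecat",),
    "default value": ("default",),
    "label/annotation": ("label", "annotation"),
}


def triage(changelog_lines):
    """Group change log lines by category; a line may land in several categories."""
    hits = {}
    for line in changelog_lines:
        lowered = line.lower()
        for category, needles in KEYWORDS.items():
            if any(needle in lowered for needle in needles):
                hits.setdefault(category, []).append(line.strip())
    return hits
```

Entries that match no keyword are silently skipped, so the result is a starting point for the umbrella issue and not a complete task list.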
List Of Release-Independent Tasks
The following paragraphs describe recurring tasks that need to be performed for each new release.
Make Sure a New hyperkube Image Is Released
The gardener/hyperkube repository is used to release container images consisting of the kubectl and kubelet binaries.
There is a CI/CD job that runs periodically and releases a new hyperkube image when there is a new Kubernetes release. Before proceeding with the next steps, make sure that a new hyperkube image has been released for the corresponding new Kubernetes minor version and that this container image is present in GCR.
Adapting Gardener
- Allow instantiation of a Kubernetes client for the new minor version and update the README.md.
- Maintain the Kubernetes feature gates used for validation of Shoot resources:
  - The feature gates are maintained in this file.
  - To maintain this list for new Kubernetes versions, run hack/compare-k8s-feature-gates.sh <old-version> <new-version> (e.g. hack/compare-k8s-feature-gates.sh v1.26 v1.27).
  - It will present 3 lists of feature gates: those added and those removed in <new-version> compared to <old-version>, and feature gates that got locked to default in <new-version>.
  - Add all added feature gates to the map with <new-version> as AddedInVersion and no RemovedInVersion.
  - For any removed feature gates, add <new-version> as RemovedInVersion to the already existing feature gate in the map.
  - For feature gates locked to default, add <new-version> as LockedToDefaultInVersion to the already existing feature gate in the map.
  - See this example commit.
- Maintain the Kubernetes kube-apiserver admission plugins used for validation of Shoot resources:
  - The admission plugins are maintained in this file.
  - To maintain this list for new Kubernetes versions, run hack/compare-k8s-admission-plugins.sh <old-version> <new-version> (e.g. hack/compare-k8s-admission-plugins.sh 1.26 1.27).
  - It will present 2 lists of admission plugins: those added and those removed in <new-version> compared to <old-version>.
  - Add all added admission plugins to the admissionPluginsVersionRanges map with <new-version> as AddedInVersion and no RemovedInVersion.
  - For any removed admission plugins, add <new-version> as RemovedInVersion to the already existing admission plugin in the map.
  - Flag any admission plugins that are required (plugins that must not be disabled in the Shoot spec) by setting the Required boolean variable to true for the admission plugin in the map.
  - Flag any admission plugins that are forbidden by setting the Forbidden boolean variable to true for the admission plugin in the map.
- Maintain the Kubernetes kube-apiserver API groups used for validation of Shoot resources:
  - The API groups are maintained in this file.
  - To maintain this list for new Kubernetes versions, run hack/compare-k8s-api-groups.sh <old-version> <new-version> (e.g. hack/compare-k8s-api-groups.sh 1.26 1.27).
  - It will present 2 lists of API GroupVersions and 2 lists of API GroupVersionResources: those added and those removed in <new-version> compared to <old-version>.
  - Add all added group versions to the apiGroupVersionRanges map and group version resources to the apiGVRVersionRanges map with <new-version> as AddedInVersion and no RemovedInVersion.
  - For any removed APIs, add <new-version> as RemovedInVersion to the already existing API in the corresponding map.
  - Flag any APIs that are required (APIs that must not be disabled in the Shoot spec) by setting the Required boolean variable to true for the API in the apiGVRVersionRanges map. If this API should also not be disabled for Workerless Shoots, then also set the RequiredForWorkerless boolean variable to true. If the API is required for both Shoot types, then both of these booleans need to be set to true. If the whole API Group is required, then mark it correspondingly in the apiGroupVersionRanges map.
- Maintain the Kubernetes kube-controller-manager controllers for each API group used in deploying required KCM controllers based on active APIs:
  - The API groups are maintained in this file.
  - To maintain this list for new Kubernetes versions, run hack/compute-k8s-controllers.sh <old-version> <new-version> (e.g. hack/compute-k8s-controllers.sh 1.28 1.29).
  - If it complains that the path for the controller is not present in the map, check the release branch of the new Kubernetes version and find the correct path for the missing/wrong controller. You can do so by checking the file cmd/kube-controller-manager/app/controllermanager.go and where the controller is initialized from. As of now, there is no straightforward way to map each controller to its file. If this has improved, please enhance the script.
  - If the paths are correct, it will present 2 lists of controllers: those added and those removed for each API group in <new-version> compared to <old-version>.
  - Add all added controllers to the APIGroupControllerMap map under the corresponding API group with <new-version> as AddedInVersion and no RemovedInVersion.
  - For any removed controllers, add <new-version> as RemovedInVersion to the already existing controller in the corresponding API group map. If you are unable to find the removed controller name, then check for its alias, either in the staging/src/k8s.io/cloud-provider/names/controller_names.go file (example) or in the cmd/kube-controller-manager/app/* files (example for apps API group). This is because for Kubernetes versions starting from v1.28, we don’t maintain the aliases in the controller, but the controller names themselves, since some controllers can be initialized without aliases as well (example). The old alias should still be working since it should be backwards compatible as explained here. Once the support for Kubernetes versions < v1.28 is dropped, we can drop the usages of these aliases and move completely to controller names.
  - Make sure that the API groups in this file are in sync with the groups in this file. For example, core/v1 is replaced by the script as v1 and apiserverinternal as internal. This is because the API groups registered by the apiserver (example) and the file path imported by the controllers (example) might be slightly different in some cases.
- Maintain the ServiceAccount names for the controllers part of kube-controller-manager:
  - The names are maintained in this file.
  - To maintain this list for new Kubernetes versions, run hack/compare-k8s-controllers.sh <old-version> <new-version> (e.g. hack/compare-k8s-controllers.sh 1.26 1.27).
  - It will present 2 lists of controllers: those added and those removed in <new-version> compared to <old-version>.
  - Double check whether such a ServiceAccount indeed appears in the kube-system namespace when creating a cluster with <new-version>. Note that it sometimes might be hidden behind a default-off feature gate. You can create a local cluster with the new version using the local provider. It can happen that the name of the controller is used in the form of a constant and not a string (see example). In that case, note the value of the constant separately. You could also cross-check the names with the result of the compute-k8s-controllers.sh script used in the previous step.
  - If it appears, add all added controllers to the list based on the Kubernetes version (example).
  - For any removed controllers, add them only to the Kubernetes version if it is low enough.
- Maintain the names of controllers used for workerless Shoots, here after carefully evaluating whether they are needed if there are no workers.
- Maintain copies of the DaemonSet controller’s scheduling logic:
  - gardener-resource-manager’s Node controller uses a copy of parts of the DaemonSet controller’s logic for determining whether a specific Node should run a daemon pod of a given DaemonSet: see this file.
  - Check the referenced upstream files for changes to the DaemonSet controller’s logic and adapt our copies accordingly. This might include introducing version-specific checks in our codebase to handle different shoot cluster versions.
- Maintain version-specific defaulting logic in the shoot admission plugin:
  - Sometimes default values for shoots are intentionally changed with the introduction of a new Kubernetes version.
  - The final Kubernetes version for a shoot is determined in the Shoot Validator Admission Plugin.
  - Any defaulting logic that depends on the version should be placed in this admission plugin (example).
- Ensure that maintenance-controller is able to auto-update shoots to the new Kubernetes version. Changes to the shoot spec required for the Kubernetes update should be enforced in such cases (examples).
- Add the new Kubernetes version to the CloudProfile in local setup.
  - See this example commit.
- In the next Gardener release, file a PR that bumps the used Kubernetes version for local e2e test.
  - This step must be performed in a PR that targets the next Gardener release because of the e2e upgrade tests. The e2e upgrade tests deploy the previous Gardener version where the new Kubernetes version is not present in the CloudProfile. If the e2e tests are adapted in the same PR that adds the support for the Kubernetes version, then the e2e upgrade tests for that PR will fail because the newly added Kubernetes version is missing in the local CloudProfile from the old release.
  - See this example commit PR.
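The bookkeeping described above for feature gates, admission plugins, API groups, and controllers follows one pattern: every entry carries the version in which it was added and, optionally, the version in which it was removed. The following sketch models that pattern. It is not the Gardener implementation: VersionRange, apply_diff, and the entry names in the usage note are hypothetical, and treating the added version as inclusive and the removed version as exclusive is an assumption made for this illustration.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Tuple


def parse_minor(version: str) -> Tuple[int, int]:
    """Turn '1.27', 'v1.27' or '1.27.3' into (1, 27); the patch part is ignored."""
    parts = version.lstrip("v").split(".")
    return int(parts[0]), int(parts[1])


@dataclass
class VersionRange:
    added_in: Optional[str] = None              # mirrors AddedInVersion
    removed_in: Optional[str] = None            # mirrors RemovedInVersion
    locked_to_default_in: Optional[str] = None  # mirrors LockedToDefaultInVersion

    def contains(self, version: str) -> bool:
        """Assumed semantics: added_in <= version < removed_in."""
        v = parse_minor(version)
        if self.added_in is not None and v < parse_minor(self.added_in):
            return False
        if self.removed_in is not None and v >= parse_minor(self.removed_in):
            return False
        return True


def apply_diff(
    ranges: Dict[str, VersionRange],
    new_version: str,
    added: Iterable[str] = (),
    removed: Iterable[str] = (),
    locked: Iterable[str] = (),
) -> Dict[str, VersionRange]:
    """Apply the added/removed/locked lists reported by a compare script."""
    for name in added:
        # New entries get new_version as added_in and no removed_in.
        ranges.setdefault(name, VersionRange(added_in=new_version))
    for name in removed:
        # Removed entries must already exist; a KeyError points at a gap in the map.
        ranges[name].removed_in = new_version
    for name in locked:
        ranges[name].locked_to_default_in = new_version
    return ranges
```

For example, calling apply_diff(ranges, "1.27", added=["ExampleNewGate"], removed=["ExampleOldGate"]) leaves ExampleOldGate valid for 1.26 shoots but no longer for 1.27 shoots, while ExampleNewGate becomes valid from 1.27 onwards.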
Filing the Pull Request
Work on all the tasks you have collected and validate them using the local provider. Execute the e2e tests and if everything looks good, then go ahead and file the PR (example PR). Generally, it is great if you add the PRs also to the umbrella issue so that they can be tracked more easily.
Adapting Provider Extensions
After the PR in gardener/gardener for the support of the new version has been merged, you can go ahead and work on the provider extensions.
Actually, you can already start even if the PR is not yet merged and use the branch of your fork.
- Update the github.com/gardener/gardener dependency in the extension and update the README.md.
- Work on release-specific tasks related to this provider.
Maintaining the cloud-controller-manager Images
Provider extensions are using upstream cloud-controller-manager images.
Make sure to adopt the new cloud-controller-manager release for the new Kubernetes minor version (example PR).
Some of the cloud providers are not using upstream cloud-controller-manager images for some of the supported Kubernetes versions.
Instead, we build and maintain the images ourselves.
Use the instructions below in case you need to maintain a release branch for such a cloud-controller-manager image:
Until we switch to upstream images, you need to update the Kubernetes dependencies and release a new image. The required steps are as follows:
- Check out the legacy-cloud-provider branch of the respective repository.
- Bump the versions in the Dockerfile (example commit).
- Update the VERSION to vX.Y.Z-dev where Z is the latest available Kubernetes patch version for the vX.Y minor version.
- Update the k8s.io/* dependencies in the go.mod file to vX.Y.Z and run go mod tidy (example commit).
- Check out a new release-vX.Y branch and release it (example).
While you are at it, it is great if you also bump the k8s.io/* dependencies for the last three minor releases. In this case, you need to check out the release-vX.{Y-{1,2,3}} branches and only perform the last three steps (example branch, example commit).
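The naming scheme in these steps (release-vX.Y plus the three preceding release-vX.{Y-{1,2,3}} branches, and a VERSION of the form vX.Y.Z-dev) can be spelled out with a small sketch. The helper names release_branches and dev_version are hypothetical and exist only for this illustration.

```python
from typing import List


def release_branches(minor: str, previous: int = 3) -> List[str]:
    """Return the branch for vX.Y followed by the `previous` minors before it."""
    major_str, minor_str = minor.lstrip("v").split(".")
    major, y = int(major_str), int(minor_str)
    lowest = max(y - previous, 0)  # never go below minor version 0
    return [f"release-v{major}.{n}" for n in range(y, lowest - 1, -1)]


def dev_version(latest_patch: str) -> str:
    """Build the VERSION value vX.Y.Z-dev from the latest patch version."""
    return f"v{latest_patch.lstrip('v')}-dev"
```

For vX.Y = v1.27, release_branches("v1.27") lists release-v1.27 first, followed by the three older branches that only need the dependency bump.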
Now you need to update the new releases in the imagevector/images.yaml of the respective provider extension so that they are used (see this example commit for reference).
Maintaining Additional Images
Provider extensions might also deploy additional images other than cloud-controller-manager that are specific to a given Kubernetes minor version.
Make sure to use a new image for the following components:
- The ecr-credential-provider image for the provider-aws extension.
  We are building the ecr-credential-provider image ourselves because the upstream community does not provide an OCI image for the corresponding component. For more details, see this upstream issue.
  Use the following steps to prepare a release of the ecr-credential-provider image for the new Kubernetes minor version:
  - Update the VERSION file in the gardener/ecr-credential-provider repository (example PR).
  - Once the PR is merged, trigger a new release from the CI/CD.
  - Update the
- The csi-driver-cinder and csi-driver-manila images for the provider-openstack extension.
  The upstream community is providing csi-driver-cinder and csi-driver-manila releases per Kubernetes minor version. Make sure to adopt the new csi-driver-cinder and csi-driver-manila releases for the new Kubernetes minor version (example PR).
Filing the Pull Request
Again, work on all the tasks you have collected. This time, you cannot use the local provider for validation but should create real clusters on the various infrastructures. Typically, the following validations should be performed:
- Create new clusters with versions < vX.Y
- Create new clusters with version = vX.Y
- Upgrade old clusters from version vX.{Y-1} to version vX.Y
- Delete clusters with versions < vX.Y
- Delete clusters with version = vX.Y
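The validations listed above can be written down as an explicit checklist per infrastructure. The sketch below only generates the list of steps and does not create any clusters. The function validation_plan and its tuple layout are hypothetical, and it assumes that older_minors is sorted in ascending order so that its last entry is vX.{Y-1}.

```python
from typing import List, Optional, Tuple

# (action, version, target version for upgrades or None)
Step = Tuple[str, str, Optional[str]]


def validation_plan(new_minor: str, older_minors: List[str]) -> List[Step]:
    """Enumerate the create, upgrade, and delete checks for one infrastructure."""
    if not older_minors:
        raise ValueError("at least one older minor version is needed for the upgrade check")
    previous = older_minors[-1]  # assumed to be vX.{Y-1}
    plan: List[Step] = []
    plan += [("create", version, None) for version in older_minors]
    plan.append(("create", new_minor, None))
    plan.append(("upgrade", previous, new_minor))
    plan += [("delete", version, None) for version in older_minors]
    plan.append(("delete", new_minor, None))
    return plan
```

Running the plan once per provider extension gives a simple record that can be pasted into the PR description.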
If everything looks good, then go ahead and file the PR (example PR). Generally, it is again great if you add the PRs also to the umbrella issue so that they can be tracked more easily.