Checklist For Adding New Components
Adding new components that run in the garden, seed, or shoot cluster is theoretically quite simple - we just need a
Deployment (or other similar workload resource), the respective container image, and maybe a bit of configuration.
In practice, however, there are a couple of things to keep in mind in order to make the deployment production-ready.
This document provides a checklist for them that you can walk through.
Avoid usage of Helm charts (example)
Nowadays, we use Golang components instead of Helm charts for deploying components to a cluster. Please find a typical structure of such components in the provided metrics_server.go file (configuration values are typically managed in a Values structure). There are a few exceptions (e.g., Istio) still using charts; however, the default should be a Golang-based implementation. For the exceptional cases, use Golang’s embed package to embed the Helm chart directory (example 1, example 2).
For historic reasons, resources related to shoot control plane components are applied directly with the client. All other resources (seed or shoot system components) are deployed via
gardener-resource-manager’s Resource controller (
ManagedResources) since it performs health checks out-of-the-box and has a lot of other features (see its documentation for more information). Components that can run as either a seed system component or a shoot control plane component (e.g., VPA or kube-state-metrics) can make use of these utility functions.
Secrets should be immutable and have a unique name. This has a couple of benefits, e.g., the kubelet doesn’t watch these resources, and it is always clear which resource contains which data since it cannot be changed. As a consequence, unique/immutable Secrets are superior to checksum annotations on the pod templates. Stale/unused Secrets are garbage-collected by gardener-resource-manager’s GarbageCollector. There are utility functions (see examples above) for using unique Secrets in Golang components. It is essential to inject the annotations into the workload resource to make the garbage collection work.
Note that some
Secrets should not be unique (e.g., those containing monitoring or logging configuration). The reason is that the old revision stays in the cluster even if unused until the garbage-collector acts. During this time, they would be wrongly aggregated to the full configuration.
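One way to picture a unique name is a base name plus a short checksum of the data, so that any change to the data yields a new object instead of a modification. The following is a minimal sketch under that assumption; uniqueName is a hypothetical helper written for illustration, and Gardener's own utility functions may derive the suffix differently.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
)

// uniqueName appends a short checksum of the data to the base name, so that
// any change in the data results in a different object name.
// NOTE: hypothetical helper for illustration only; it is not Gardener's API.
func uniqueName(base string, data map[string][]byte) string {
	keys := make([]string, 0, len(data))
	for k := range data {
		keys = append(keys, k)
	}
	sort.Strings(keys) // map iteration order is random; sort for determinism

	h := sha256.New()
	for _, k := range keys {
		h.Write([]byte(k))
		h.Write([]byte{0}) // separator between key and value
		h.Write(data[k])
		h.Write([]byte{0}) // separator between entries
	}
	return base + "-" + hex.EncodeToString(h.Sum(nil))[:8]
}

func main() {
	v1 := uniqueName("my-config", map[string][]byte{"token": []byte("a")})
	v2 := uniqueName("my-config", map[string][]byte{"token": []byte("b")})
	// Different data leads to different names; the old object is left
	// untouched until it is garbage-collected.
	fmt.Println(v1, v2, v1 != v2)
}
```

Because the name changes with the content, the workload's pod template changes as well whenever it references the new name, which is what makes separate checksum annotations unnecessary.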
You should use the secrets manager for the management of any kind of credentials. This makes sure that credentials rotation works out-of-the-box without you having to think about it. Generally, do not use client certificates (see the Security section).
Consider hibernation when calculating replica count (example)
Shoot clusters can be hibernated, meaning that all control plane components in the shoot namespace in the seed cluster are scaled down to zero and all worker nodes are terminated. If your component runs in the seed cluster, you have to consider this case and provide the proper replica count. There is a utility function available (see example).
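The rule can be sketched as a small function. This is only an illustration: replicasFor and its parameters are hypothetical names, and the utility function mentioned above may have a different signature.

```go
package main

import "fmt"

// replicasFor returns the replica count for a control plane component.
// hibernated and desired are hypothetical inputs; in a real component they
// would come from the Shoot's state and the component's configuration.
func replicasFor(hibernated bool, desired int32) int32 {
	if hibernated {
		return 0 // hibernated clusters run no control plane pods
	}
	return desired
}

func main() {
	fmt.Println(replicasFor(false, 2))
	fmt.Println(replicasFor(true, 2))
}
```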
Define only the minimum set of dependency tasks needed in the shoot reconciliation/deletion flows.
Handle shoot system components
Shoot system components deployed by gardener-resource-manager are labelled with resource.gardener.cloud/managed-by: gardener. This makes Gardener add the required label selectors and tolerations so that non-DaemonSet Pods will exclusively run on selected nodes (for more information, see System Components Webhook).
DaemonSets, on the other hand, should generally tolerate any NoExecute taints so that they can run on any Node, regardless of user-added taints.
We define all image references centrally in the imagevector/images.yaml file. Hence, the image references must not be hard-coded in the pod template spec but read from this so-called image vector instead.
Registries such as ECR, GHCR (ghcr.io), and MCR (mcr.microsoft.com) don’t support pulling images over IPv6.
Check if the upstream image is also maintained in a registry that supports IPv6 natively, such as Artifact Registry or Quay (quay.io). If such an image exists, use the image from the registry with IPv6 support.
If the image is not available in a registry with IPv6 support, copy the image to the gardener GCR. There is a prow job copying images that are needed in gardener components from a source registry to the gardener GCR under the prefix eu.gcr.io/gardener-project/3rd/ (see the documentation or gardener/ci-infra#619).
If you want to use a new image from a registry without IPv6 support or upgrade an already used image to a newer tag, please open a PR to the ci-infra repository that modifies the job’s list of images to copy:
There is a strict rate-limit that applies to the Docker Hub registry. As described above for registries without IPv6 support, use another registry (if possible) or copy the image to the gardener GCR.
Do not use Shoot container images that are not multi-arch
Gardener supports Shoot clusters with both amd64 and arm64 based worker Nodes. amd64 container images cannot run on arm64 worker Nodes and vice versa.
Components that need to talk to the API server of their runtime cluster must always use a dedicated ServiceAccount (do not use default), with automountServiceAccountToken set to false. This makes gardener-resource-manager’s TokenInvalidator invalidate the static token secret and its ProjectedTokenMount webhook inject a projected token automatically.
Use shoot access tokens instead of client certificates (example)
For components that need to talk to a target cluster different from their runtime cluster (e.g., running in seed cluster but talking to shoot) the
gardener-resource-manager’s TokenRequestor should be used to manage a so-called “shoot access token”.
Define RBAC roles with minimal privileges (example)
The ServiceAccount (if it exists) should have as few privileges as possible. Consequently, please define proper RBAC roles for it. This might include a combination of ClusterRoles and Roles. Please do not provide elevated privileges due to laziness (e.g., because there is already a ClusterRole that can be extended vs. creating a Role only when access to a single namespace is needed).
Use NetworkPolicys to restrict network traffic
You should restrict both ingress and egress traffic to/from your component as much as possible to ensure that it only gets access to/from other components if really needed. Gardener provides a few default policies for typical usage scenarios. For more information, see
NetworkPolicys In Garden, Seed, Shoot Clusters.
Avoid running containers with
privileged=true. Instead, define the needed Linux capabilities.
Do not run containers as root (example)
Avoid running containers as root. Usually, components such as Kubernetes controllers and admission webhook servers don’t need root user capabilities to do their jobs.
The problem with running as root starts with how the container is first built. Unless a non-privileged user is configured in the Dockerfile, container build systems by default set up the container with the root user. Add a non-privileged user to your Dockerfile or use a base image with a non-root user (for example, the nonroot images from distroless).
If the image is an upstream one, then consider configuring a securityContext for the container/Pod with a non-privileged user. For more information, see Configure a Security Context for a Pod or Container.
For components deployed in the Seed cluster, the Seccomp profile will be defaulted to RuntimeDefault by gardener-resource-manager’s SeccompProfile webhook, which works well for the majority of components. However, in some special cases you might need to overwrite it.
gardener-resource-manager’s SeccompProfile webhook is not enabled for a Shoot cluster. For components deployed in the Shoot cluster, it is required [*] to explicitly specify the Seccomp profile.
[*] It is required because if a component deployed in the Shoot cluster does not specify a Seccomp profile and cannot run with the RuntimeDefault Seccomp profile, then enabling the .spec.kubernetes.kubelet.seccompDefault field in the Shoot spec would break the corresponding component.
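Setting the profile explicitly can be sketched as follows. The types below are simplified, hypothetical stand-ins for the Kubernetes pod security context types, and withExplicitSeccomp is an illustrative helper, not part of Gardener. The sketch assumes that a component which has already chosen a profile keeps it.

```go
package main

import "fmt"

// SeccompProfile and PodSecurityContext are simplified stand-ins for the
// corresponding Kubernetes API types (hypothetical, for illustration only).
type SeccompProfile struct{ Type string }

type PodSecurityContext struct{ SeccompProfile *SeccompProfile }

const runtimeDefault = "RuntimeDefault"

// withExplicitSeccomp returns a security context that always names a
// Seccomp profile. An already configured profile is left untouched, so
// components that cannot run with RuntimeDefault keep their own choice.
func withExplicitSeccomp(sc *PodSecurityContext) *PodSecurityContext {
	if sc == nil {
		sc = &PodSecurityContext{}
	}
	if sc.SeccompProfile == nil {
		sc.SeccompProfile = &SeccompProfile{Type: runtimeDefault}
	}
	return sc
}

func main() {
	fmt.Println(withExplicitSeccomp(nil).SeccompProfile.Type)
	custom := &PodSecurityContext{SeccompProfile: &SeccompProfile{Type: "Unconfined"}}
	fmt.Println(withExplicitSeccomp(custom).SeccompProfile.Type)
}
```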
PodSecurityPolicys are deprecated; however, Gardener still supports shoot clusters with older Kubernetes versions (ref). To make sure that such clusters can run with
.spec.kubernetes.allowPrivilegedContainers=false, you have to define proper
PodSecurityPolicys. For more information, see Pod Security.
High Availability / Stability
Specify the component type label for high availability (example)
To support high-availability deployments, gardener-resource-manager’s HighAvailabilityConfig webhook injects the proper specification, such as replica counts or topology spread constraints. You only need to specify the type label. For more information, see High Availability Of Deployed Components.
Closely related to high availability but also to stability in general: The definition of a PodDisruptionBudget with maxUnavailable=1 should be provided by default.
Choose the right
Consider defining liveness and readiness probes (example)
To ensure smooth rolling update behaviour, consider the definition of liveness and/or readiness probes.
Mark node-critical components (example)
To ensure user workload pods are only scheduled to Nodes where all node-critical components are ready, these components need to tolerate the corresponding taint (NoSchedule effect). Also, such DaemonSets and the included PodTemplates need to be labelled with node.gardener.cloud/critical-component=true. For more information, see Readiness of Shoot Worker Nodes.
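A quick self-check for the labelling requirement could look like the sketch below. isMarkedCritical is a hypothetical helper for illustration; only the label key and value are taken from the text above, and the toleration is not covered here.

```go
package main

import "fmt"

// criticalLabel is the label key named in the checklist.
const criticalLabel = "node.gardener.cloud/critical-component"

// isMarkedCritical reports whether both the DaemonSet itself and its pod
// template carry the node-critical label (hypothetical helper).
func isMarkedCritical(daemonSetLabels, podTemplateLabels map[string]string) bool {
	return daemonSetLabels[criticalLabel] == "true" &&
		podTemplateLabels[criticalLabel] == "true"
}

func main() {
	labels := map[string]string{criticalLabel: "true"}
	fmt.Println(isMarkedCritical(labels, labels))
	// Labelling only the DaemonSet is not enough.
	fmt.Println(isMarkedCritical(labels, map[string]string{}))
}
```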
Consider making a Service topology-aware
To reduce costs and to improve the network traffic latency in multi-zone Seed clusters, consider making a Service topology-aware, if applicable. In short, when a Service is topology-aware, Kubernetes routes network traffic to the Endpoints (Pods) which are located in the same zone where the traffic originated from. In this way, cross-availability-zone traffic is avoided. See Topology-Aware Traffic Routing.
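The routing preference can be pictured with a small model. This is a deliberately simplified sketch, not how Kubernetes implements it: Endpoint and endpointsForZone are hypothetical, and the fallback to all endpoints when the zone has none is an assumption made for the example.

```go
package main

import "fmt"

// Endpoint is a simplified, hypothetical view of a Service backend.
type Endpoint struct {
	Address string
	Zone    string
}

// endpointsForZone prefers backends in the zone where the traffic
// originates. If that zone has no backend, it falls back to all endpoints
// (an assumption of this sketch) so that traffic is never dropped.
func endpointsForZone(all []Endpoint, originZone string) []Endpoint {
	var sameZone []Endpoint
	for _, e := range all {
		if e.Zone == originZone {
			sameZone = append(sameZone, e)
		}
	}
	if len(sameZone) == 0 {
		return all
	}
	return sameZone
}

func main() {
	all := []Endpoint{
		{Address: "10.0.0.1", Zone: "zone-a"},
		{Address: "10.0.1.1", Zone: "zone-b"},
	}
	fmt.Println(endpointsForZone(all, "zone-a")) // only the zone-a backend
	fmt.Println(endpointsForZone(all, "zone-c")) // no local backend: all endpoints
}
```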
Provide resource requirements (example)
All components should have resource requirements. Generally, they should always request CPU and memory, while only memory shall be limited (no CPU limits!).
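The rule above can be expressed as a small check. Resources and checkResources are hypothetical names for this sketch, and quantities are kept as plain strings rather than Kubernetes quantity types.

```go
package main

import "fmt"

// Resources is a simplified, hypothetical stand-in for a container's
// resource requirements; quantities are kept as plain strings.
type Resources struct {
	Requests map[string]string
	Limits   map[string]string
}

// checkResources returns a list of violations of the rule "request CPU and
// memory, never limit CPU". A memory limit is allowed but not enforced here.
func checkResources(r Resources) []string {
	var problems []string
	for _, name := range []string{"cpu", "memory"} {
		if _, ok := r.Requests[name]; !ok {
			problems = append(problems, "missing request for "+name)
		}
	}
	if _, ok := r.Limits["cpu"]; ok {
		problems = append(problems, "cpu limit must not be set")
	}
	return problems
}

func main() {
	good := Resources{
		Requests: map[string]string{"cpu": "50m", "memory": "64Mi"},
		Limits:   map[string]string{"memory": "256Mi"},
	}
	fmt.Println(len(checkResources(good)) == 0)

	bad := Resources{
		Requests: map[string]string{"memory": "64Mi"},
		Limits:   map[string]string{"cpu": "1"},
	}
	fmt.Println(checkResources(bad))
}
```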
We typically perform vertical auto-scaling via the VPA managed by the Kubernetes community. Each component should have a respective VerticalPodAutoscaler with “min allowed” resources, “auto update mode”, and “requests only” mode. VPA is always enabled in garden or seed clusters, while it is optional for shoot clusters.
Define a HorizontalPodAutoscaler if needed (example)
If your component is capable of scaling horizontally, you should consider defining a HorizontalPodAutoscaler.
Observability / Operations Productivity
Components should provide scrape configuration and alerting rules for Prometheus/Alertmanager if appropriate. This should be done inside a dedicated monitoring.go file. Extensions should follow the guidelines described in Extensions Monitoring Integration.
Components should provide parsers and filters for fluent-bit, if appropriate. This should be done inside a dedicated logging.go file. Extensions should follow the guidelines described in Fluent-bit log parsers and filters.
In order to allow easy inspection of two ReplicaSets to quickly find the changes that lead to a rolling update, the revision history limit should be set to 2.
gardenlet’s care controllers regularly check the health status of system or control plane components. You need to enhance the lists of components to check if your component is related to the seed system or shoot control plane (shoot system components are automatically checked via their respective ManagedResource conditions), see examples above.
Gardener offers to restart components during the maintenance time window. For more information, see Restart Control Plane Controllers and Restart Some Core Addons. You can consider adding the needed label to your control plane component to get this automatic restart (probably not needed for most components).