Local Provider Extension
The "local provider" extension is used to allow the usage of seed and shoot clusters which run entirely locally without any real infrastructure or cloud provider involved. It implements Gardener's extension contract (GEP-1) and thus comprises several controllers and webhooks acting on resources in seed and shoot clusters.
The code is maintained in pkg/provider-local.
Motivation
The motivation for maintaining such an extension is the following:
- 🛡 Output Qualification: Run fast and cost-efficient end-to-end tests, locally and in CI systems (increased confidence ⛑ before merging pull requests)
- ⚙️ Development Experience: Develop Gardener entirely on a local machine without any external resources involved (improved costs 💰 and productivity 🚀)
- 🤝 Open Source: Quick and easy setup for a first evaluation of Gardener and a good basis for first contributions
Current Limitations
The following lists the current limitations of the implementation. Please note that none of them are technical limitations/blockers; they are simply advanced scenarios that we haven't invested in yet.
- No load balancers for Shoot clusters.
  We have not yet developed a cloud-controller-manager which could reconcile load balancer Services in the shoot cluster.
- In case a seed cluster with multiple availability zones, i.e. multiple entries in .spec.provider.zones, is used in conjunction with a single-zone shoot control plane, i.e. a shoot cluster without .spec.controlPlane.highAvailability or with .spec.controlPlane.highAvailability.failureTolerance.type set to node, the local address of the API server endpoint needs to be determined manually or via the in-cluster coredns.
  As the different istio ingress gateway load balancers have individual external IP addresses, single-zone shoot control planes can end up in a random availability zone. Having the local host use the coredns in the cluster as name resolver would form a name resolution cycle. The tests mitigate the issue by adapting the DNS configuration inside the affected test.
ManagedSeeds
It is possible to deploy ManagedSeeds with provider-local by first creating a Shoot in the garden namespace and then creating a referencing ManagedSeed object.
Please note that this is only supported by the Skaffold-based setup.
The corresponding e2e test can be run via:
```bash
./hack/test-e2e-local.sh --label-filter "ManagedSeed"
```

Implementation Details
The images locally built by Skaffold for the Gardener components which are deployed to this shoot cluster are managed by a container registry in the registry namespace in the kind cluster. provider-local configures this registry as a mirror for the shoot by mutating the OperatingSystemConfig and using the default contract for extending the containerd configuration.
In order to bootstrap a seed cluster, the gardenlet deploys PersistentVolumeClaims and Services of type LoadBalancer. While storage is supported in shoot clusters by using the local-path-provisioner, load balancers are not supported yet. However, provider-local runs a Service controller which specifically reconciles the seed-related Services of type LoadBalancer. This way, they get an IP and gardenlet can finish its bootstrapping process. Note that these IPs are not reachable; however, for the sake of developing ManagedSeeds, this is sufficient for now.
Also, please note that the provider-local extension only gets deployed because of the Always deployment policy in its corresponding ControllerRegistration and because the DNS provider type of the seed is set to local.
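For reference, the relevant part of such a ControllerRegistration looks roughly like the following abridged sketch (the resource list is shortened and only illustrative):

```yaml
# Abridged, illustrative sketch of the provider-local ControllerRegistration.
apiVersion: core.gardener.cloud/v1beta1
kind: ControllerRegistration
metadata:
  name: provider-local
spec:
  deployment:
    policy: Always   # deploy the extension to seeds even without a matching shoot resource
  resources:
  - kind: DNSRecord
    type: local
  - kind: ControlPlane
    type: local
  # ... further resource kinds omitted for brevity
```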
Implementation Details
This section contains information about how the respective controllers and webhooks in provider-local are implemented and what their purpose is.
Bootstrapping
The Helm chart of the provider-local extension defined in its Extension contains a special deployment for a CoreDNS instance in a gardener-extension-provider-local-coredns namespace in the seed cluster.
This CoreDNS instance is responsible for enabling the components running in the shoot clusters to resolve the DNS names when they communicate with their kube-apiservers.
It contains a static configuration to resolve the DNS names based on local.gardener.cloud to istio-ingressgateway.istio-ingress.svc.
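For illustration, such a static resolution could be expressed in a CoreDNS Corefile roughly as follows; this is a hedged sketch with an assumed ConfigMap name, not the exact configuration shipped by the extension:

```yaml
# Illustrative sketch only - the actual CoreDNS configuration deployed by provider-local may differ.
apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns   # assumed name
  namespace: gardener-extension-provider-local-coredns
data:
  Corefile: |
    .:53 {
        # Resolve *.local.gardener.cloud to the istio ingress gateway service so that
        # shoot components reach their kube-apiservers via the seed's istio ingress.
        rewrite stop name regex (.*)\.local\.gardener\.cloud istio-ingressgateway.istio-ingress.svc.cluster.local answer auto
        kubernetes cluster.local
        forward . /etc/resolv.conf
    }
```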
Controllers
There are controllers for all resources in the extensions.gardener.cloud/v1alpha1 API group; they are described in the following sections.
ControlPlane
This controller deploys the local-path-provisioner as well as a related StorageClass in order to support PersistentVolumeClaims in the local shoot cluster. Additionally, it creates a few (currently unused) dummy secrets (CA, server and client certificate, basic auth credentials) for the sake of testing the secrets manager integration in the extensions library.
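For illustration, a StorageClass backed by the local-path-provisioner typically looks like the following; the name and annotation are assumptions and may differ from what the controller actually deploys:

```yaml
# Sketch of a StorageClass using the local-path-provisioner (names/annotations are illustrative).
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-path
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: rancher.io/local-path
volumeBindingMode: WaitForFirstConsumer
reclaimPolicy: Delete
```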
DNSRecord
The controller adapts the cluster internal DNS configuration by extending the custom CoreDNS configuration for every observed DNSRecord. It adds two corresponding entries in the custom DNS configuration per shoot cluster containing a rewrite rule:
```yaml
data:
  api.local.local.external.local.gardener.cloud.override: |
    rewrite stop name regex api.local.local.external.local.gardener.cloud istio-ingressgateway.istio-ingress.svc.cluster.local answer auto
  api.local.local.internal.local.gardener.cloud.override: |
    rewrite stop name regex api.local.local.internal.local.gardener.cloud istio-ingressgateway.istio-ingress.svc.cluster.local answer auto
```

For autonomous shoots, the controller configures the CoreDNS template plugin instead of using a rewrite rule for each DNSRecord:
```yaml
data:
  api.root.garden.local.gardener.cloud.override: |
    template IN A local.gardener.cloud {
        match "^api\.root\.garden\.local\.gardener\.cloud\.$"
        answer "{{ .Name }} 120 IN A 10.0.130.192"
        fallthrough
    }
```

Infrastructure
This controller generates a NetworkPolicy which allows the control plane pods (like kube-apiserver) to communicate with the worker machine pods (see Worker section). It also deploys a NetworkPolicy which allows the bastion pods to communicate with the worker machine pods (see Bastion section).
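A hedged sketch of the first of these NetworkPolicys is shown below; the policy name, namespace, and label selectors are assumptions for illustration and not necessarily what the controller generates:

```yaml
# Illustrative only - selectors and the policy name are assumptions.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-control-plane-to-machine-pods   # hypothetical name
  namespace: shoot--local--local              # example shoot control plane namespace
spec:
  podSelector:
    matchLabels:
      app: machine                            # assumed label on the worker machine pods
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: kubernetes
          role: apiserver                     # assumed labels on the kube-apiserver pods
  policyTypes:
  - Ingress
```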
Network
This controller is not implemented anymore. In the initial version of provider-local, there was a Network controller deploying kindnetd (see release v1.44.1). However, we decided to drop it because this setup prevented us from using NetworkPolicys (kindnetd does not ship a NetworkPolicy controller). In addition, we had issues with shoot clusters having more than one node (hence, we couldn't support rolling updates, see PR #5666).
OperatingSystemConfig
This controller renders a simple cloud-init template which can later be executed by the shoot worker nodes.
The shoot worker nodes are Pods with a container based on the kindest/node image. This is maintained in the gardener/machine-controller-manager-provider-local repository and has a special run-userdata systemd service which executes the cloud-init generated earlier by the OperatingSystemConfig controller.
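For orientation only, a generic cloud-init document has roughly the following shape; this is not the template rendered by the controller, and the file path and commands below are made up for illustration:

```yaml
#cloud-config
# Minimal, generic cloud-init example (illustrative only, not the provider-local template).
write_files:
- path: /etc/example/kubelet-env            # hypothetical file
  permissions: "0644"
  content: |
    KUBELET_EXTRA_ARGS=--node-labels=worker.gardener.cloud/pool=local
runcmd:
- systemctl daemon-reload
- systemctl enable --now kubelet
```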
Worker
This controller leverages the standard generic Worker actuator in order to generate the MachineClasses and the MachineDeployments based on the specification of the Worker resource.
Additionally, the controller deploys RBAC objects for granting machine-controller-manager additional permissions in the control plane namespace. This is needed because the local machine provider creates Kubernetes objects in the control plane namespace for starting Machines, which is different to all other machine provider implementations.
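A hedged sketch of the kind of RBAC objects granting these permissions could look as follows; the names, namespace, and the exact resource/verb lists are assumptions for illustration:

```yaml
# Illustrative only - the actual Role created by the controller may grant different verbs/resources.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: machine-controller-manager-local     # hypothetical name
  namespace: shoot--local--local             # example control plane namespace
rules:
- apiGroups: [""]
  resources: ["pods", "services", "secrets"] # objects the local provider creates per Machine
  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: machine-controller-manager-local     # hypothetical name
  namespace: shoot--local--local
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: machine-controller-manager-local
subjects:
- kind: ServiceAccount
  name: machine-controller-manager
  namespace: shoot--local--local
```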
Bastion
This controller implements the Bastion.extensions.gardener.cloud resource by deploying a pod with the local machine image along with a LoadBalancer service.
Note that this controller does not respect the Bastion.spec.ingress configuration as there is no way to perform client IP restrictions in the local setup.
Ingress
The gardenlet creates a wildcard DNS record for the Seed's ingress domain pointing to the nginx-ingress-controller's LoadBalancer. This domain is commonly used by all Ingress objects created in the Seed for Seed and Shoot components. As provider-local implements the DNSRecord extension API (see the DNSRecord section), this controller reconciles all Ingresses and creates DNSRecords of type local for each host included in spec.rules. This only happens for shoot namespaces (gardener.cloud/role=shoot label) to make Ingress domains resolvable on the machine pods.
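For illustration, a DNSRecord created for such an Ingress host might look roughly like the following; the host, IP, and secret reference are made-up examples:

```yaml
# Sketch of a DNSRecord of type "local" (all values are illustrative).
apiVersion: extensions.gardener.cloud/v1alpha1
kind: DNSRecord
metadata:
  name: example-ingress-host                 # hypothetical name
  namespace: shoot--local--local
spec:
  type: local
  name: some-app.ingress.local.seed.local.gardener.cloud   # example host taken from spec.rules
  recordType: A
  values:
  - 172.18.255.1                             # example IP of the nginx-ingress LoadBalancer
  secretRef:
    name: dnsrecord-credentials              # hypothetical secret reference
    namespace: shoot--local--local
```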
Service
This controller reconciles Services of type LoadBalancer in the local Seed cluster. Since the local Kubernetes clusters used as Seed clusters typically don't support such services, this controller sets .status.loadBalancer.ingress[0].ip to the IP of the host. It makes important LoadBalancer Services (e.g. istio-ingress/istio-ingressgateway and shoot--*--*/bastion-*) available to the host by setting spec.ports[].nodePort to well-known ports that are mapped to hostPorts in the kind cluster configuration.
istio-ingress/istio-ingressgateway is set to be exposed on nodePort 30443 by this controller. The bastion services are exposed on nodePort 30022.
In case the seed has multiple availability zones (.spec.provider.zones) and it uses SNI, the different zone-specific istio-ingressgateway load balancers are exposed via different IP addresses. By default, IP addresses 172.18.255.10, 172.18.255.11, and 172.18.255.12 are used for the zones 0, 1, and 2 respectively.
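Conceptually, the result of this reconciliation on the istio ingress gateway Service resembles the following sketch (the IP and port list are illustrative, not a literal dump of the actual object):

```yaml
# Illustrative result of the Service controller's reconciliation.
apiVersion: v1
kind: Service
metadata:
  name: istio-ingressgateway
  namespace: istio-ingress
spec:
  type: LoadBalancer
  ports:
  - name: tls
    port: 443
    nodePort: 30443         # mapped to a hostPort in the kind cluster configuration
status:
  loadBalancer:
    ingress:
    - ip: 172.18.255.1      # example host/zone IP
```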
ETCD Backups
This controller reconciles the BackupBucket and BackupEntry of the shoot allowing the etcd-backup-restore to create and copy backups using the local provider functionality. The backups are stored on the host file system. This is achieved by mounting that directory to the etcd-backup-restore container.
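As a hedged sketch, the wiring of the host directory into the etcd-backup-restore container can be pictured like the following pod spec fragment (paths and names are assumptions, not the actual manifest):

```yaml
# Illustrative hostPath wiring (fragment of a pod spec, not the literal etcd-backup-restore manifest).
volumes:
- name: local-backups
  hostPath:
    path: /etc/gardener/local-backupbuckets   # hypothetical host directory
    type: DirectoryOrCreate
containers:
- name: backup-restore
  volumeMounts:
  - name: local-backups
    mountPath: /var/backups                   # hypothetical mount path inside the container
```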
Extension Seed
This controller reconciles Extensions of type local-ext-seed. It creates a single serviceaccount named local-ext-seed in the shoot's namespace in the seed. The extension is reconciled before the kube-apiserver. More on extension lifecycle strategies can be read in Registering Extension Controllers.
Extension Shoot
This controller reconciles Extensions of type local-ext-shoot. It creates a single serviceaccount named local-ext-shoot in the kube-system namespace of the shoot. The extension is reconciled after the kube-apiserver. More on extension lifecycle strategies can be read in Registering Extension Controllers.
Extension Shoot After Worker
This controller reconciles Extensions of type local-ext-shoot-after-worker. It creates a deployment named local-ext-shoot-after-worker in the kube-system namespace of the shoot. The extension is reconciled after the workers and waits until the deployment is ready. More on extension lifecycle strategies can be read in Registering Extension Controllers.
Health Checks
The health check controller leverages the health check library in order to:
- check the health of the ManagedResource/extension-controlplane-shoot-webhooks and populate the SystemComponentsHealthy condition in the ControlPlane resource.
- check the health of the ManagedResource/extension-networking-local and populate the SystemComponentsHealthy condition in the Network resource.
- check the health of the ManagedResource/extension-worker-mcm-shoot and populate the SystemComponentsHealthy condition in the Worker resource.
- check the health of the Deployment/machine-controller-manager and populate the ControlPlaneHealthy condition in the Worker resource.
- check the health of the Nodes and populate the EveryNodeReady condition in the Worker resource.
Webhooks
Control Plane
This webhook reacts on the OperatingSystemConfig containing the configuration of the kubelet and sets the failSwapOn to false (independent of what is configured in the Shoot spec) (ref).
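In terms of the kubelet component configuration, the enforced effect corresponds to the following (minimal KubeletConfiguration snippet for illustration):

```yaml
# The webhook effectively enforces this kubelet setting, regardless of the Shoot spec.
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
failSwapOn: false
```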
DNS Config
This webhook reacts on events for the dependency-watchdog-probe Deployment, the blackbox-exporter Deployment, as well as on events for Pods created when the machine-controller-manager reconciles Machines. All these pods need to be able to resolve the DNS names for shoot clusters. It sets the .spec.dnsPolicy=None and .spec.dnsConfig.nameServers to the cluster IP of the coredns Service created in the gardener-extension-provider-local-coredns namespace so that these pods can resolve the DNS records for shoot clusters (see the Bootstrapping section for more details). It also adds the necessary NetworkPolicy labels to allow the communication to the coredns pods.
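The resulting mutation on the affected pod specs looks roughly like this; the nameserver IP below is an example, the actual value is the cluster IP of the coredns Service:

```yaml
# Illustrative pod spec fragment after mutation by the DNS config webhook.
spec:
  dnsPolicy: None
  dnsConfig:
    nameservers:
    - 10.3.0.10   # example cluster IP of the coredns Service in gardener-extension-provider-local-coredns
```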
Node
This webhook reacts on updates to nodes/status in both seed and shoot clusters and sets the .status.{allocatable,capacity}.cpu="100" and .status.{allocatable,capacity}.memory="100Gi" fields.
Background: Typically, the .status.{capacity,allocatable} values are determined by the resources configured for the Docker daemon (see for example the docker Quick Start Guide for Mac). Since many of the Pods deployed by Gardener have quite high .spec.resources.requests, the Nodes easily get filled up and only a few Pods can be scheduled (even if they barely consume any of their reserved resources). In order to improve the user experience, on startup/leader election the provider-local extension submits an empty patch which triggers this node webhook for the seed cluster. The webhook will increase the capacity of the Nodes to allow all Pods to be scheduled. For the shoot clusters, this empty patch trigger is not needed since the MutatingWebhookConfiguration is reconciled by the ControlPlane controller and exists before the Node object gets registered.
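The patched node status then effectively reports values like the following (illustrative snippet):

```yaml
# Illustrative node status after mutation by the node webhook.
status:
  capacity:
    cpu: "100"
    memory: "100Gi"
  allocatable:
    cpu: "100"
    memory: "100Gi"
```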
Shoot
This webhook reacts on the ConfigMap used by the kube-proxy and sets the maxPerCore field to 0 since other values don't work well in conjunction with the kindest/node image which is used as base for the shoot worker machine pods (ref).
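Within the kube-proxy component configuration, this corresponds to the following conntrack setting (minimal snippet for illustration):

```yaml
# The webhook sets the conntrack maxPerCore value to 0 in the kube-proxy configuration.
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
conntrack:
  maxPerCore: 0
```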
DNS Configuration for Multi-Zonal Seeds
In case a seed cluster has multiple availability zones as specified in .spec.provider.zones, multiple istio ingress gateways are deployed, one per availability zone in addition to the default deployment. The result is that single-zone shoot control planes, i.e. shoot clusters without .spec.controlPlane.highAvailability or with .spec.controlPlane.highAvailability.failureTolerance.type set to node, may be exposed via any of the zone-specific istio ingress gateways. Previously, the endpoints were statically mapped via /etc/hosts. Unfortunately, this is no longer possible due to the dynamic endpoint selection described above.
For multi-zonal seed clusters, there is an additional configuration following coredns's view plugin mapping the external IP addresses of the zone-specific loadbalancers to the corresponding internal istio ingress gateway domain names. This configuration is only in place for requests from outside of the seed cluster. Those requests are currently being identified by the protocol. UDP requests are interpreted as originating from within the seed cluster while TCP requests are assumed to come from outside the cluster via the docker hostport mapping.
The corresponding test sets the DNS configuration accordingly so that name resolution during the test uses the coredns in the cluster.
machine-controller-manager-provider-local
This is an out-of-tree (controller-based) implementation for local as a new provider. The local out-of-tree provider implements the interface defined by the MCM OOT driver.
For every Machine object, the local machine provider creates a Pod in the shoot control plane namespace. A machine pod uses an image based on kind's node image (kindest/node). The machine's user data is deployed as a Secret in the shoot control plane namespace and mounted into the machine pod. In contrast to kind, the local machine image doesn't directly run kubelet as a systemd unit. Instead, it has a unit for running the user data script at /etc/machine/userdata.
Typically, machines running in a cloud infrastructure environment can resolve the hostnames of other machines in the same cluster/network. To mimic this behavior in the local setup, the machine provider creates a Service for every Machine with the same name as the Pod. With this, local Nodes and Bastions can connect to other Nodes via their hostname.
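A hedged sketch of such a per-machine Service is shown below; the selector label, headless setup, and port are assumptions for illustration, not necessarily what the machine provider creates:

```yaml
# Illustrative Service created per Machine so other Nodes/Bastions can resolve its hostname.
apiVersion: v1
kind: Service
metadata:
  name: machine-shoot--local--local-worker-xyz   # same name as the machine pod (example)
  namespace: shoot--local--local
spec:
  clusterIP: None                                # assumed headless setup so the name resolves to the pod IP
  selector:
    app: machine                                 # assumed label on the machine pod
  ports:
  - port: 22                                     # illustrative port entry (e.g. for SSH from Bastions)
```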
Future Work
Future work could mostly focus on resolving the above listed limitations, i.e.:
- Implement a cloud-controller-manager and deploy it via the ControlPlane controller.
- Properly implement .spec.machineTypes in the CloudProfiles (i.e., configure .spec.resources properly for the created shoot worker machine pods).