Gardener Logo
Deliver fully-managed clusters at scale everywhere with your own Kubernetes-as-a-Service
  • Shoot Reconciliation October 23, 2020

    Do you want to understand how Gardener creates and updates Kubernetes clusters (Shoots)? Well, it’s complicated, but if you are not afraid of large diagrams and are a visual learner like me, this might be useful to you.

    Introduction

    In this blog post I will share a technical diagram which attempts to tie together the various components involved when Gardener creates a Kubernetes cluster. I have created and curated the diagram, which visualizes the Shoot reconciliation flow since I started developing on Gardener. Aside from serving as a memory aid for myself, I created it in hopes that it may potentially help contributors to understand a core piece of the complex Gardener machinery. Please be advised that the diagram and components involved are large. Although it can be easily divided into multiple diagrams, I want to show all the components and connections in a single diagram to create an overview of the reconciliation flow.

    The goal is to visualize the interactions of the components involved in the Shoot creation. It is not intended to serve as a documentation of every component involved.

    Background

    Taking a step back, the Gardener READ.me states

    In essence, Gardener is an extension API server that comes along with a bundle of custom controllers. It introduces new API objects in an existing Kubernetes cluster (which is called garden cluster) in order to use them for the management of end-user Kubernetes clusters (which are called shoot clusters). These shoot clusters are described via declarative cluster specifications which are observed by the controllers. They will bring up the clusters, reconcile their state, perform automated updates and make sure they are always up and running.

    This means that Gardener, just like any Kubernetes controller, creates Kubernetes clusters (Shoots) using a reconciliation loop.

    The Gardenlet contains the controller and reconciliation loop responsible for the creation, update, deletion and migration of Shoot cluster (there are more, but we spare them in this article). In addition, the Gardener Controller Manager also reconciles Shoot resources, but only for seed-independent functionality such as Shoot hibernation, Shoot maintenance or quota control.

    This blog post is about the reconciliation loop in the Gardenlet responsible for creating and updating Shoot clusters. The code can be found here. The reconciliation loops of the extension controllers can be found in their individual repositories.

    Shoot reconciliation flow diagram

    When Gardner creates a Shoot cluster, there are three conceptual layers involved: the Garden cluster, the Seed cluster and the Shoot cluster. Each layer represents a top-level section in the diagram (similar to a lane in a BPMN diagram).

    It might seem confusing, that the Shoot cluster itself is a layer, because the whole flow in the first place is about creating the Shoot cluster. I decided to introduce this separate layer to make a clear distinction between which resources exist in the Seed API server (managed by Gardener) and which in the Shoot API server (accessible by the Shoot owner).

    Each section contains several components. Components are mostly Kubernetes resources in a Gardener installation (e.g. the gardenlet deployment in the Seed cluster).

    This is the list of components:

    (Virtual) Garden Cluster

    • Gardener Extension API server
    • Validating Provider Webhooks
    • Project Namespace

    Seed Cluster

    Shoot Cluster

    • Cloud Provider compute API (owned by Stakeholder) - for VM/Node creation.
    • VM / Bare metal node hosted by Cloud Provider (in Stakeholder owned account).

    How to use the diagram

    The diagram

    • should be read from top to bottom - starting in the top left corner with the creation of the Shoot resource via the Gardener Extension API server.
    • should not require an encompassing documentation / description. More detailed documentation on the components itself, can usually be found in the respective repository.
    • does not show which activities execute in parallel (many) and also does not describe the exact dependencies between the steps. This can be found out by looking at the source code. It however tries to put the activities in a logical order of executing during the reconciliation flow.

    Occasionally, there is an info box with additional information next to parts in the diagram that in my point of view require further explanation. Large example resource for the Gardener CRDs (e.g Worker CRD, Infrastructure CRD) are placed on the left side and are referenced by a dotted line (—–).

    Be aware, that Gardener is an evolving project, so the diagram will most likely be already outdated by the time you are reading this. Nevertheless, it should give a solid starting point for further explorations into the details of Gardener.

    Flow diagram

    The diagram can be found below and on Github.com. There are multiple formats available (svg, vsdx, draw.io, html).

    Please open an issue or open a PR in the repository if information is missing or is incorrect. Thanks!

  • STACKIT SKE with Gardener December 03, 2020

    STACKIT is a digital brand of Europe’s biggest retailer, the Schwarz Group, which consists of Lidl, Kaufland, as well as production and recycling companies. Following the industry trend, the Schwarz Group is in the process of a digital transformation. STACKIT enables this transformation by helping to modernize the internal IT of the company branches.

    What is STACKIT and the STACKIT Kubernetes Engine (SKE)?

    STACKIT started with colocation solutions for internal and external customers in Europe-based data centers, which was then expanded to a full cloud platform stack providing an IaaS layer with VMs, storage and network, as well as a PaaS layer including Cloud Foundry and a growing set of cloud services, like databases, messaging, etc.

    With containers and Kubernetes becoming the lingua franca of the cloud, we are happy to announce the STACKIT Kubernetes Engine (SKE), which has been released as Beta in November this year. We decided to use Gardener as the cluster management engine underneath SKE - for good reasons as you will see – and we would like to share our experiences with Gardener when working on the SKE Beta release, and serve as a testimonial for this technology.

    Figure 1: STACKIT Component Diagram

    Why we chose Gardener as a cluster management tool

    We started with the Kubernetes endeavor in the beginning of 2020 with a newly formed agile team that consisted of software engineers, highly experienced in IT operations and development. After some exploration and a short conceptual phase, we had a clear-cut opinion on how the cluster management for STACKIT should look like: we were looking for a highly customizable tool that could be adapted to the specific needs of STACKIT and the Schwarz Group, e.g. in terms of network setup or the infrastructure layer it should be running on. Moreover, the tool should be scalable to a high number of managed Kubernetes clusters and should therefore provide a fully automated operation experience. As an open source project, contributing and influencing the tool, as well as collaborating with a larger community were important aspects that motivated us. Furthermore, we aimed to offer cluster management as a self-service in combination with an excellent user experience. Our objective was to have the managed clusters come with enterprise-grade SLAs – i.e. with “batteries included”, as some say.

    With this mission, we started our quest through the world of Kubernetes and soon found Gardener to be a hot candidate of cluster management tools that seemed to fulfill our demands. We quickly got in contact and received a warm welcome from the Gardener community. As interested potential adopter, but in the early days of the COVID-19 lockdown, we managed to organize an online workshop during which we got an introduction and deep dive into Gardener and discussed the STACKIT use cases. We learned that Gardener is extensible in many dimensions, and that contributions are always welcome and encouraged. Once we understood the basic Gardener concepts of Garden, Shoot and Seed clusters, its inception design and how this extends Kubernetes concepts in a natural way, we were eager to evaluate this tool in more detail.

    After this evaluation, we were convinced that this tool fulfilled all our requirements - a decision was made and off we went.

    How Gardener was adapted and extended by SKE

    After becoming familiar with Gardener, we started to look into its code base to adapt it to the specific needs of the STACKIT OpenStack environment. Changes and extensions were made in order to get it integrated into the STACKIT environment, and whenever reasonable, we contributed those changes back:

    • To run smoothly with the STACKIT OpenStack layer, the Gardener configuration was adapted in different places, e.g. to support CSI driver or to configure the domains of shoot API server or ingress.
    • Gardener was extended to support shoots and shooted seeds in dual stack and dual home setup. This is used in SKE for the communication between shooted seeds and the Garden cluster.
    • SKE uses a private image registry for Gardener installation to resolve dependencies to public image registries and to have more control over the used Gardener versions. To install and run Gardener with the private image registry, some new configurations need to be introduced into Gardener.
    • Gardener is a first-class API based service what allowed us to smoothly integrate it into the STACKIT User Interface. We were able to jump-start and utilize also the Gardener Dashboard for our Beta release by merely adjusting the look-&-feel, i.e. colors, labels and icons.

    Figure 2: Gardener Dashboard adapted to STACKIT UI style

    Experience with Gardener operations

    As no OpenStack installation is identical to one another, getting Gardener to run stable on the STACKIT IaaS layer revealed some operational challenges. For instance, it was challenging to find the right configuration for Cinder CSI.

    To test for its resilience, we tried to break the managed clusters with a Chaos Monkey test, e.g. by deleting services or components needed by Kubernetes and Gardener to work properly. The reconciliation feature of Gardener fixed all those problems automatically, so that damaged Shoot clusters became operational again after a short period of time. Thus, we were not able to break Shoot clusters from an end user perspective permanently, despite our efforts. Which again speaks for Gardener’s first-class cloud native design.

    We also participated in a fruitful community support: For several challenges we contacted the community channel and help was provided in a timely manner. A lesson learned was that raising an issue in the community early on, before getting stuck too long on your own with an unresolved problem, is essential and efficient.

    Summary

    Gardener is used by SKE to provide a managed Kubernetes offering for internal use cases of the Schwarz Group as well as for the public cloud offering of STACKIT. Thanks to Gardener, it was possible to get from zero to a Beta release in only about half a year’s time – this speaks for itself. Within this period, we were able to integrate Gardener into the STACKIT environment, i.e. in its OpenStack IaaS layer, its management tools and its identity provisioning solution.

    Gardener has become a vital building block in STACKIT’s cloud native platform offering. For the future, the possibility to manage clusters also on other infrastructures and hyperscalers is seen as another great opportunity for extended use cases. The open co-innovation exchange with the Gardener community member companies has also opened the door to commercial co-operation.

Make It All About Kubernetes Again

Gardener abstracts environment specifics to deliver the same homogeneous Kubernetes-native DevOps experience everywhere
Cluster Fleet Hub
A single Gardener can scale to register and manage thousands of clusters, regardless of their location - public/private clouds, DC bare metal, regulated environments... anywhere a Gardenlet is deployed.
Kubernetes Native
Gardener manages clusters very much like pods are orchestrated in Kubernetes. Cluster workloads are scheduled and Gardenlets, similar to Kubelets, take over to manage them in particular environments in a loosely coupled, controller pattern.
Fully Managed
Gardenlets manage control planes, worker nodes (full lifecycle, self-healing and updates) and cluster components, such as the overlay network, DNS and certificates, control plane monitoring and logging stack, to provide automation, resilience and observability.
Scalable by Design
A single Kubernetes cluster can host an enormous amount of control planes. Gardener can scale-out massively by more control plane clusters and letting the Gardenlets do the heavy lifting. In fact, those clusters can also be managed by Gardener for maximum efficency.
Learn more about the concepts behind Gardener

Get The Kubernetes You Really Want

The clusters Gardener provisions are as flexible as DIY clusters, except you don’t have to do them yourself
Configurable

Gardener control planes allow you to control a wide range of a wide range of features gates and configurations.

The updates you want, when you want them

No more unexpected updates! Gardener allows you to update Kubernetes to the version you want, when you want it, rather than when your cloud provider decides. It even allows you to update your Host OS when desired.

100% Kubernetes compliant

Gardener is Kubernetes native and is not shy to be completely transparent on its compliance, proudly holding the 100% badge with public evidence for that.

The one you already know

Gardener delivers the same Kubernetes you know from kubernetes.io and are certified for. The same binaries, the same tools; you are already trained to use it.

Everywhere You Want It

The compute resources you need, wherever you want them.
  • Alibaba Cloud
  • Amazon Web Services
  • Microsoft Azure
  • Google Cloud Platform
  • Metal-Stack
  • OpenStack
  • Equinix Metal
  • VMware vSphere
New infrastructure use case? Let’s build it together

However You Want It

Extend And Contribute To Gardener
Extensible By Design

Gardener is a modular system of managed extensions around a robust core, fully adaptable in multiple dimensions. Extend the existing extension set or add completely new pieces. And while you are at it, why not contribute them back to the community and benefit from contributions of others?

Managed Extensions

Gardener watches over and manages extensions, automatically reconciling their actual and desired state as designed.

Control the Stack

You are in control of the setup for the cluster that will be delivered by Gardener. Choose the components you actually need.

Learn more about Gardener’s extensibility