그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그
5 minute read
Every Kubernetes cluster requires some low-level infrastructure to be setup in order to work properly.
Examples for that are networks, routing entries, security groups, IAM roles, etc.
Before introducing the
Infrastructure extension resource Gardener was using Terraform in order to create and manage these provider-specific resources (e.g., see here).
Now, Gardener commissions an external, provider-specific controller to take over this task.
Which infrastructure resources are required?
Unfortunately, there is no general answer to this question as it is highly provider specific. Consider the above mentioned resources, i.e. VPC, subnets, route tables, security groups, IAM roles, SSH key pairs. Most of the resources are required in order to create VMs (the shoot cluster worker nodes), load balancers, and volumes.
What needs to be implemented to support a new infrastructure provider?
As part of the shoot flow Gardener will create a special CRD in the seed cluster that needs to be reconciled by an extension controller, for example:
--- apiVersion: extensions.gardener.cloud/v1alpha1 kind: Infrastructure metadata: name: infrastructure namespace: shoot--foo--bar spec: type: azure region: eu-west-1 secretRef: name: cloudprovider namespace: shoot--foo--bar providerConfig: apiVersion: azure.provider.extensions.gardener.cloud/v1alpha1 kind: InfrastructureConfig resourceGroup: name: mygroup networks: vnet: # specify either 'name' or 'cidr' # name: my-vnet cidr: 10.250.0.0/16 workers: 10.250.0.0/19
.spec.secretRef contains a reference to the provider secret pointing to the account that shall be used to create the needed resources.
However, the most important section is the
It contains an embedded declaration of the provider specific configuration for the infrastructure (that cannot be known by Gardener itself).
You are responsible for designing how this configuration looks like.
Gardener does not evaluate it but just copies this part from what has been provided by the end-user in the
After your controller has created the required resources in your provider’s infrastructure it needs to generate an output that can be used by other controllers in subsequent steps.
An example for that is the
Worker extension resource controller.
It is responsible for creating virtual machines (shoot worker nodes) in this prepared infrastructure.
Everything that it needs to know in order to do that (e.g. the network IDs, security group names, etc. (again: provider-specific)) needs to be provided as output in the
--- apiVersion: extensions.gardener.cloud/v1alpha1 kind: Infrastructure metadata: name: infrastructure namespace: shoot--foo--bar spec: ... status: lastOperation: ... providerStatus: apiVersion: azure.provider.extensions.gardener.cloud/v1alpha1 kind: InfrastructureStatus resourceGroup: name: mygroup networks: vnet: name: my-vnet subnets: - purpose: nodes name: my-subnet availabilitySets: - purpose: nodes id: av-set-id name: av-set-name routeTables: - purpose: nodes name: route-table-name securityGroups: - purpose: nodes name: sec-group-name
In order to support a new infrastructure provider you need to write a controller that watches all
You can take a look at the below referenced example implementation for the Azure provider.
Dynamic nodes network for shoot clusters
Some environments do not allow end-users to statically define a CIDR for the network that shall be used for the shoot worker nodes.
In these cases it is possible for the extension controllers to dynamically provision a network for the nodes (as part of their reconciliation loops), and to provide the CIDR in the
status of the
--- apiVersion: extensions.gardener.cloud/v1alpha1 kind: Infrastructure metadata: name: infrastructure namespace: shoot--foo--bar spec: ... status: lastOperation: ... providerStatus: ... nodesCIDR: 10.250.0.0/16
Gardener will pick this
nodesCIDR and use it to configure the VPN components to establish network connectivity between the control plane and the worker nodes.
Shoot resource already specifies a nodes CIDR in
.spec.networking.nodes and the extension controller provides also a value in
.status.nodesCIDR in the
Infrastructure resource then the latter one will always be considered with higher priority by Gardener.
Non-provider specific information required for infrastructure creation
Some providers might require further information that is not provider specific but already part of the shoot resource.
One example for this is the GCP infrastructure controller which needs the pod and the service network of the cluster in order to prepare and configure the infrastructure correctly.
As Gardener cannot know which information is required by providers it simply mirrors the
CloudProfile resources into the seed.
They are part of the
Cluster extension resource and can be used to extract information that is not part of the
Infrastructure resource itself.
Most existing infrastructure controller implementations follow a common pattern where a generic
Reconciler delegates to an
Actuator interface that contains the methods
Restore. These methods are called by the generic
Reconciler for the respective operations, and should be implemented by the extension according to the contract described here and the migration guidelines.
For infrastructure controllers, the generic
Reconciler also delegates to a
ConfigValidator interface that contains a single
Validate method. This method is called by the generic
Reconciler at the beginning of every reconciliation, and can be implemented by the extension to validate the
.spec.providerConfig part of the
Infrastructure resource with the respective cloud provider, typically the existence and validity of cloud provider resources such as AWS VPCs or GCP Cloud NAT IPs.
Validate method returns a list of errors. If this list is non-empty, the generic
Reconciler will fail with an error. This error will have the error code
ERR_CONFIGURATION_PROBLEM, unless there is at least one error in the list that has its
ErrorType field set to