3 - Dual Stack Ingress
Using IPv4/IPv6 (dual-stack) Ingress in an IPv4 single-stack cluster
Motivation
IPv6 adoption is continuously growing, already overtaking IPv4 in certain regions, e.g. India, or scenarios, e.g. mobile.
Even though most IPv6 installations deploy means to reach IPv4, it might still be beneficial to expose services
natively via IPv4 and IPv6 instead of just relying on IPv4.
Disadvantages of full IPv4/IPv6 (dual-stack) Deployments
Enabling full IPv4/IPv6 (dual-stack) support in a Kubernetes cluster is a major endeavor. It requires a lot of changes
and restarts of all pods so that all pods get addresses for both IP families. A side-effect of dual-stack networking
is that failures may be hidden as network traffic may take the other protocol to reach the target. For this reason and
also due to reduced operational complexity, service teams might lean towards staying in a single-stack environment as
much as possible. Luckily, this is possible with Gardener and IPv4/IPv6 (dual-stack) ingress on AWS.
Simplifying IPv4/IPv6 (dual-stack) Ingress with Protocol Translation on AWS
Fortunately, the network load balancer on AWS supports automatic protocol translation, i.e. it can expose both IPv4 and
IPv6 endpoints while communicating with the backends over just one protocol. Client IP address preservation can be
achieved by using proxy protocol.
This approach enables users to expose IPv4 workload to IPv6-only clients without having to change the workload/service.
Without requiring invasive changes, it allows a fairly simple first step into the IPv6 world for services just requiring
ingress (incoming) communication.
Necessary Shoot Cluster Configuration Changes for IPv4/IPv6 (dual-stack) Ingress
To be able to utilize IPv4/IPv6 (dual-stack) Ingress in an IPv4 shoot cluster, the cluster needs to meet two preconditions:
- dualStack.enabled needs to be set to true to configure VPC/subnet for IPv6 and add a routing rule for IPv6.
  (This does not add IPv6 addresses to Kubernetes nodes.)
- loadBalancerController.enabled needs to be set to true as well to use the load balancer controller, which supports
  dual-stack ingress.
apiVersion: core.gardener.cloud/v1beta1
kind: Shoot
...
spec:
  provider:
    type: aws
    infrastructureConfig:
      apiVersion: aws.provider.extensions.gardener.cloud/v1alpha1
      kind: InfrastructureConfig
      dualStack:
        enabled: true
    controlPlaneConfig:
      apiVersion: aws.provider.extensions.gardener.cloud/v1alpha1
      kind: ControlPlaneConfig
      loadBalancerController:
        enabled: true
...
When infrastructureConfig.networks.vpc.id is set to the ID of an existing VPC, please make sure that your VPC has an Amazon-provided IPv6 CIDR block added.
After adapting the shoot specification and reconciling the cluster, dual-stack load balancers can be created using
Kubernetes Service objects.
Creating an IPv4/IPv6 (dual-stack) Ingress
With the preconditions set, creating an IPv4/IPv6 load balancer is as easy as annotating a service with the correct
annotations:
apiVersion: v1
kind: Service
metadata:
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-ip-address-type: dualstack
    service.beta.kubernetes.io/aws-load-balancer-scheme: internet-facing
    service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: instance
    service.beta.kubernetes.io/aws-load-balancer-type: external
  name: ...
  namespace: ...
spec:
  ...
  type: LoadBalancer
In case the client IP address should be preserved, the following annotation can be used to enable proxy protocol.
(The pod receiving the traffic needs to be configured for proxy protocol as well.)
service.beta.kubernetes.io/aws-load-balancer-proxy-protocol: "*"
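To illustrate what "configured for proxy protocol" means on the receiving side, here is a minimal sketch of reading the client address from a PROXY protocol version 2 header, which is the version AWS Network Load Balancers send. The function name `parse_proxy_v2` is hypothetical, and the sketch ignores optional TLV extensions. In practice, most proxies and ingress controllers offer proxy protocol support as a configuration option, so hand-written parsing is rarely needed.

```python
import ipaddress
import struct

# Fixed 12-byte signature that starts every PROXY protocol v2 header.
SIGNATURE = b"\r\n\r\n\x00\r\nQUIT\n"


def parse_proxy_v2(data: bytes):
    """Return (client_ip, client_port), or None for LOCAL connections."""
    if len(data) < 16 or data[:12] != SIGNATURE:
        raise ValueError("not a PROXY protocol v2 header")
    ver_cmd, fam_proto, length = struct.unpack("!BBH", data[12:16])
    if ver_cmd >> 4 != 2:
        raise ValueError("unsupported PROXY protocol version")
    if ver_cmd & 0x0F == 0:
        # LOCAL command (e.g. health checks): no client address is carried.
        return None
    payload = data[16:16 + length]
    family = fam_proto >> 4
    if family == 1:  # AF_INET: 4-byte addresses
        src, _dst, sport, _dport = struct.unpack("!4s4sHH", payload[:12])
    elif family == 2:  # AF_INET6: 16-byte addresses
        src, _dst, sport, _dport = struct.unpack("!16s16sHH", payload[:36])
    else:
        raise ValueError("unsupported address family")
    return ipaddress.ip_address(src), sport
```

With this, an IPv4-only pod behind a dual-stack load balancer can still see the original IPv6 address of a client, even though the connection itself arrives over IPv4.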
Please note that changing an existing Service to dual-stack may cause the creation of a new load balancer without
deletion of the old AWS load balancer resource. While this helps in a seamless migration by not cutting existing
connections, it may lead to wasted/forgotten resources. Therefore, the (manual) cleanup needs to be taken into account
when migrating an existing Service instance.
For more details see AWS Load Balancer Documentation - Network Load Balancer.
DNS Considerations to Prevent Downtime During a Dual-Stack Migration
In case the migration of an existing service is desired, please check if there are DNS entries directly linked to the
corresponding load balancer. The migrated load balancer will have a new domain name immediately, which will not be ready
in the beginning. Therefore, a direct migration of the domain name entries is not desired as it may cause a short
downtime, i.e. domain name entries without backing IP addresses.
If there are DNS entries directly linked to the corresponding load balancer and they are managed by the
shoot-dns-service, you can identify this via annotations with the prefix dns.gardener.cloud/. Those annotations
can be attached to Service, Ingress or Gateway resources. Alternatively, they may also use DNSEntry or DNSAnnotation
resources.
For a seamless migration without downtime use the following three-step approach:
- Temporarily prevent direct DNS updates
- Migrate the load balancer and wait until it is operational
- Allow DNS updates again
To prevent direct updates of the DNS entries when the load balancer is migrated, add the annotation
dns.gardener.cloud/ignore: 'true' to all affected resources next to the other dns.gardener.cloud/... annotations
before starting the migration. For example, in case of a Service ensure that the service looks like the following:
kind: Service
metadata:
  annotations:
    dns.gardener.cloud/ignore: 'true'
    dns.gardener.cloud/class: garden
    dns.gardener.cloud/dnsnames: '...'
    ...
Next, migrate the load balancer to be dual-stack enabled by adding/changing the corresponding annotations.
You have multiple options to check that the load balancer has been provisioned successfully. It might be useful
to peek into status.loadBalancer.ingress of the corresponding Service to identify the load balancer:
- Check in the AWS console for the corresponding load balancer provisioning state
- Perform domain name lookups with nslookup/dig to check whether the name resolves to an IP address.
- Call your workload via the new load balancer, e.g. using
  curl --resolve <my-domain-name>:<port>:<IP-address> https://<my-domain-name>:<port>, which allows you to call your
  service with the “correct” domain name without using actual name resolution.
- Wait a fixed period of time as load balancer creation is usually finished within 15 minutes
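The domain name lookup from the list above can also be scripted. The following is a minimal sketch using only the Python standard library; the helper names `resolve` and `is_dual_stack` are hypothetical. It checks whether the load balancer hostname already resolves to both IPv4 and IPv6 addresses.

```python
import ipaddress
import socket


def resolve(hostname: str) -> set[str]:
    """Return all addresses the system resolver reports for hostname.

    Raises socket.gaierror while the name does not resolve yet.
    """
    infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    return {info[4][0] for info in infos}


def is_dual_stack(addresses) -> bool:
    """True if the addresses contain at least one IPv4 and one IPv6 address."""
    versions = {ipaddress.ip_address(address).version for address in addresses}
    return versions == {4, 6}
```

A possible usage is `is_dual_stack(resolve(hostname))`, where `hostname` is the value found in status.loadBalancer.ingress of the Service. Note that the result depends on the resolver of the machine running the check.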
Once the load balancer has been provisioned, you can remove the annotation dns.gardener.cloud/ignore: 'true' again
from the affected resources. It may take some additional time until the domain name change finally propagates
(up to one hour).
4 - IPv6
Support for IPv6
Overview
Gardener supports different levels of IPv6 support in shoot clusters.
This document describes the differences between them and what to consider when using them.
In IPv6 Ingress for IPv4 Shoot Clusters, the focus is on how an existing IPv4-only shoot cluster can provide dual-stack services to clients.
Section IPv6-only Shoot Clusters describes how to create a shoot cluster that only supports IPv6.
Finally, Dual-Stack Shoot Clusters explains how to create a shoot cluster that supports both IPv4 and IPv6.
IPv6 Ingress for IPv4 Shoot Clusters
By default, Gardener shoot clusters use only IPv4.
Therefore, they also expose their services only via load balancers with IPv4 addresses.
To allow external clients to also use IPv6 to access services in an IPv4 shoot cluster, the cluster needs to be configured to support dual-stack ingress, see Using IPv4/IPv6 (dual-stack) Ingress in an IPv4 single-stack cluster for more information.
The main benefit of this approach is that the existing cluster stays almost as is without major changes, keeping the operational simplicity.
It works very well for services that only require incoming communication, e.g. pure web services.
The main drawback is that certain scenarios, especially related to IPv6 callbacks, are not possible.
This means that services that actively call back to their clients, e.g. via webhooks, will not be able to do so over IPv6.
Hence, those services cannot be used fully via IPv6.
IPv6-only Shoot Clusters
Motivation
IPv6-only shoot clusters are the best option to verify that services are fully IPv6-compatible.
While Dual-Stack Shoot Clusters may fall back on using IPv4 transparently, IPv6-only shoot clusters enforce the usage of IPv6 inside the cluster.
Therefore, it is recommended to check with IPv6-only shoot clusters if a workload is fully IPv6-compatible.
In addition to being a good testbed for IPv6 compatibility, IPv6-only shoot clusters may also be a desirable eventual target in the IPv6 migration as they can serve both IPv4 and IPv6 clients while keeping a single stack within the cluster.
Creating an IPv6-only Shoot Cluster
To create an IPv6-only shoot cluster, the following needs to be specified in the Shoot resource (see also here):
kind: Shoot
apiVersion: core.gardener.cloud/v1beta1
metadata:
  ...
spec:
  ...
  networking:
    type: ...
    ipFamilies:
      - IPv6
  ...
  provider:
    type: aws
    infrastructureConfig:
      apiVersion: aws.provider.extensions.gardener.cloud/v1alpha1
      kind: InfrastructureConfig
      networks:
        vpc:
          cidr: 192.168.0.0/16
        zones:
          - name: ...
            public: 192.168.32.0/20
            internal: 192.168.48.0/20
Please note that nodes, pods and services should not be specified in the .spec.networking section.
In contrast to that, it is still required to specify IPv4 ranges for the VPC and the public/internal subnets.
This is mainly due to the fact that public/internal load balancers still require IPv4 addresses as there are no pure IPv6-only load balancers as of now.
The ranges can be sized according to the expected amount of load balancers per zone/type.
The IPv6 address ranges are provided by AWS. It is ensured that the IPv6 ranges are globally unique and internet routable.
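As a rough aid for sizing the IPv4 ranges mentioned above, the following sketch computes how many addresses of a subnet are available. It relies on the fact that AWS reserves five addresses in every subnet (the first four and the last one); the helper name `usable_addresses` is hypothetical. How many addresses a single load balancer consumes is not covered here.

```python
import ipaddress

# AWS reserves the first four addresses and the last address of each subnet.
AWS_RESERVED_PER_SUBNET = 5


def usable_addresses(cidr: str) -> int:
    """Return the number of IPv4 addresses left for resources in the subnet."""
    return ipaddress.ip_network(cidr).num_addresses - AWS_RESERVED_PER_SUBNET


# The /20 ranges from the example above offer 4096 - 5 addresses each.
print(usable_addresses("192.168.32.0/20"))  # 4091
```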
Load Balancer Configuration
The AWS Load Balancer Controller is automatically deployed when using an IPv6-only shoot cluster.
When creating a load balancer, the corresponding annotations need to be configured, see AWS Load Balancer Documentation - Network Load Balancer for details.
The AWS Load Balancer Controller allows dual-stack ingress so that an IPv6-only shoot cluster can serve IPv4 and IPv6 clients.
You can find an example here.
When accessing Network Load Balancers (NLB) from within the same IPv6-only cluster, it is crucial to add the annotation service.beta.kubernetes.io/aws-load-balancer-target-group-attributes: preserve_client_ip.enabled=false.
Without this annotation, if a request is routed by the NLB to the same target instance from which it originated, the client IP and destination IP will be identical.
This situation, known as the hair-pinning effect, will prevent the request from being processed.
(This also happens for internal load balancers in IPv4 clusters, but is mitigated by the NAT gateway for external IPv4 load balancers.)
Connectivity to IPv4-only Services
The IPv6-only shoot cluster can connect to IPv4-only services via DNS64/NAT64.
The cluster is configured to use the DNS64/NAT64 service of the underlying cloud provider.
This allows the cluster to resolve IPv4-only DNS names and to connect to IPv4-only services.
Please note that traffic going through NAT64 incurs the same cost as ordinary NAT traffic in an IPv4-only cluster.
Therefore, it might be beneficial to prefer IPv6 for services, which provide IPv4 and IPv6.
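To make the DNS64/NAT64 mechanism more tangible: DNS64 answers AAAA queries for IPv4-only names with a synthesized IPv6 address that embeds the IPv4 address in a NAT64 prefix. The sketch below assumes the well-known prefix 64:ff9b::/96 from RFC 6052; the function name `synthesize_aaaa` is hypothetical, and the synthesis is normally done by the resolver, not by the workload.

```python
import ipaddress

# Well-known NAT64 prefix (RFC 6052); assumed here for illustration.
NAT64_PREFIX = ipaddress.ip_network("64:ff9b::/96")


def synthesize_aaaa(ipv4: str) -> ipaddress.IPv6Address:
    """Embed an IPv4 address into the lower 32 bits of the NAT64 prefix."""
    v4 = ipaddress.IPv4Address(ipv4)
    return ipaddress.IPv6Address(int(NAT64_PREFIX.network_address) | int(v4))


# 192.0.2.1 is a documentation address used as an example.
print(synthesize_aaaa("192.0.2.1"))
```

Traffic sent to such a synthesized address is routed to the NAT64 gateway, which translates it back to IPv4. This is why the cost note above applies to it.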
Dual-Stack Shoot Clusters
Motivation
Dual-stack shoot clusters support IPv4 and IPv6 out-of-the-box.
They can be the intermediate step on the way towards IPv6 for any existing (IPv4-only) clusters.
Creating a Dual-Stack Shoot Cluster
To create a dual-stack shoot cluster, the following needs to be specified in the Shoot resource:
kind: Shoot
apiVersion: core.gardener.cloud/v1beta1
metadata:
  ...
spec:
  ...
  networking:
    type: ...
    pods: 192.168.128.0/17
    nodes: 192.168.0.0/18
    services: 192.168.64.0/18
    ipFamilies:
      - IPv4
      - IPv6
  ...
  provider:
    type: aws
    infrastructureConfig:
      apiVersion: aws.provider.extensions.gardener.cloud/v1alpha1
      kind: InfrastructureConfig
      networks:
        vpc:
          cidr: 192.168.0.0/18
        zones:
          - name: ...
            workers: 192.168.0.0/19
            public: 192.168.32.0/20
            internal: 192.168.48.0/20
Please note that the only change compared to an IPv4-only shoot cluster is the addition of IPv6 to the .spec.networking.ipFamilies field.
The order of the IP families defines the preference of the IP family.
In this case, IPv4 is preferred over IPv6, e.g. services specifying no IP family will get only an IPv4 address.
Migration of IPv4-only Shoot Clusters to Dual-Stack
Eventually, migration should be as easy as changing the .spec.networking.ipFamilies field in the Shoot resource from IPv4 to IPv4, IPv6.
However, as of now, this is not supported.
It is worth recognizing that the migration from an IPv4-only shoot cluster to a dual-stack shoot cluster involves rolling of the nodes/workload as well.
Nodes will not get a new IPv6 address assigned automatically.
The same is true for pods as well.
Once the migration is supported, the detailed caveats will be documented here.
Load Balancer Configuration
The AWS Load Balancer Controller is automatically deployed when using a dual-stack shoot cluster.
When creating a load balancer, the corresponding annotations need to be configured, see AWS Load Balancer Documentation - Network Load Balancer for details.
Please note that load balancer services without any special annotations will default to IPv4-only regardless of how .spec.ipFamilies is set.
The AWS Load Balancer Controller allows dual-stack ingress so that a dual-stack shoot cluster can serve IPv4 and IPv6 clients.
You can find an example here.
When accessing external Network Load Balancers (NLB) from within the same cluster via IPv6 or internal NLBs via IPv4, it is crucial to add the annotation service.beta.kubernetes.io/aws-load-balancer-target-group-attributes: preserve_client_ip.enabled=false.
Without this annotation, if a request is routed by the NLB to the same target instance from which it originated, the client IP and destination IP will be identical.
This situation, known as the hair-pinning effect, will prevent the request from being processed.
6 - Operations
Using the AWS provider extension with Gardener as operator
The core.gardener.cloud/v1beta1.CloudProfile resource declares a providerConfig field that is meant to contain provider-specific configuration.
The core.gardener.cloud/v1beta1.Seed resource is structured similarly.
Additionally, it allows configuring settings for the backups of the main etcds’ data of the shoot clusters’ control planes running in this seed cluster.
This document explains what is necessary to configure for this provider extension.
CloudProfile resource
In this section we describe what the configuration for CloudProfiles looks like for AWS and provide an example CloudProfile manifest with minimal configuration that you can use to allow creating AWS shoot clusters.
CloudProfileConfig
The cloud profile configuration contains information about the real machine image IDs in the AWS environment (AMIs).
You have to map every version that you specify in .spec.machineImages[].versions here such that the AWS extension knows the AMI for every version you want to offer.
For each AMI an architecture field can be specified which specifies the CPU architecture of the machine on which the given machine image can be used.
An example CloudProfileConfig for the AWS extension looks as follows:
apiVersion: aws.provider.extensions.gardener.cloud/v1alpha1
kind: CloudProfileConfig
machineImages:
- name: coreos
  versions:
  - version: 2135.6.0
    regions:
    - name: eu-central-1
      ami: ami-034fd8c3f4026eb39
      # architecture: amd64 # optional
Example CloudProfile manifest
Please find below an example CloudProfile manifest:
apiVersion: core.gardener.cloud/v1beta1
kind: CloudProfile
metadata:
  name: aws
spec:
  type: aws
  kubernetes:
    versions:
    - version: 1.27.3
    - version: 1.26.8
      expirationDate: "2022-10-31T23:59:59Z"
  machineImages:
  - name: coreos
    versions:
    - version: 2135.6.0
  machineTypes:
  - name: m5.large
    cpu: "2"
    gpu: "0"
    memory: 8Gi
    usable: true
  volumeTypes:
  - name: gp2
    class: standard
    usable: true
  - name: io1
    class: premium
    usable: true
  regions:
  - name: eu-central-1
    zones:
    - name: eu-central-1a
    - name: eu-central-1b
    - name: eu-central-1c
  providerConfig:
    apiVersion: aws.provider.extensions.gardener.cloud/v1alpha1
    kind: CloudProfileConfig
    machineImages:
    - name: coreos
      versions:
      - version: 2135.6.0
        regions:
        - name: eu-central-1
          ami: ami-034fd8c3f4026eb39
          # architecture: amd64 # optional
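The requirement that every offered machine image version needs an AMI mapping can be checked before applying a CloudProfile. The following is a minimal sketch that works on the manifest loaded into a plain dictionary (for example with a YAML parser of your choice); the function name `missing_ami_mappings` is hypothetical and not part of Gardener, which performs its own validation.

```python
def missing_ami_mappings(cloud_profile: dict) -> list[tuple[str, str]]:
    """Return (image name, version) pairs offered without any region/AMI mapping."""
    spec = cloud_profile["spec"]
    mapped = {
        (image["name"], version["version"])
        for image in spec.get("providerConfig", {}).get("machineImages", [])
        for version in image.get("versions", [])
        if version.get("regions")  # at least one region with an AMI
    }
    offered = [
        (image["name"], version["version"])
        for image in spec.get("machineImages", [])
        for version in image.get("versions", [])
    ]
    return [item for item in offered if item not in mapped]
```

For the example manifest above the result is an empty list, because the only offered version, 2135.6.0 of coreos, is mapped to an AMI in eu-central-1.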
Seed resource
This provider extension does not support any provider configuration for the Seed’s .spec.provider.providerConfig field.
However, it supports managing backup infrastructure, i.e., you can specify configuration for the .spec.backup field.
Backup configuration
Please find below an example Seed manifest (partly) that configures backups.
As you can see, the location/region where the backups will be stored can be different from the region where the seed cluster is running.
apiVersion: v1
kind: Secret
metadata:
  name: backup-credentials
  namespace: garden
type: Opaque
data:
  accessKeyID: base64(access-key-id)
  secretAccessKey: base64(secret-access-key)
---
apiVersion: core.gardener.cloud/v1beta1
kind: Seed
metadata:
  name: my-seed
spec:
  provider:
    type: aws
    region: eu-west-1
  backup:
    provider: aws
    region: eu-central-1
    secretRef:
      name: backup-credentials
      namespace: garden
  ...
Please look up https://docs.aws.amazon.com/general/latest/gr/aws-sec-cred-types.html#access-keys-and-secret-access-keys as well.
Permissions for AWS IAM user
Please make sure that the provided credentials have the correct privileges. You can use the following AWS IAM policy document and attach it to the IAM user backed by the credentials you provided (please check the official AWS documentation as well):
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": "s3:*",
"Resource": "*"
}
]
}
7 - Usage
Using the AWS provider extension with Gardener as an end-user
The core.gardener.cloud/v1beta1.Shoot resource declares a few fields that are meant to contain provider-specific configuration.
In this document we describe what this configuration looks like for AWS and provide an example Shoot manifest with minimal configuration that you can use to create an AWS cluster (modulo the landscape-specific information like cloud profile names, secret binding names, etc.).
Provider Secret Data
Every shoot cluster references a SecretBinding or a CredentialsBinding which itself references a Secret, and this Secret contains the provider credentials of your AWS account.
This Secret must look as follows:
apiVersion: v1
kind: Secret
metadata:
  name: core-aws
  namespace: garden-dev
type: Opaque
data:
  accessKeyID: base64(access-key-id)
  secretAccessKey: base64(secret-access-key)
The AWS documentation explains the necessary steps to enable programmatic access, i.e. create an access key ID and a secret access key, for the user of your choice.
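The `base64(...)` placeholders in the Secret stand for the base64-encoded credential values. A minimal sketch of producing them with the Python standard library follows; the helper name `secret_data` and the sample values are placeholders for illustration only.

```python
import base64


def secret_data(access_key_id: str, secret_access_key: str) -> dict[str, str]:
    """Return the base64-encoded values for the Secret's data section."""

    def encode(value: str) -> str:
        return base64.b64encode(value.encode("utf-8")).decode("ascii")

    return {
        "accessKeyID": encode(access_key_id),
        "secretAccessKey": encode(secret_access_key),
    }


# Placeholder credentials, not real ones.
data = secret_data("AKIAEXAMPLE", "example-secret")
```

Take care not to include a trailing newline in the values before encoding, as it would become part of the credential.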
⚠️ For security reasons, we recommend creating a dedicated user with programmatic access only. Please avoid re-using an IAM user which has access to the AWS console (human user).
⚠️ Depending on your AWS API usage it can be problematic to reuse the same AWS Account for different Shoot clusters in the same region due to rate limits. Please consider spreading your Shoots over multiple AWS Accounts if you are hitting those limits.
Permissions
Please make sure that the provided credentials have the correct privileges. You can use the following AWS IAM policy document and attach it to the IAM user backed by the credentials you provided (please check the official AWS documentation as well):
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": "autoscaling:*",
"Resource": "*"
},
{
"Effect": "Allow",
"Action": "ec2:*",
"Resource": "*"
},
{
"Effect": "Allow",
"Action": "elasticloadbalancing:*",
"Resource": "*"
},
{
"Action": [
"iam:GetInstanceProfile",
"iam:GetPolicy",
"iam:GetPolicyVersion",
"iam:GetRole",
"iam:GetRolePolicy",
"iam:ListPolicyVersions",
"iam:ListRolePolicies",
"iam:ListAttachedRolePolicies",
"iam:ListInstanceProfilesForRole",
"iam:CreateInstanceProfile",
"iam:CreatePolicy",
"iam:CreatePolicyVersion",
"iam:CreateRole",
"iam:CreateServiceLinkedRole",
"iam:AddRoleToInstanceProfile",
"iam:AttachRolePolicy",
"iam:DetachRolePolicy",
"iam:RemoveRoleFromInstanceProfile",
"iam:DeletePolicy",
"iam:DeletePolicyVersion",
"iam:DeleteRole",
"iam:DeleteRolePolicy",
"iam:DeleteInstanceProfile",
"iam:PutRolePolicy",
"iam:PassRole",
"iam:UpdateAssumeRolePolicy"
],
"Effect": "Allow",
"Resource": "*"
},
// The following permission set is only needed, if AWS Load Balancer controller is enabled (see ControlPlaneConfig)
{
"Effect": "Allow",
"Action": [
"cognito-idp:DescribeUserPoolClient",
"acm:ListCertificates",
"acm:DescribeCertificate",
"iam:ListServerCertificates",
"iam:GetServerCertificate",
"waf-regional:GetWebACL",
"waf-regional:GetWebACLForResource",
"waf-regional:AssociateWebACL",
"waf-regional:DisassociateWebACL",
"wafv2:GetWebACL",
"wafv2:GetWebACLForResource",
"wafv2:AssociateWebACL",
"wafv2:DisassociateWebACL",
"shield:GetSubscriptionState",
"shield:DescribeProtection",
"shield:CreateProtection",
"shield:DeleteProtection"
],
"Resource": "*"
}
]
}
InfrastructureConfig
The infrastructure configuration mainly describes what the network layout looks like in order to create the shoot worker nodes in a later step; thus, it prepares everything relevant to create VMs, load balancers, volumes, etc.
An example InfrastructureConfig for the AWS extension looks as follows:
apiVersion: aws.provider.extensions.gardener.cloud/v1alpha1
kind: InfrastructureConfig
enableECRAccess: true
dualStack:
  enabled: false
networks:
  vpc: # specify either 'id' or 'cidr'
    # id: vpc-123456
    cidr: 10.250.0.0/16
    # gatewayEndpoints:
    # - s3
  zones:
  - name: eu-west-1a
    internal: 10.250.112.0/22
    public: 10.250.96.0/22
    workers: 10.250.0.0/19
    # elasticIPAllocationID: eipalloc-123456
ignoreTags:
  keys: # individual ignored tag keys
  - SomeCustomKey
  - AnotherCustomKey
  keyPrefixes: # ignored tag key prefixes
  - user.specific/prefix/
The enableECRAccess flag specifies whether the AWS IAM role policy attached to all worker nodes of the cluster shall contain permissions to access the Elastic Container Registry of the respective AWS account.
If the flag is not provided it is defaulted to true.
Please note that if the iamInstanceProfile is set for a worker pool in the WorkerConfig (see below) then enableECRAccess does not have any effect.
It only applies for those worker pools whose iamInstanceProfile is not set.
The default AWS IAM policy document used for the instance profiles looks as follows:
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"ec2:DescribeInstances"
],
"Resource": [
"*"
]
},
// Only if `.enableECRAccess` is `true`.
{
"Effect": "Allow",
"Action": [
"ecr:GetAuthorizationToken",
"ecr:BatchCheckLayerAvailability",
"ecr:GetDownloadUrlForLayer",
"ecr:GetRepositoryPolicy",
"ecr:DescribeRepositories",
"ecr:ListImages",
"ecr:BatchGetImage"
],
"Resource": [
"*"
]
}
]
}
The dualStack.enabled flag specifies whether dual-stack or IPv4-only should be supported by the infrastructure.
When the flag is set to true an Amazon-provided IPv6 CIDR block will be attached to the VPC.
All subnets will receive a /64 block from it and a route entry is added to the main route table to route all IPv6 traffic over the IGW.
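To illustrate the relation between the VPC's IPv6 block and the /64 blocks of the subnets: an Amazon-provided IPv6 CIDR block of a VPC has a /56 prefix, so it can be split into 256 blocks of size /64. The sketch below only enumerates such blocks; the helper name `ipv6_subnet_blocks` is hypothetical, the prefix is taken from the IPv6 documentation range, and which block is assigned to which subnet is decided by the extension and not modelled here.

```python
import ipaddress


def ipv6_subnet_blocks(vpc_ipv6_cidr: str, count: int) -> list[ipaddress.IPv6Network]:
    """Return the first `count` /64 blocks contained in the VPC's IPv6 CIDR."""
    network = ipaddress.ip_network(vpc_ipv6_cidr)
    blocks = network.subnets(new_prefix=64)
    return [next(blocks) for _ in range(count)]


# Example /56 from the documentation prefix 2001:db8::/32 (not a real VPC block).
for block in ipv6_subnet_blocks("2001:db8:1234:1a00::/56", 3):
    print(block)
```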
The networks.vpc section describes whether you want to create the shoot cluster in an already existing VPC or whether to create a new one:
- If networks.vpc.id is given then you have to specify the VPC ID of the existing VPC that was created by other means (manually, other tooling, …).
  Please make sure that the VPC has an internet gateway attached - the AWS controller won’t create one automatically for existing VPCs. To make sure the nodes are able to join and operate in your cluster properly, please make sure that your VPC has enabled DNS Support, explicitly the attributes enableDnsHostnames and enableDnsSupport must be set to true.
- If networks.vpc.cidr is given then you have to specify the VPC CIDR of a new VPC that will be created during shoot creation.
  You can freely choose a private CIDR range.
- Either networks.vpc.id or networks.vpc.cidr must be present, but not both at the same time.
- networks.vpc.gatewayEndpoints is optional. If specified then each item is used as service name in a corresponding Gateway VPC Endpoint.
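The networks.zones section contains configuration for resources you want to create or use in availability zones.
For every zone, the AWS extension creates three subnets:
- The internal subnet is used for internal AWS load balancers.
- The public subnet is used for public AWS load balancers.
- The workers subnet is used for all shoot worker nodes.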
For every subnet, you have to specify a CIDR range contained in the VPC CIDR specified above, or the VPC CIDR of your already existing VPC.
You can freely choose these CIDRs and it is your responsibility to properly design the network layout to suit your needs.
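Since the network layout is your responsibility, it can help to check it before creating the shoot. The following minimal sketch verifies that all subnet CIDRs of a zone are contained in the VPC CIDR and do not overlap each other; the function name `validate_zone_layout` is hypothetical, and the extension may perform further checks that are not modelled here.

```python
import ipaddress
from itertools import combinations


def validate_zone_layout(vpc_cidr: str, subnets: dict[str, str]) -> list[str]:
    """Return a list of problems found in the subnet layout (empty if none)."""
    vpc = ipaddress.ip_network(vpc_cidr)
    networks = {name: ipaddress.ip_network(cidr) for name, cidr in subnets.items()}
    problems = []
    for name, net in networks.items():
        if not net.subnet_of(vpc):
            problems.append(f"{name} {net} is not contained in VPC {vpc}")
    for (a, net_a), (b, net_b) in combinations(networks.items(), 2):
        if net_a.overlaps(net_b):
            problems.append(f"{a} {net_a} overlaps {b} {net_b}")
    return problems


# Values from the example InfrastructureConfig above.
print(validate_zone_layout(
    "10.250.0.0/16",
    {"internal": "10.250.112.0/22", "public": "10.250.96.0/22", "workers": "10.250.0.0/19"},
))  # []
```

When using several zones, pass the subnets of all zones at once (with distinct names) so that overlaps across zones are detected as well.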
Also, the AWS extension creates a dedicated NAT gateway for each zone.
By default, it also creates a corresponding Elastic IP that it attaches to this NAT gateway and which is used for egress traffic.
The elasticIPAllocationID field allows you to specify the ID of an existing Elastic IP allocation in case you want to bring your own.
If provided, no new Elastic IP will be created and, instead, the Elastic IP specified by you will be used.
⚠️ If you change this field for an already existing infrastructure then it will disrupt egress traffic while AWS applies this change.
The reason is that the NAT gateway must be recreated with the new Elastic IP association.
Also, please note that the existing Elastic IP will be permanently deleted if it was earlier created by the AWS extension.
You can configure Gateway VPC Endpoints by adding items in the optional list networks.vpc.gatewayEndpoints. Each item in the list is used as a service name and a corresponding endpoint is created for it. All created endpoints point to the service within the cluster’s region. For example, consider this (partial) shoot config:
spec:
  region: eu-central-1
  provider:
    type: aws
    infrastructureConfig:
      apiVersion: aws.provider.extensions.gardener.cloud/v1alpha1
      kind: InfrastructureConfig
      networks:
        vpc:
          gatewayEndpoints:
          - s3
The service name of the S3 Gateway VPC Endpoint in this example is com.amazonaws.eu-central-1.s3.
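The naming pattern from this example can be expressed as a one-line helper. This is only a sketch of the pattern visible in the example (region and list item combined into the service name); the function name `gateway_endpoint_service_name` is hypothetical.

```python
def gateway_endpoint_service_name(region: str, service: str) -> str:
    """Build the Gateway VPC Endpoint service name for a region and list item."""
    return f"com.amazonaws.{region}.{service}"


print(gateway_endpoint_service_name("eu-central-1", "s3"))  # com.amazonaws.eu-central-1.s3
```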
If you want to use multiple availability zones then add a second, third, … entry to the networks.zones[] list and properly specify the AZ name in networks.zones[].name.
Apart from the VPC and the subnets the AWS extension will also create DHCP options and an internet gateway (only if a new VPC is created), routing tables, security groups, elastic IPs, NAT gateways, EC2 key pairs, IAM roles, and IAM instance profiles.
The ignoreTags section allows configuring which resource tags on AWS resources managed by Gardener should be ignored during
infrastructure reconciliation. By default, all tags that are added outside of Gardener’s
reconciliation will be removed during the next reconciliation. This field allows users and automation to add
custom tags on AWS resources created and managed by Gardener without losing them on the next reconciliation.
Tags can be ignored either by specifying exact key values (ignoreTags.keys) or key prefixes (ignoreTags.keyPrefixes).
In both cases it is forbidden to ignore the Name tag or any tag starting with kubernetes.io or gardener.cloud.
Please note though, that the tags are only ignored on resources created on behalf of the Infrastructure CR (i.e. VPC,
subnets, security groups, keypair, etc.), while tags on machines, volumes, etc. are not in the scope of this controller.
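The matching rules described above can be sketched as follows. This is an illustration of the described semantics only, not the extension's actual implementation; the function names are hypothetical, and the real validation may be stricter than the simple check shown here.

```python
FORBIDDEN_KEY = "Name"
FORBIDDEN_PREFIXES = ("kubernetes.io", "gardener.cloud")


def is_ignored(tag_key: str, keys: list[str], key_prefixes: list[str]) -> bool:
    """True if the tag key matches an exact key or starts with a configured prefix."""
    return tag_key in keys or any(tag_key.startswith(p) for p in key_prefixes)


def forbidden_entries(keys: list[str], key_prefixes: list[str]) -> list[str]:
    """Return configured entries that name the Name tag or a reserved prefix."""
    return [
        entry
        for entry in [*keys, *key_prefixes]
        if entry == FORBIDDEN_KEY or entry.startswith(FORBIDDEN_PREFIXES)
    ]


# Values from the example InfrastructureConfig above.
keys = ["SomeCustomKey", "AnotherCustomKey"]
prefixes = ["user.specific/prefix/"]
print(is_ignored("user.specific/prefix/team", keys, prefixes))  # True
```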
ControlPlaneConfig
The control plane configuration mainly contains values for the AWS-specific control plane components.
Today, the only component deployed by the AWS extension is the cloud-controller-manager.
An example ControlPlaneConfig for the AWS extension looks as follows:
apiVersion: aws.provider.extensions.gardener.cloud/v1alpha1
kind: ControlPlaneConfig
cloudControllerManager:
  # featureGates:
  #   SomeKubernetesFeature: true
  useCustomRouteController: true
# loadBalancerController:
#   enabled: true
#   ingressClassName: alb
# ipamController:
#   enabled: true
storage:
  managedDefaultClass: false
The cloudControllerManager.featureGates contains a map of explicitly enabled or disabled feature gates.
For production usage it’s not recommended to use this field at all as you can enable alpha features or disable beta/stable features, potentially impacting the cluster stability.
If you don’t want to configure anything for the cloudControllerManager simply omit the key in the YAML specification.
The cloudControllerManager.useCustomRouteController controls if the custom routes controller should be enabled.
If enabled, it will add routes to the pod CIDRs for all nodes in the route tables for all zones.
The storage.managedDefaultClass controls if the default storage / volume snapshot classes are marked as default by Gardener. Set it to false to mark another storage / volume snapshot class as default without Gardener overwriting this change. If unset, this field defaults to true.
If the AWS Load Balancer Controller should be deployed, set loadBalancerController.enabled to true.
In this case, it is assumed that an IngressClass named alb is created by the user.
You can overwrite the name by setting loadBalancerController.ingressClassName.
Please note that currently only the “instance” mode is supported.
Examples for Ingress and Service managed by the AWS Load Balancer Controller:
- Prerequisites
Make sure you have created an IngressClass. For more details about parameters, please see AWS Load Balancer Controller - IngressClass
apiVersion: networking.k8s.io/v1
kind: IngressClass
metadata:
  name: alb # default name if not specified by `loadBalancerController.ingressClassName`
spec:
  controller: ingress.k8s.aws/alb
- Ingress
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  namespace: default
  name: echoserver
  annotations:
    # complete set of annotations: https://kubernetes-sigs.github.io/aws-load-balancer-controller/latest/guide/ingress/annotations/
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/target-type: instance # target-type "ip" NOT supported in Gardener
spec:
  ingressClassName: alb
  rules:
    - http:
        paths:
        - path: /
          pathType: Prefix
          backend:
            service:
              name: echoserver
              port:
                number: 80
For more details see AWS Load Balancer Documentation - Ingress Specification
- Service of Type LoadBalancer
This can be used to create a Network Load Balancer (NLB).
apiVersion: v1
kind: Service
metadata:
  annotations:
    # complete set of annotations: https://kubernetes-sigs.github.io/aws-load-balancer-controller/latest/guide/service/annotations/
    service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: instance # target-type "ip" NOT supported in Gardener
    service.beta.kubernetes.io/aws-load-balancer-scheme: internet-facing
  name: ingress-nginx-controller
  namespace: ingress-nginx
  ...
spec:
  ...
  type: LoadBalancer
  loadBalancerClass: service.k8s.aws/nlb # mandatory to be managed by AWS Load Balancer Controller (otherwise the Cloud Controller Manager will act on it)
For more details see AWS Load Balancer Documentation - Network Load Balancer
⚠️ When using Network Load Balancers (NLB) as internal load balancers, it is crucial to add the annotation service.beta.kubernetes.io/aws-load-balancer-target-group-attributes: preserve_client_ip.enabled=false. Without this annotation, if a request is routed by the NLB to the same target instance from which it originated, the client IP and destination IP will be identical. This situation, known as the hair-pinning effect, will prevent the request from being processed.
WorkerConfig
The AWS extension supports volume encryption as well as additional data volumes per machine.
For each data volume, you have to specify a name.
Unless stated otherwise, all disks (root and data volumes) are encrypted by default.
Please make sure that your instance type supports encryption.
If it does not, you have to disable encryption (which is enabled by default) by setting volume.encrypted to false (see the YAML snippet below).
The following YAML is a snippet of a Shoot resource:
spec:
  provider:
    workers:
    - name: cpu-worker
      ...
      volume:
        type: gp2
        size: 20Gi
        encrypted: false
      dataVolumes:
      - name: kubelet-dir
        type: gp2
        size: 25Gi
        encrypted: true
Note: The AWS extension does not support encryption of EBS volumes (root and data volumes) with a customer-managed CMK. Support for customer-managed CMKs is out of scope for now; only the AWS-managed CMK is supported.
Additionally, it is possible to provide further AWS-specific values for configuring the worker pools. The additional configuration must be specified in the providerConfig field of the respective worker.
spec:
  provider:
    workers:
    - name: cpu-worker
      ...
      providerConfig:
        # AWS worker config
The configuration is evaluated when provider-aws reconciles the worker pools of the respective shoot.
An example WorkerConfig for the AWS extension looks as follows:
spec:
  provider:
    workers:
    - name: cpu-worker
      ...
      providerConfig:
        apiVersion: aws.provider.extensions.gardener.cloud/v1alpha1
        kind: WorkerConfig
        volume:
          iops: 10000
          throughput: 200
        dataVolumes:
        - name: kubelet-dir
          iops: 12345
          throughput: 150
          snapshotID: snap-1234
        iamInstanceProfile: # (specify either ARN or name)
          name: my-profile
          # arn: my-instance-profile-arn
        instanceMetadataOptions:
          httpTokens: required
          httpPutResponseHopLimit: 2
        nodeTemplate: # (to be specified only if the node capacity would be different from cloudprofile info during runtime)
          capacity:
            cpu: 2 # inherited from pool's machine type if un-specified
            gpu: 0 # inherited from pool's machine type if un-specified
            memory: 50Gi # inherited from pool's machine type if un-specified
            ephemeral-storage: 10Gi # override to specify explicit ephemeral-storage for scale from zero
            resource.com/dongle: 4 # Example of a custom, extended resource.
The .volume.iops is the number of I/O operations per second (IOPS) that the volume supports.
For the io1 and gp3 volume types, this represents the number of IOPS that are provisioned for the volume.
For the gp2 volume type, this represents the baseline performance of the volume and the rate at which the volume accumulates I/O credits for bursting. For more information about General Purpose SSD baseline performance, I/O credits, IOPS range and bursting, see Amazon EBS Volume Types (http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EBSVolumeTypes.html) in the Amazon Elastic Compute Cloud User Guide.
Constraint: IOPS should be a positive value. Validation of IOPS (i.e. whether it is allowed and within the specified range for a particular volume type) is done on the AWS side.
The .volume.throughput is the throughput that the volume supports, in MiB/s. As of 16th Aug 2022, this parameter is valid only for gp3 volume types, and the provider returns an error if it is specified for other volume types. Its current range is from 125 MiB/s to 1000 MiB/s. To learn more about throughput and its range, see the official AWS documentation.
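To illustrate how the two volume settings fit together, here is a minimal sketch of a worker pool with a gp3 root volume that sets both IOPS and throughput. The concrete numbers are illustrative assumptions, not recommendations; the final validation happens on the AWS side.

```yaml
spec:
  provider:
    workers:
    - name: cpu-worker
      volume:
        type: gp3
        size: 50Gi
      providerConfig:
        apiVersion: aws.provider.extensions.gardener.cloud/v1alpha1
        kind: WorkerConfig
        volume:
          iops: 4000      # illustrative value, validated on the AWS side
          throughput: 250 # MiB/s, only valid for gp3 volumes
```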
The .dataVolumes can optionally contain configurations for the data volumes stated in the Shoot specification in the .spec.provider.workers[].dataVolumes list.
The .name must match the name of the data volume in the shoot.
It is also possible to provide a snapshot ID, which allows restoring the data volume from an existing snapshot.
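The name matching described above can be sketched as follows. The data volume is declared in the Shoot worker, and the WorkerConfig entry refers to it by the same name. The snapshot ID is a placeholder, and the volume type and size are illustrative assumptions.

```yaml
spec:
  provider:
    workers:
    - name: cpu-worker
      dataVolumes:
      - name: kubelet-dir       # name defined in the Shoot specification
        type: gp3
        size: 25Gi
      providerConfig:
        apiVersion: aws.provider.extensions.gardener.cloud/v1alpha1
        kind: WorkerConfig
        dataVolumes:
        - name: kubelet-dir     # must match the name above
          snapshotID: snap-1234 # placeholder snapshot ID
```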
The iamInstanceProfile section lets you specify either the name or the ARN (but not both) of the IAM instance profile that should be used for this worker pool.
If not specified, a dedicated IAM instance profile created by the infrastructure controller is used (see above).
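As a small sketch of the ARN variant, the fragment below references an existing instance profile by ARN instead of by name. The account ID and profile name are hypothetical.

```yaml
providerConfig:
  apiVersion: aws.provider.extensions.gardener.cloud/v1alpha1
  kind: WorkerConfig
  iamInstanceProfile:
    # name: my-profile  # alternative: reference the profile by name instead
    arn: arn:aws:iam::123456789012:instance-profile/my-profile # hypothetical account ID and profile name
```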
The instanceMetadataOptions controls access to the instance metadata service (IMDS) for members of the worker. You can do the following operations:
- access IMDSv1 (default)
- access IMDSv2 - httpPutResponseHopLimit >= 2
- access IMDSv2 only (restrict access to IMDSv1) - httpPutResponseHopLimit >= 2, httpTokens = "required"
- disable access to IMDS - httpTokens = "required"
Note: The accessibility of IMDS discussed in the previous point is described from the point of view of containers NOT running in the host network.
By default, IMDSv2 is already enabled on the host network (but not accessible from inside the pods).
It is currently not possible to create a VM with complete restriction to the IMDS service. It is however possible to restrict access from inside the pods by setting httpTokens to required and not setting httpPutResponseHopLimit (or setting it to 1).
You can find more information regarding the options in the AWS documentation.
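The restriction for pods described above could look like the following WorkerConfig fragment. It is a sketch of that one case only; the other access modes differ in the hop limit and in whether httpTokens is set.

```yaml
providerConfig:
  apiVersion: aws.provider.extensions.gardener.cloud/v1alpha1
  kind: WorkerConfig
  instanceMetadataOptions:
    httpTokens: required       # only IMDSv2 requests are accepted
    httpPutResponseHopLimit: 1 # IMDS responses do not reach containers outside the host network
```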
cpuOptions grants more fine-grained control over the worker’s CPU configuration. It has two attributes:
- coreCount: Specify a custom amount of cores the instance should be configured with.
- threadsPerCore: How many threads should there be on each core. Set to 1 to disable multi-threading.
Note that if you decide to configure cpuOptions, both these values need to be provided. For a list of valid combinations of these values, refer to the AWS documentation.
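A minimal sketch of a WorkerConfig that disables multi-threading follows. The coreCount value is an illustrative assumption; whether a given combination is accepted depends on the machine type of the worker pool, so check it against the AWS documentation first.

```yaml
providerConfig:
  apiVersion: aws.provider.extensions.gardener.cloud/v1alpha1
  kind: WorkerConfig
  cpuOptions:
    coreCount: 1      # illustrative value, must be valid for the pool's machine type
    threadsPerCore: 1 # 1 disables multi-threading
```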
Example Shoot manifest (one availability zone)
Please find below an example Shoot manifest for one availability zone:
apiVersion: core.gardener.cloud/v1beta1
kind: Shoot
metadata:
  name: johndoe-aws
  namespace: garden-dev
spec:
  cloudProfile:
    name: aws
  region: eu-central-1
  secretBindingName: core-aws
  provider:
    type: aws
    infrastructureConfig:
      apiVersion: aws.provider.extensions.gardener.cloud/v1alpha1
      kind: InfrastructureConfig
      networks:
        vpc:
          cidr: 10.250.0.0/16
        zones:
        - name: eu-central-1a
          internal: 10.250.112.0/22
          public: 10.250.96.0/22
          workers: 10.250.0.0/19
    controlPlaneConfig:
      apiVersion: aws.provider.extensions.gardener.cloud/v1alpha1
      kind: ControlPlaneConfig
    workers:
    - name: worker-xoluy
      machine:
        type: m5.large
      minimum: 2
      maximum: 2
      volume:
        size: 50Gi
        type: gp2
      # The following provider config is valid if the volume type is `io1`.
      # providerConfig:
      #   apiVersion: aws.provider.extensions.gardener.cloud/v1alpha1
      #   kind: WorkerConfig
      #   volume:
      #     iops: 10000
      zones:
      - eu-central-1a
  networking:
    nodes: 10.250.0.0/16
    type: calico
  kubernetes:
    version: 1.28.2
  maintenance:
    autoUpdate:
      kubernetesVersion: true
      machineImageVersion: true
  addons:
    kubernetesDashboard:
      enabled: true
    nginxIngress:
      enabled: true
Example Shoot manifest (three availability zones)
Please find below an example Shoot manifest for three availability zones:
apiVersion: core.gardener.cloud/v1beta1
kind: Shoot
metadata:
  name: johndoe-aws
  namespace: garden-dev
spec:
  cloudProfile:
    name: aws
  region: eu-central-1
  secretBindingName: core-aws
  provider:
    type: aws
    infrastructureConfig:
      apiVersion: aws.provider.extensions.gardener.cloud/v1alpha1
      kind: InfrastructureConfig
      networks:
        vpc:
          cidr: 10.250.0.0/16
        zones:
        - name: eu-central-1a
          workers: 10.250.0.0/26
          public: 10.250.96.0/26
          internal: 10.250.112.0/26
        - name: eu-central-1b
          workers: 10.250.0.64/26
          public: 10.250.96.64/26
          internal: 10.250.112.64/26
        - name: eu-central-1c
          workers: 10.250.0.128/26
          public: 10.250.96.128/26
          internal: 10.250.112.128/26
    controlPlaneConfig:
      apiVersion: aws.provider.extensions.gardener.cloud/v1alpha1
      kind: ControlPlaneConfig
    workers:
    - name: worker-xoluy
      machine:
        type: m5.large
      minimum: 3
      maximum: 9
      volume:
        size: 50Gi
        type: gp2
      zones:
      - eu-central-1a
      - eu-central-1b
      - eu-central-1c
  networking:
    nodes: 10.250.0.0/16
    type: calico
  kubernetes:
    version: 1.28.2
  maintenance:
    autoUpdate:
      kubernetesVersion: true
      machineImageVersion: true
  addons:
    kubernetesDashboard:
      enabled: true
    nginxIngress:
      enabled: true
Example Shoot manifest (IPv6)
Please find below an example Shoot manifest for an IPv6 shoot cluster:
apiVersion: core.gardener.cloud/v1beta1
kind: Shoot
metadata:
  name: johndoe-aws-ipv6
  namespace: garden-dev
spec:
  cloudProfile:
    name: aws
  region: eu-central-1
  secretBindingName: core-aws
  provider:
    type: aws
    infrastructureConfig:
      apiVersion: aws.provider.extensions.gardener.cloud/v1alpha1
      kind: InfrastructureConfig
      networks:
        vpc:
          cidr: 10.250.0.0/16
        zones:
        - name: eu-central-1a
          public: 10.250.96.0/22
          internal: 10.250.112.0/22
    controlPlaneConfig:
      apiVersion: aws.provider.extensions.gardener.cloud/v1alpha1
      kind: ControlPlaneConfig
    workers:
    - ...
  networking:
    ipFamilies:
    - IPv6
    type: calico
  kubernetes:
    version: 1.28.2
  ...
  addons:
    kubernetesDashboard:
      enabled: true
    nginxIngress:
      enabled: false
CSI volume provisioners
Every AWS shoot cluster will be deployed with the AWS EBS CSI driver.
It is compatible with the legacy in-tree volume provisioner that was deprecated by the Kubernetes community and will be removed in future versions of Kubernetes.
End-users might want to update their custom StorageClasses to the new ebs.csi.aws.com provisioner.
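A custom StorageClass that uses the CSI provisioner could look like the sketch below. The class name and the parameters are illustrative assumptions; only the provisioner name is taken from the text above.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: custom-gp3 # hypothetical name
provisioner: ebs.csi.aws.com
parameters:
  type: gp3 # illustrative EBS volume type
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
```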
Node-specific Volume Limits
The Kubernetes scheduler allows a configurable limit for the number of volumes that can be attached to a node.
See https://k8s.io/docs/concepts/storage/storage-limits/#custom-limits.
CSI drivers usually have a different procedure for configuring this custom limit.
By default, the EBS CSI driver parses the machine type name and then decides the volume limit.
However, this is only a rough approximation and not good enough in most cases.
Specifying the volume attach limit via the command line flag (--volume-attach-limit) is currently the alternative until a more sophisticated solution presents itself (dynamically discovering the maximum number of attachable volumes per EC2 machine type, see also https://github.com/kubernetes-sigs/aws-ebs-csi-driver/issues/347).
The AWS extension allows the --volume-attach-limit flag of the EBS CSI driver to be configured via the aws.provider.extensions.gardener.cloud/volume-attach-limit annotation on the Shoot resource.
ℹ️ Please note: If the annotation is added to an existing Shoot, then reconciliation needs to be triggered manually (see Immediate reconciliation), as adding an annotation to a resource is not a change that leads to an increase of .metadata.generation in general.
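As a sketch, the annotation is set in the metadata of the Shoot resource. The limit value shown here is an illustrative assumption; annotation values have to be strings, hence the quotes.

```yaml
apiVersion: core.gardener.cloud/v1beta1
kind: Shoot
metadata:
  name: johndoe-aws
  namespace: garden-dev
  annotations:
    aws.provider.extensions.gardener.cloud/volume-attach-limit: "25" # illustrative value
```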
Other CSI options
The newer versions of the EBS CSI driver are not readily compatible with the use of XFS volumes on nodes using a kernel version <= 5.4.
A workaround was added that enables the use of a “legacy XFS” mode that introduces a backwards-compatible volume formatting for the older kernels.
You can enable this option for your shoot by annotating it with aws.provider.extensions.gardener.cloud/legacy-xfs=true.
ℹ️ Please note: If the annotation is added to an existing Shoot, then reconciliation needs to be triggered manually (see Immediate reconciliation), as adding an annotation to a resource is not a change that leads to an increase of .metadata.generation in general.
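Expressed declaratively, the legacy XFS annotation would appear in the Shoot metadata roughly as follows (shoot name and namespace reuse the example values from this page):

```yaml
apiVersion: core.gardener.cloud/v1beta1
kind: Shoot
metadata:
  name: johndoe-aws
  namespace: garden-dev
  annotations:
    aws.provider.extensions.gardener.cloud/legacy-xfs: "true"
```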
Kubernetes Versions per Worker Pool
This extension supports gardener/gardener’s WorkerPoolKubernetesVersion feature gate, i.e., having worker pools with overridden Kubernetes versions, since gardener-extension-provider-aws@v1.34.
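A worker pool with an overridden Kubernetes version can be sketched as below. The worker pool version is an illustrative assumption; which versions are permitted depends on the versions offered in the cloud profile and on Gardener's version skew rules.

```yaml
spec:
  kubernetes:
    version: 1.28.2 # control plane version
  provider:
    workers:
    - name: worker-xoluy
      kubernetes:
        version: 1.27.5 # illustrative override for this worker pool
```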
Shoot CA Certificate and ServiceAccount Signing Key Rotation
This extension supports gardener/gardener’s ShootCARotation and ShootSARotation feature gates since gardener-extension-provider-aws@v1.36.
Flow Infrastructure Reconciler
The extension offers two different reconciler implementations for the infrastructure resource:
- terraform-based
- native Go SDK based (dubbed the “flow”-based implementation)
The default implementation currently is the terraform reconciler, which uses https://github.com/gardener/terraformer as the backend for managing the shoot’s infrastructure.
The “flow” implementation is a newer implementation that tries to solve issues we faced with managing terraform infrastructure on Kubernetes. The goal is to have more control over the reconciliation process and to be able to fine-tune it. The implementation is completely backwards-compatible and offers a migration route from the legacy terraformer implementation.
For most users there will be no noticeable difference. However, for certain use cases, users may notice a slight deviation from the previous behavior. For example, with flow-based infrastructure users may be able to perform certain modifications to infrastructure resources without having them reconciled back by terraform. Operations that would degrade the shoot infrastructure are still expected to be reverted.
For the time being, to take advantage of the flow reconciler, users have to opt in by annotating the shoot manifest with aws.provider.extensions.gardener.cloud/use-flow="true". For existing shoots with this annotation, the migration will take place on the next infrastructure reconciliation (in the maintenance window or if other infrastructure changes are requested). The migration cannot be reverted.
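In a Shoot manifest, the opt-in could be expressed as in the sketch below. The shoot name and namespace reuse the example values from this page. Since the migration cannot be reverted, it is worth trying this on a non-critical shoot first.

```yaml
apiVersion: core.gardener.cloud/v1beta1
kind: Shoot
metadata:
  name: johndoe-aws
  namespace: garden-dev
  annotations:
    aws.provider.extensions.gardener.cloud/use-flow: "true"
```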