Provider Alicloud
Gardener extension controller for the Alibaba cloud provider
Project Gardener implements the automated management and operation of Kubernetes clusters as a service.
Its main principle is to leverage Kubernetes concepts for all of its tasks.
Recently, most of the vendor specific logic has been developed in-tree.
However, the project has grown to a size where it is very hard to extend, maintain, and test.
With GEP-1 we have proposed how the architecture can be changed in a way to support external controllers that contain their very own vendor specifics.
This way, we can keep Gardener core clean and independent.
This controller implements Gardener’s extension contract for the Alicloud provider.
An example for a ControllerRegistration
resource that can be used to register this controller to Gardener can be found here.
Please find more information regarding the extensibility concepts and a detailed proposal here.
Supported Kubernetes versions
This extension controller supports the following Kubernetes versions:
Version | Support | Conformance test results |
---|
Kubernetes 1.31 | 1.31.0+ | |
Kubernetes 1.30 | 1.30.0+ | |
Kubernetes 1.29 | 1.29.0+ | |
Kubernetes 1.28 | 1.28.0+ | |
Kubernetes 1.27 | 1.27.0+ | |
Kubernetes 1.26 | 1.26.0+ | |
Kubernetes 1.25 | 1.25.0+ | |
Please take a look here to see which versions are supported by Gardener in general.
How to start using or developing this extension controller locally
You can run the controller locally on your machine by executing make start
.
Static code checks and tests can be executed by running make verify
. We are using Go modules for Golang package dependency management and Ginkgo/Gomega for testing.
Feedback and Support
Feedback and contributions are always welcome. Please report bugs or suggestions as GitHub issues or join our Slack channel #gardener (please invite yourself to the Kubernetes workspace here).
Learn more!
Please find further resources about out project here:
1.1 - Create a Kubernetes Cluster on Alibaba Cloud with Gardener
Overview
Gardener allows you to create a Kubernetes cluster on different infrastructure providers. This tutorial will guide you through the process of creating a cluster on Alibaba Cloud.
Prerequisites
- You have created an Alibaba Cloud account.
- You have access to the Gardener dashboard and have permissions to create projects.
Steps
Go to the Gardener dashboard and create a project.
To be able to add shoot clusters to this project, you must first create a technical user on Alibaba Cloud with sufficient permissions.
Choose Secrets, then the plus icon and select AliCloud.
To copy the policy for Alibaba Cloud from the Gardener dashboard, click on the help icon for Alibaba Cloud secrets, and choose copy .
Create a custom policy in Alibaba Cloud:
Log on to your Alibaba account and choose RAM > Permissions > Policies.
Enter the name of your policy.
Select Script
.
Paste the policy that you copied from the Gardener dashboard to this custom policy.
Choose OK.
In the Alibaba Cloud console, create a new technical user:
Choose RAM > Users.
Choose Create User.
Enter a logon and display name for your user.
Select Open API Access.
Choose OK.
After the user is created, AccessKeyId
and AccessKeySecret
are generated and displayed. Remember to save them. The AccessKey
is used later to create secrets for Gardener.
Assign the policy you created to the technical user:
Choose RAM > Permissions > Grants.
Choose Grant Permission.
Select Alibaba Cloud Account.
Assign the policy you’ve created before to the technical user.
Create your secret.
- Type the name of your secret.
- Copy and paste the
Access Key ID
and Secret Access Key
you saved when you created the technical user on Alibaba Cloud. - Choose Add secret.
After completing these steps, you should see your newly created secret in the Infrastructure Secrets section.
To create a new cluster, choose Clusters and then the plus sign in the upper right corner.
In the Create Cluster section:
Select AliCloud in the Infrastructure tab.
Type the name of your cluster in the Cluster Details tab.
Choose the secret you created before in the Infrastructure Details tab.
Choose Create.
Wait for your cluster to get created.
Result
After completing the steps in this tutorial, you will be able to see and download the kubeconfig of your cluster. With it you can create shoot clusters on Alibaba Cloud.
The size of persistent volumes in your shoot cluster must at least be 20 GiB large. If you choose smaller sizes in your Kubernetes PV definition, the allocation of cloud disk space on Alibaba Cloud fails.
2 - Deployment
Deployment of the AliCloud provider extension
Disclaimer: This document is NOT a step by step installation guide for the AliCloud provider extension and only contains some configuration specifics regarding the installation of different components via the helm charts residing in the AliCloud provider extension repository.
gardener-extension-admission-alicloud
Authentication against the Garden cluster
There are several authentication possibilities depending on whether or not the concept of Virtual Garden is used.
Virtual Garden is not used, i.e., the runtime
Garden cluster is also the target
Garden cluster.
Automounted Service Account Token
The easiest way to deploy the gardener-extension-admission-alicloud
component will be to not provide kubeconfig
at all. This way in-cluster configuration and an automounted service account token will be used. The drawback of this approach is that the automounted token will not be automatically rotated.
This will allow for automatic rotation of the service account token by the kubelet
. The configuration can be achieved by setting both .Values.global.serviceAccountTokenVolumeProjection.enabled: true
and .Values.global.kubeconfig
in the respective chart’s values.yaml
file.
Virtual Garden is used, i.e., the runtime
Garden cluster is different from the target
Garden cluster.
Service Account
The easiest way to setup the authentication will be to create a service account and the respective roles will be bound to this service account in the target
cluster. Then use the generated service account token and craft a kubeconfig
which will be used by the workload in the runtime
cluster. This approach does not provide a solution for the rotation of the service account token. However, this setup can be achieved by setting .Values.global.virtualGarden.enabled: true
and following these steps:
- Deploy the
application
part of the charts in the target
cluster. - Get the service account token and craft the
kubeconfig
. - Set the crafted
kubeconfig
and deploy the runtime
part of the charts in the runtime
cluster.
Client Certificate
Another solution will be to bind the roles in the target
cluster to a User
subject instead of a service account and use a client certificate for authentication. This approach does not provide a solution for the client certificate rotation. However, this setup can be achieved by setting both .Values.global.virtualGarden.enabled: true
and .Values.global.virtualGarden.user.name
, then following these steps:
- Generate a client certificate for the
target
cluster for the respective user. - Deploy the
application
part of the charts in the target
cluster. - Craft a
kubeconfig
using the already generated client certificate. - Set the crafted
kubeconfig
and deploy the runtime
part of the charts in the runtime
cluster.
3 - Local Setup
admission-alicloud
admission-alicloud
is an admission webhook server which is responsible for the validation of the cloud provider (Alicloud in this case) specific fields and resources. The Gardener API server is cloud provider agnostic and it wouldn’t be able to perform similar validation.
Follow the steps below to run the admission webhook server locally.
Start the Gardener API server.
For details, check the Gardener local setup.
Start the webhook server
Make sure that the KUBECONFIG
environment variable is pointing to the local garden cluster.
Setup the ValidatingWebhookConfiguration
.
hack/dev-setup-admission-alicloud.sh
will configure the webhook Service which will allow the kube-apiserver of your local cluster to reach the webhook server. It will also apply the ValidatingWebhookConfiguration
manifest.
./hack/dev-setup-admission-alicloud.sh
You are now ready to experiment with the admission-alicloud
webhook server locally.
4 - Operations
Using the Alicloud provider extension with Gardener as operator
The core.gardener.cloud/v1beta1.CloudProfile
resource declares a providerConfig
field that is meant to contain provider-specific configuration.
The core.gardener.cloud/v1beta1.Seed
resource is structured similarly.
Additionally, it allows configuring settings for the backups of the main etcds’ data of shoot clusters control planes running in this seed cluster.
This document explains the necessary configuration for this provider extension. In addition, this document also describes how to enable the use of customized machine images for Alicloud.
CloudProfile
resource
This section describes, how the configuration for CloudProfile
looks like for Alicloud by providing an example CloudProfile
manifest with minimal configuration that can be used to allow the creation of Alicloud shoot clusters.
CloudProfileConfig
The cloud profile configuration contains information about the real machine image IDs in the Alicloud environment (AMIs).
You have to map every version that you specify in .spec.machineImages[].versions
here such that the Alicloud extension knows the AMI for every version you want to offer.
An example CloudProfileConfig
for the Alicloud extension looks as follows:
apiVersion: alicloud.provider.extensions.gardener.cloud/v1alpha1
kind: CloudProfileConfig
machineImages:
- name: coreos
versions:
- version: 2023.4.0
regions:
- name: eu-central-1
id: coreos_2023_4_0_64_30G_alibase_20190319.vhd
Example CloudProfile
manifest
Please find below an example CloudProfile
manifest:
apiVersion: core.gardener.cloud/v1beta1
kind: CloudProfile
metadata:
name: alicloud
spec:
type: alicloud
kubernetes:
versions:
- version: 1.27.3
- version: 1.26.8
expirationDate: "2022-10-31T23:59:59Z"
machineImages:
- name: coreos
versions:
- version: 2023.4.0
machineTypes:
- name: ecs.sn2ne.large
cpu: "2"
gpu: "0"
memory: 8Gi
volumeTypes:
- name: cloud_efficiency
class: standard
- name: cloud_essd
class: premium
regions:
- name: eu-central-1
zones:
- name: eu-central-1a
- name: eu-central-1b
providerConfig:
apiVersion: alicloud.provider.extensions.gardener.cloud/v1alpha1
kind: CloudProfileConfig
machineImages:
- name: coreos
versions:
- version: 2023.4.0
regions:
- name: eu-central-1
id: coreos_2023_4_0_64_30G_alibase_20190319.vhd
Enable customized machine images for the Alicloud extension
Customized machine images can be created for an Alicloud account and shared with other Alicloud accounts.
The same customized machine image has different image ID in different regions on Alicloud.
If you need to enable encrypted system disk
, you must provide customized machine images.
Administrators/Operators need to explicitly declare them per imageID per region as below:
machineImages:
- name: customized_coreos
regions:
- imageID: <image_id_in_eu_central_1>
region: eu-central-1
- imageID: <image_id_in_cn_shanghai>
region: cn-shanghai
...
version: 2191.4.1
...
End-users have to have the permission to use the customized image from its creator Alicloud account. To enable end-users to use customized images, the images are shared from Alicloud account of Seed operator with end-users’ Alicloud accounts. Administrators/Operators need to explicitly provide Seed operator’s Alicloud account access credentials (base64 encoded) as below:
machineImageOwnerSecret:
name: machine-image-owner
accessKeyID: <base64_encoded_access_key_id>
accessKeySecret: <base64_encoded_access_key_secret>
As a result, a Secret named machine-image-owner
by default will be created in namespace of Alicloud provider extension.
Operators should also maintain custom image IDs which are to be shared with end-users as below:
toBeSharedImageIDs:
- <image_id_1>
- <image_id_2>
- <image_id_3>
Example ControllerDeployment
manifest for enabling customized machine images
apiVersion: core.gardener.cloud/v1beta1
kind: ControllerDeployment
metadata:
name: extension-provider-alicloud
spec:
type: helm
providerConfig:
chart: |
H4sIFAAAAAAA/yk...
values:
config:
machineImageOwnerSecret:
accessKeyID: <base64_encoded_access_key_id>
accessKeySecret: <base64_encoded_access_key_secret>
toBeSharedImageIDs:
- <image_id_1>
- <image_id_2>
...
machineImages:
- name: customized_coreos
regions:
- imageID: <image_id_in_eu_central_1>
region: eu-central-1
- imageID: <image_id_in_cn_shanghai>
region: cn-shanghai
...
version: 2191.4.1
...
csi:
enableADController: true
resources:
limits:
cpu: 500m
memory: 1Gi
requests:
memory: 128Mi
Seed
resource
This provider extension does not support any provider configuration for the Seed
’s .spec.provider.providerConfig
field.
However, it supports to managing of backup infrastructure, i.e., you can specify a configuration for the .spec.backup
field.
Backup configuration
A Seed of type alicloud
can be configured to perform backups for the main etcds’ of the shoot clusters control planes using Alicloud Object Storage Service.
The location/region where the backups will be stored defaults to the region of the Seed (spec.provider.region
).
Please find below an example Seed
manifest (partly) that configures backups using Alicloud Object Storage Service.
---
apiVersion: core.gardener.cloud/v1beta1
kind: Seed
metadata:
name: my-seed
spec:
provider:
type: alicloud
region: cn-shanghai
backup:
provider: alicloud
secretRef:
name: backup-credentials
namespace: garden
...
An example of the referenced secret containing the credentials for the Alicloud Object Storage Service can be found in the example folder.
Permissions for Alicloud Object Storage Service
Please make sure the RAM user associated with the provided AccessKey pair has the following permission.
5 - Usage
Using the Alicloud provider extension with Gardener as end-user
The core.gardener.cloud/v1beta1.Shoot
resource declares a few fields that are meant to contain provider-specific configuration.
This document describes the configurable options for Alicloud and provides an example Shoot
manifest with minimal configuration that can be used to create an Alicloud cluster (modulo the landscape-specific information like cloud profile names, secret binding names, etc.).
Alicloud Provider Credentials
In order for Gardener to create a Kubernetes cluster using Alicloud infrastructure components, a Shoot has to provide credentials with sufficient permissions to the desired Alicloud project.
Every shoot cluster references a SecretBinding
or a CredentialsBinding
which itself references a Secret
, and this Secret
contains the provider credentials of the Alicloud project.
This Secret
must look as follows:
apiVersion: v1
kind: Secret
metadata:
name: core-alicloud
namespace: garden-dev
type: Opaque
data:
accessKeyID: base64(access-key-id)
accessKeySecret: base64(access-key-secret)
The SecretBinding
/CredentialsBinding
is configurable in the Shoot cluster with the field secretBindingName
/credentialsBindingName
.
The required credentials for the Alicloud project are an AccessKey Pair associated with a Resource Access Management (RAM) User.
A RAM user is a special account that can be used by services and applications to interact with Alicloud Cloud Platform APIs.
Applications can use AccessKey pair to authorize themselves to a set of APIs and perform actions within the permissions granted to the RAM user.
Make sure to create a Resource Access Management User, and create an AccessKey Pair that shall be used for the Shoot cluster.
Permissions
Please make sure the provided credentials have the correct privileges. You can use the following Alicloud RAM policy document and attach it to the RAM user backed by the credentials you provided.
Click to expand the Alicloud RAM policy document!
{
"Statement": [
{
"Action": [
"vpc:*"
],
"Effect": "Allow",
"Resource": [
"*"
]
},
{
"Action": [
"ecs:*"
],
"Effect": "Allow",
"Resource": [
"*"
]
},
{
"Action": [
"slb:*"
],
"Effect": "Allow",
"Resource": [
"*"
]
},
{
"Action": [
"ram:GetRole",
"ram:CreateRole",
"ram:CreateServiceLinkedRole"
],
"Effect": "Allow",
"Resource": [
"*"
]
},
{
"Action": [
"ros:*"
],
"Effect": "Allow",
"Resource": [
"*"
]
}
],
"Version": "1"
}
InfrastructureConfig
The infrastructure configuration mainly describes how the network layout looks like in order to create the shoot worker nodes in a later step, thus, prepares everything relevant to create VMs, load balancers, volumes, etc.
An example InfrastructureConfig
for the Alicloud extension looks as follows:
apiVersion: alicloud.provider.extensions.gardener.cloud/v1alpha1
kind: InfrastructureConfig
networks:
vpc: # specify either 'id' or 'cidr'
# id: my-vpc
cidr: 10.250.0.0/16
# gardenerManagedNATGateway: true
zones:
- name: eu-central-1a
workers: 10.250.1.0/24
# natGateway:
# eipAllocationID: eip-ufxsdg122elmszcg
The networks.vpc
section describes whether you want to create the shoot cluster in an already existing VPC or whether to create a new one:
- If
networks.vpc.id
is given then you have to specify the VPC ID of the existing VPC that was created by other means (manually, other tooling, …). - If
networks.vpc.cidr
is given then you have to specify the VPC CIDR of a new VPC that will be created during shoot creation.
You can freely choose a private CIDR range. - Either
networks.vpc.id
or networks.vpc.cidr
must be present, but not both at the same time. - When
networks.vpc.id
is present, in addition, you can also choose to set networks.vpc.gardenerManagedNATGateway
. It is by default false
. When it is set to true
,
Gardener will create an Enhanced NATGateway in the VPC and associate it with a VSwitch created in the first zone in the networks.zones
. - Please note that when
networks.vpc.id
is present, and networks.vpc.gardenerManagedNATGateway
is false
or not set, you have to manually create an Enhance NATGateway
and associate it with a VSwitch that you manually created. In this case, make sure the worker CIDRs in networks.zones
do not overlap with the one you created.
If a NATGateway is created manually and a shoot is created in the same VPC with networks.vpc.gardenerManagedNATGateway
set true
, you need to manually adjust the route rule accordingly.
You may refer to here.
The networks.zones
section describes which subnets you want to create in availability zones.
For every zone, the Alicloud extension creates one subnet:
- The
workers
subnet is used for all shoot worker nodes, i.e., VMs which later run your applications.
For every subnet, you have to specify a CIDR range contained in the VPC CIDR specified above, or the VPC CIDR of your already existing VPC.
You can freely choose these CIDR and it is your responsibility to properly design the network layout to suit your needs.
If you want to use multiple availability zones then add a second, third, … entry to the networks.zones[]
list and properly specify the AZ name in networks.zones[].name
.
Apart from the VPC and the subnets the Alicloud extension will also create a NAT gateway (only if a new VPC is created), a key pair, elastic IPs, VSwitches, a SNAT table entry, and security groups.
By default, the Alicloud extension will create a corresponding Elastic IP that it attaches to this NAT gateway and which is used for egress traffic.
The networks.zones[].natGateway.eipAllocationID
field allows you to specify the Elastic IP Allocation ID of an existing Elastic IP allocation in case you want to bring your own.
If provided, no new Elastic IP will be created and, instead, the Elastic IP specified by you will be used.
⚠️ If you change this field for an already existing infrastructure then it will disrupt egress traffic while Alicloud applies this change, because the NAT gateway must be recreated with the new Elastic IP association.
Also, please note that the existing Elastic IP will be permanently deleted if it was earlier created by the Alicloud extension.
ControlPlaneConfig
The control plane configuration mainly contains values for the Alicloud-specific control plane components.
Today, the Alicloud extension deploys the cloud-controller-manager
and the CSI controllers.
An example ControlPlaneConfig
for the Alicloud extension looks as follows:
apiVersion: alicloud.provider.extensions.gardener.cloud/v1alpha1
kind: ControlPlaneConfig
csi:
enableADController: true
# cloudControllerManager:
# featureGates:
# SomeKubernetesFeature: true
The csi.enableADController
is used as the value of environment DISK_AD_CONTROLLER, which is used for AliCloud csi-disk-plugin. This field is optional. When a new shoot is creatd, this field is automatically set true. For an existing shoot created in previous versions, it remains unchanged. If there are persistent volumes created before year 2021, please be cautious to set this field true because they may fail to mount to nodes.
The cloudControllerManager.featureGates
contains a map of explicitly enabled or disabled feature gates.
For production usage it’s not recommend to use this field at all as you can enable alpha features or disable beta/stable features, potentially impacting the cluster stability.
If you don’t want to configure anything for the cloudControllerManager
simply omit the key in the YAML specification.
WorkerConfig
The Alicloud extension does not support a specific WorkerConfig
. However, it supports additional data volumes (plus encryption) per machine.
By default (if not stated otherwise), all the disks are unencrypted.
For each data volume, you have to specify a name.
It also supports encrypted system disk.
However, only Customized image is currently supported to be used as a basic image for encrypted system disk.
Please be noted that the change of system disk encryption flag will cause reconciliation of a shoot, and it will result in nodes rolling update within the worker group.
The following YAML is a snippet of a Shoot
resource:
spec:
provider:
workers:
- name: cpu-worker
...
volume:
type: cloud_efficiency
size: 20Gi
encrypted: true
dataVolumes:
- name: kubelet-dir
type: cloud_efficiency
size: 25Gi
encrypted: true
Example Shoot
manifest (one availability zone)
Please find below an example Shoot
manifest for one availability zone:
apiVersion: core.gardener.cloud/v1beta1
kind: Shoot
metadata:
name: johndoe-alicloud
namespace: garden-dev
spec:
cloudProfileName: alicloud
region: eu-central-1
secretBindingName: core-alicloud
provider:
type: alicloud
infrastructureConfig:
apiVersion: alicloud.provider.extensions.gardener.cloud/v1alpha1
kind: InfrastructureConfig
networks:
vpc:
cidr: 10.250.0.0/16
zones:
- name: eu-central-1a
workers: 10.250.0.0/19
controlPlaneConfig:
apiVersion: alicloud.provider.extensions.gardener.cloud/v1alpha1
kind: ControlPlaneConfig
workers:
- name: worker-xoluy
machine:
type: ecs.sn2ne.large
minimum: 2
maximum: 2
volume:
size: 50Gi
type: cloud_efficiency
zones:
- eu-central-1a
networking:
nodes: 10.250.0.0/16
type: calico
kubernetes:
version: 1.28.2
maintenance:
autoUpdate:
kubernetesVersion: true
machineImageVersion: true
addons:
kubernetesDashboard:
enabled: true
nginxIngress:
enabled: true
Example Shoot
manifest (two availability zones)
Please find below an example Shoot
manifest for two availability zones:
apiVersion: core.gardener.cloud/v1beta1
kind: Shoot
metadata:
name: johndoe-alicloud
namespace: garden-dev
spec:
cloudProfileName: alicloud
region: eu-central-1
secretBindingName: core-alicloud
provider:
type: alicloud
infrastructureConfig:
apiVersion: alicloud.provider.extensions.gardener.cloud/v1alpha1
kind: InfrastructureConfig
networks:
vpc:
cidr: 10.250.0.0/16
zones:
- name: eu-central-1a
workers: 10.250.0.0/26
- name: eu-central-1b
workers: 10.250.0.64/26
controlPlaneConfig:
apiVersion: alicloud.provider.extensions.gardener.cloud/v1alpha1
kind: ControlPlaneConfig
workers:
- name: worker-xoluy
machine:
type: ecs.sn2ne.large
minimum: 2
maximum: 4
volume:
size: 50Gi
type: cloud_efficiency
# NOTE: Below comment is for the case when encrypted field of an existing shoot is updated from false to true.
# It will cause affected nodes to be rolling updated. Users must trigger a MAINTAIN operation of the shoot.
# Otherwise, the shoot will fail to reconcile.
# You could do it either via Dashboard or annotating the shoot with gardener.cloud/operation=maintain
encrypted: true
zones:
- eu-central-1a
- eu-central-1b
networking:
nodes: 10.250.0.0/16
type: calico
kubernetes:
version: 1.28.2
maintenance:
autoUpdate:
kubernetesVersion: true
machineImageVersion: true
addons:
kubernetesDashboard:
enabled: true
nginxIngress:
enabled: true
Kubernetes Versions per Worker Pool
This extension supports gardener/gardener
’s WorkerPoolKubernetesVersion
feature gate, i.e., having worker pools with overridden Kubernetes versions since gardener-extension-provider-alicloud@v1.33
.
Shoot CA Certificate and ServiceAccount
Signing Key Rotation
This extension supports gardener/gardener
’s ShootCARotation
feature gate since gardener-extension-provider-alicloud@v1.36
and ShootSARotation
feature gate since gardener-extension-provider-alicloud@v1.37
.