This is the multi-page printable view of this section. Click here to print.

Return to the regular view of this page.

Provider Alicloud

Gardener extension controller for the Alibaba cloud provider

Gardener Extension for Alicloud provider

REUSE status CI Build status Go Report Card

Project Gardener implements the automated management and operation of Kubernetes clusters as a service. Its main principle is to leverage Kubernetes concepts for all of its tasks.

Recently, most of the vendor specific logic has been developed in-tree. However, the project has grown to a size where it is very hard to extend, maintain, and test. With GEP-1 we have proposed how the architecture can be changed in a way to support external controllers that contain their very own vendor specifics. This way, we can keep Gardener core clean and independent.

This controller implements Gardener’s extension contract for the Alicloud provider.

An example for a ControllerRegistration resource that can be used to register this controller to Gardener can be found here.

Please find more information regarding the extensibility concepts and a detailed proposal here.

Supported Kubernetes versions

This extension controller supports the following Kubernetes versions:

VersionSupportConformance test results
Kubernetes 1.311.31.0+Gardener v1.31 Conformance Tests
Kubernetes 1.301.30.0+Gardener v1.30 Conformance Tests
Kubernetes 1.291.29.0+Gardener v1.29 Conformance Tests
Kubernetes 1.281.28.0+Gardener v1.28 Conformance Tests
Kubernetes 1.271.27.0+Gardener v1.27 Conformance Tests
Kubernetes 1.261.26.0+Gardener v1.26 Conformance Tests
Kubernetes 1.251.25.0+Gardener v1.25 Conformance Tests

Please take a look here to see which versions are supported by Gardener in general.


How to start using or developing this extension controller locally

You can run the controller locally on your machine by executing make start.

Static code checks and tests can be executed by running make verify. We are using Go modules for Golang package dependency management and Ginkgo/Gomega for testing.

Feedback and Support

Feedback and contributions are always welcome. Please report bugs or suggestions as GitHub issues or join our Slack channel #gardener (please invite yourself to the Kubernetes workspace here).

Learn more!

Please find further resources about out project here:

1 - Tutorials

1.1 - Create a Kubernetes Cluster on Alibaba Cloud with Gardener

Overview

Gardener allows you to create a Kubernetes cluster on different infrastructure providers. This tutorial will guide you through the process of creating a cluster on Alibaba Cloud.

Prerequisites

  • You have created an Alibaba Cloud account.
  • You have access to the Gardener dashboard and have permissions to create projects.

Steps

  1. Go to the Gardener dashboard and create a project.

    To be able to add shoot clusters to this project, you must first create a technical user on Alibaba Cloud with sufficient permissions.

  2. Choose Secrets, then the plus icon and select AliCloud.

  3. To copy the policy for Alibaba Cloud from the Gardener dashboard, click on the help icon for Alibaba Cloud secrets, and choose copy .

  4. Create a custom policy in Alibaba Cloud:

    1. Log on to your Alibaba account and choose RAM > Permissions > Policies.

    2. Enter the name of your policy.

    3. Select Script.

    4. Paste the policy that you copied from the Gardener dashboard to this custom policy.

    5. Choose OK.

  5. In the Alibaba Cloud console, create a new technical user:

    1. Choose RAM > Users.

    2. Choose Create User.

    3. Enter a logon and display name for your user.

    4. Select Open API Access.

    5. Choose OK.

    After the user is created, AccessKeyId and AccessKeySecret are generated and displayed. Remember to save them. The AccessKey is used later to create secrets for Gardener.

  6. Assign the policy you created to the technical user:

    1. Choose RAM > Permissions > Grants.

    2. Choose Grant Permission.

    3. Select Alibaba Cloud Account.

    4. Assign the policy you’ve created before to the technical user.

  7. Create your secret.

    1. Type the name of your secret.
    2. Copy and paste the Access Key ID and Secret Access Key you saved when you created the technical user on Alibaba Cloud.
    3. Choose Add secret.

    After completing these steps, you should see your newly created secret in the Infrastructure Secrets section.

  8. To create a new cluster, choose Clusters and then the plus sign in the upper right corner.

  9. In the Create Cluster section:

    1. Select AliCloud in the Infrastructure tab.

    2. Type the name of your cluster in the Cluster Details tab.

    3. Choose the secret you created before in the Infrastructure Details tab.

    4. Choose Create.

  10. Wait for your cluster to get created.

Result

After completing the steps in this tutorial, you will be able to see and download the kubeconfig of your cluster. With it you can create shoot clusters on Alibaba Cloud.

The size of persistent volumes in your shoot cluster must at least be 20 GiB large. If you choose smaller sizes in your Kubernetes PV definition, the allocation of cloud disk space on Alibaba Cloud fails.

2 - Deployment

Deployment of the AliCloud provider extension

Disclaimer: This document is NOT a step by step installation guide for the AliCloud provider extension and only contains some configuration specifics regarding the installation of different components via the helm charts residing in the AliCloud provider extension repository.

gardener-extension-admission-alicloud

Authentication against the Garden cluster

There are several authentication possibilities depending on whether or not the concept of Virtual Garden is used.

Virtual Garden is not used, i.e., the runtime Garden cluster is also the target Garden cluster.

Automounted Service Account Token The easiest way to deploy the gardener-extension-admission-alicloud component will be to not provide kubeconfig at all. This way in-cluster configuration and an automounted service account token will be used. The drawback of this approach is that the automounted token will not be automatically rotated.

This will allow for automatic rotation of the service account token by the kubelet. The configuration can be achieved by setting both .Values.global.serviceAccountTokenVolumeProjection.enabled: true and .Values.global.kubeconfig in the respective chart’s values.yaml file.

Virtual Garden is used, i.e., the runtime Garden cluster is different from the target Garden cluster.

Service Account The easiest way to setup the authentication will be to create a service account and the respective roles will be bound to this service account in the target cluster. Then use the generated service account token and craft a kubeconfig which will be used by the workload in the runtime cluster. This approach does not provide a solution for the rotation of the service account token. However, this setup can be achieved by setting .Values.global.virtualGarden.enabled: true and following these steps:

  1. Deploy the application part of the charts in the target cluster.
  2. Get the service account token and craft the kubeconfig.
  3. Set the crafted kubeconfig and deploy the runtime part of the charts in the runtime cluster.

Client Certificate Another solution will be to bind the roles in the target cluster to a User subject instead of a service account and use a client certificate for authentication. This approach does not provide a solution for the client certificate rotation. However, this setup can be achieved by setting both .Values.global.virtualGarden.enabled: true and .Values.global.virtualGarden.user.name, then following these steps:

  1. Generate a client certificate for the target cluster for the respective user.
  2. Deploy the application part of the charts in the target cluster.
  3. Craft a kubeconfig using the already generated client certificate.
  4. Set the crafted kubeconfig and deploy the runtime part of the charts in the runtime cluster.

3 - Local Setup

admission-alicloud

admission-alicloud is an admission webhook server which is responsible for the validation of the cloud provider (Alicloud in this case) specific fields and resources. The Gardener API server is cloud provider agnostic and it wouldn’t be able to perform similar validation.

Follow the steps below to run the admission webhook server locally.

  1. Start the Gardener API server.

    For details, check the Gardener local setup.

  2. Start the webhook server

    Make sure that the KUBECONFIG environment variable is pointing to the local garden cluster.

    make start-admission
    
  3. Setup the ValidatingWebhookConfiguration.

    hack/dev-setup-admission-alicloud.sh will configure the webhook Service which will allow the kube-apiserver of your local cluster to reach the webhook server. It will also apply the ValidatingWebhookConfiguration manifest.

    ./hack/dev-setup-admission-alicloud.sh
    

You are now ready to experiment with the admission-alicloud webhook server locally.

4 - Operations

Using the Alicloud provider extension with Gardener as operator

The core.gardener.cloud/v1beta1.CloudProfile resource declares a providerConfig field that is meant to contain provider-specific configuration. The core.gardener.cloud/v1beta1.Seed resource is structured similarly. Additionally, it allows configuring settings for the backups of the main etcds’ data of shoot clusters control planes running in this seed cluster.

This document explains the necessary configuration for this provider extension. In addition, this document also describes how to enable the use of customized machine images for Alicloud.

CloudProfile resource

This section describes, how the configuration for CloudProfile looks like for Alicloud by providing an example CloudProfile manifest with minimal configuration that can be used to allow the creation of Alicloud shoot clusters.

CloudProfileConfig

The cloud profile configuration contains information about the real machine image IDs in the Alicloud environment (AMIs). You have to map every version that you specify in .spec.machineImages[].versions here such that the Alicloud extension knows the AMI for every version you want to offer.

An example CloudProfileConfig for the Alicloud extension looks as follows:

apiVersion: alicloud.provider.extensions.gardener.cloud/v1alpha1
kind: CloudProfileConfig
machineImages:
- name: coreos
  versions:
  - version: 2023.4.0
    regions:
    - name: eu-central-1
      id: coreos_2023_4_0_64_30G_alibase_20190319.vhd

Example CloudProfile manifest

Please find below an example CloudProfile manifest:

apiVersion: core.gardener.cloud/v1beta1
kind: CloudProfile
metadata:
  name: alicloud
spec:
  type: alicloud
  kubernetes:
    versions:
    - version: 1.27.3
    - version: 1.26.8
      expirationDate: "2022-10-31T23:59:59Z"
  machineImages:
  - name: coreos
    versions:
    - version: 2023.4.0
  machineTypes:
  - name: ecs.sn2ne.large
    cpu: "2"
    gpu: "0"
    memory: 8Gi
  volumeTypes:
  - name: cloud_efficiency
    class: standard
  - name: cloud_essd
    class: premium
  regions:
  - name: eu-central-1
    zones:
    - name: eu-central-1a
    - name: eu-central-1b
  providerConfig:
    apiVersion: alicloud.provider.extensions.gardener.cloud/v1alpha1
    kind: CloudProfileConfig
    machineImages:
    - name: coreos
      versions:
      - version: 2023.4.0
        regions:
        - name: eu-central-1
          id: coreos_2023_4_0_64_30G_alibase_20190319.vhd

Enable customized machine images for the Alicloud extension

Customized machine images can be created for an Alicloud account and shared with other Alicloud accounts. The same customized machine image has different image ID in different regions on Alicloud. If you need to enable encrypted system disk, you must provide customized machine images. Administrators/Operators need to explicitly declare them per imageID per region as below:

machineImages:
- name: customized_coreos
  regions:
  - imageID: <image_id_in_eu_central_1>
    region: eu-central-1
  - imageID: <image_id_in_cn_shanghai>
    region: cn-shanghai
  ...
  version: 2191.4.1
...

End-users have to have the permission to use the customized image from its creator Alicloud account. To enable end-users to use customized images, the images are shared from Alicloud account of Seed operator with end-users’ Alicloud accounts. Administrators/Operators need to explicitly provide Seed operator’s Alicloud account access credentials (base64 encoded) as below:

machineImageOwnerSecret:
  name: machine-image-owner
  accessKeyID: <base64_encoded_access_key_id>
  accessKeySecret: <base64_encoded_access_key_secret>

As a result, a Secret named machine-image-owner by default will be created in namespace of Alicloud provider extension.

Operators should also maintain custom image IDs which are to be shared with end-users as below:

toBeSharedImageIDs:
- <image_id_1>
- <image_id_2>
- <image_id_3>

Example ControllerDeployment manifest for enabling customized machine images

apiVersion: core.gardener.cloud/v1beta1
kind: ControllerDeployment
metadata:
  name: extension-provider-alicloud
spec:
  type: helm
   providerConfig:
    chart: |
      H4sIFAAAAAAA/yk...      
    values:
      config:
        machineImageOwnerSecret:
          accessKeyID: <base64_encoded_access_key_id>
          accessKeySecret: <base64_encoded_access_key_secret>
        toBeSharedImageIDs:
        - <image_id_1>
        - <image_id_2>
        ...
        machineImages:
        - name: customized_coreos
          regions:
          - imageID: <image_id_in_eu_central_1>
            region: eu-central-1
          - imageID: <image_id_in_cn_shanghai>
            region: cn-shanghai
          ...
          version: 2191.4.1
        ...
        csi:
          enableADController: true
      resources:
        limits:
          cpu: 500m
          memory: 1Gi
        requests:
          memory: 128Mi

Seed resource

This provider extension does not support any provider configuration for the Seed’s .spec.provider.providerConfig field. However, it supports to managing of backup infrastructure, i.e., you can specify a configuration for the .spec.backup field.

Backup configuration

A Seed of type alicloud can be configured to perform backups for the main etcds’ of the shoot clusters control planes using Alicloud Object Storage Service.

The location/region where the backups will be stored defaults to the region of the Seed (spec.provider.region).

Please find below an example Seed manifest (partly) that configures backups using Alicloud Object Storage Service.

---
apiVersion: core.gardener.cloud/v1beta1
kind: Seed
metadata:
  name: my-seed
spec:
  provider:
    type: alicloud
    region: cn-shanghai
  backup:
    provider: alicloud
    secretRef:
      name: backup-credentials
      namespace: garden
  ...

An example of the referenced secret containing the credentials for the Alicloud Object Storage Service can be found in the example folder.

Permissions for Alicloud Object Storage Service

Please make sure the RAM user associated with the provided AccessKey pair has the following permission.

  • AliyunOSSFullAccess

5 - Usage

Using the Alicloud provider extension with Gardener as end-user

The core.gardener.cloud/v1beta1.Shoot resource declares a few fields that are meant to contain provider-specific configuration.

This document describes the configurable options for Alicloud and provides an example Shoot manifest with minimal configuration that can be used to create an Alicloud cluster (modulo the landscape-specific information like cloud profile names, secret binding names, etc.).

Alicloud Provider Credentials

In order for Gardener to create a Kubernetes cluster using Alicloud infrastructure components, a Shoot has to provide credentials with sufficient permissions to the desired Alicloud project. Every shoot cluster references a SecretBinding or a CredentialsBinding which itself references a Secret, and this Secret contains the provider credentials of the Alicloud project.

This Secret must look as follows:

apiVersion: v1
kind: Secret
metadata:
  name: core-alicloud
  namespace: garden-dev
type: Opaque
data:
  accessKeyID: base64(access-key-id)
  accessKeySecret: base64(access-key-secret)

The SecretBinding/CredentialsBinding is configurable in the Shoot cluster with the field secretBindingName/credentialsBindingName.

The required credentials for the Alicloud project are an AccessKey Pair associated with a Resource Access Management (RAM) User. A RAM user is a special account that can be used by services and applications to interact with Alicloud Cloud Platform APIs. Applications can use AccessKey pair to authorize themselves to a set of APIs and perform actions within the permissions granted to the RAM user.

Make sure to create a Resource Access Management User, and create an AccessKey Pair that shall be used for the Shoot cluster.

Permissions

Please make sure the provided credentials have the correct privileges. You can use the following Alicloud RAM policy document and attach it to the RAM user backed by the credentials you provided.

Click to expand the Alicloud RAM policy document!
{
    "Statement": [
        {
            "Action": [
                "vpc:*"
            ],
            "Effect": "Allow",
            "Resource": [
                "*"
            ]
        },
        {
            "Action": [
                "ecs:*"
            ],
            "Effect": "Allow",
            "Resource": [
                "*"
            ]
        },
        {
            "Action": [
                "slb:*"
            ],
            "Effect": "Allow",
            "Resource": [
                "*"
            ]
        },
        {
            "Action": [
                "ram:GetRole",
                "ram:CreateRole",
                "ram:CreateServiceLinkedRole"
            ],
            "Effect": "Allow",
            "Resource": [
                "*"
            ]
        },
        {
            "Action": [
                "ros:*"
            ],
            "Effect": "Allow",
            "Resource": [
                "*"
            ]
        }
    ],
    "Version": "1"
}

InfrastructureConfig

The infrastructure configuration mainly describes how the network layout looks like in order to create the shoot worker nodes in a later step, thus, prepares everything relevant to create VMs, load balancers, volumes, etc.

An example InfrastructureConfig for the Alicloud extension looks as follows:

apiVersion: alicloud.provider.extensions.gardener.cloud/v1alpha1
kind: InfrastructureConfig
networks:
  vpc: # specify either 'id' or 'cidr'
  # id: my-vpc
    cidr: 10.250.0.0/16
  # gardenerManagedNATGateway: true
  zones:
  - name: eu-central-1a
    workers: 10.250.1.0/24
  # natGateway:
    # eipAllocationID: eip-ufxsdg122elmszcg

The networks.vpc section describes whether you want to create the shoot cluster in an already existing VPC or whether to create a new one:

  • If networks.vpc.id is given then you have to specify the VPC ID of the existing VPC that was created by other means (manually, other tooling, …).
  • If networks.vpc.cidr is given then you have to specify the VPC CIDR of a new VPC that will be created during shoot creation. You can freely choose a private CIDR range.
  • Either networks.vpc.id or networks.vpc.cidr must be present, but not both at the same time.
  • When networks.vpc.id is present, in addition, you can also choose to set networks.vpc.gardenerManagedNATGateway. It is by default false. When it is set to true, Gardener will create an Enhanced NATGateway in the VPC and associate it with a VSwitch created in the first zone in the networks.zones.
  • Please note that when networks.vpc.id is present, and networks.vpc.gardenerManagedNATGateway is false or not set, you have to manually create an Enhance NATGateway and associate it with a VSwitch that you manually created. In this case, make sure the worker CIDRs in networks.zones do not overlap with the one you created. If a NATGateway is created manually and a shoot is created in the same VPC with networks.vpc.gardenerManagedNATGateway set true, you need to manually adjust the route rule accordingly. You may refer to here.

The networks.zones section describes which subnets you want to create in availability zones. For every zone, the Alicloud extension creates one subnet:

  • The workers subnet is used for all shoot worker nodes, i.e., VMs which later run your applications.

For every subnet, you have to specify a CIDR range contained in the VPC CIDR specified above, or the VPC CIDR of your already existing VPC. You can freely choose these CIDR and it is your responsibility to properly design the network layout to suit your needs.

If you want to use multiple availability zones then add a second, third, … entry to the networks.zones[] list and properly specify the AZ name in networks.zones[].name.

Apart from the VPC and the subnets the Alicloud extension will also create a NAT gateway (only if a new VPC is created), a key pair, elastic IPs, VSwitches, a SNAT table entry, and security groups.

By default, the Alicloud extension will create a corresponding Elastic IP that it attaches to this NAT gateway and which is used for egress traffic. The networks.zones[].natGateway.eipAllocationID field allows you to specify the Elastic IP Allocation ID of an existing Elastic IP allocation in case you want to bring your own. If provided, no new Elastic IP will be created and, instead, the Elastic IP specified by you will be used.

⚠️ If you change this field for an already existing infrastructure then it will disrupt egress traffic while Alicloud applies this change, because the NAT gateway must be recreated with the new Elastic IP association. Also, please note that the existing Elastic IP will be permanently deleted if it was earlier created by the Alicloud extension.

ControlPlaneConfig

The control plane configuration mainly contains values for the Alicloud-specific control plane components. Today, the Alicloud extension deploys the cloud-controller-manager and the CSI controllers.

An example ControlPlaneConfig for the Alicloud extension looks as follows:

apiVersion: alicloud.provider.extensions.gardener.cloud/v1alpha1
kind: ControlPlaneConfig
csi:
  enableADController: true
# cloudControllerManager:
#   featureGates:
#     SomeKubernetesFeature: true

The csi.enableADController is used as the value of environment DISK_AD_CONTROLLER, which is used for AliCloud csi-disk-plugin. This field is optional. When a new shoot is creatd, this field is automatically set true. For an existing shoot created in previous versions, it remains unchanged. If there are persistent volumes created before year 2021, please be cautious to set this field true because they may fail to mount to nodes.

The cloudControllerManager.featureGates contains a map of explicitly enabled or disabled feature gates. For production usage it’s not recommend to use this field at all as you can enable alpha features or disable beta/stable features, potentially impacting the cluster stability. If you don’t want to configure anything for the cloudControllerManager simply omit the key in the YAML specification.

WorkerConfig

The Alicloud extension does not support a specific WorkerConfig. However, it supports additional data volumes (plus encryption) per machine. By default (if not stated otherwise), all the disks are unencrypted. For each data volume, you have to specify a name. It also supports encrypted system disk. However, only Customized image is currently supported to be used as a basic image for encrypted system disk. Please be noted that the change of system disk encryption flag will cause reconciliation of a shoot, and it will result in nodes rolling update within the worker group.

The following YAML is a snippet of a Shoot resource:

spec:
  provider:
    workers:
    - name: cpu-worker
      ...
      volume:
        type: cloud_efficiency
        size: 20Gi
        encrypted: true
      dataVolumes:
      - name: kubelet-dir
        type: cloud_efficiency
        size: 25Gi
        encrypted: true

Example Shoot manifest (one availability zone)

Please find below an example Shoot manifest for one availability zone:

apiVersion: core.gardener.cloud/v1beta1
kind: Shoot
metadata:
  name: johndoe-alicloud
  namespace: garden-dev
spec:
  cloudProfileName: alicloud
  region: eu-central-1
  secretBindingName: core-alicloud
  provider:
    type: alicloud
    infrastructureConfig:
      apiVersion: alicloud.provider.extensions.gardener.cloud/v1alpha1
      kind: InfrastructureConfig
      networks:
        vpc:
          cidr: 10.250.0.0/16
        zones:
        - name: eu-central-1a
          workers: 10.250.0.0/19
    controlPlaneConfig:
      apiVersion: alicloud.provider.extensions.gardener.cloud/v1alpha1
      kind: ControlPlaneConfig
    workers:
    - name: worker-xoluy
      machine:
        type: ecs.sn2ne.large
      minimum: 2
      maximum: 2
      volume:
        size: 50Gi
        type: cloud_efficiency
      zones:
      - eu-central-1a
  networking:
    nodes: 10.250.0.0/16
    type: calico
  kubernetes:
    version: 1.28.2
  maintenance:
    autoUpdate:
      kubernetesVersion: true
      machineImageVersion: true
  addons:
    kubernetesDashboard:
      enabled: true
    nginxIngress:
      enabled: true

Example Shoot manifest (two availability zones)

Please find below an example Shoot manifest for two availability zones:

apiVersion: core.gardener.cloud/v1beta1
kind: Shoot
metadata:
  name: johndoe-alicloud
  namespace: garden-dev
spec:
  cloudProfileName: alicloud
  region: eu-central-1
  secretBindingName: core-alicloud
  provider:
    type: alicloud
    infrastructureConfig:
      apiVersion: alicloud.provider.extensions.gardener.cloud/v1alpha1
      kind: InfrastructureConfig
      networks:
        vpc:
          cidr: 10.250.0.0/16
        zones:
        - name: eu-central-1a
          workers: 10.250.0.0/26
        - name: eu-central-1b
          workers: 10.250.0.64/26
    controlPlaneConfig:
      apiVersion: alicloud.provider.extensions.gardener.cloud/v1alpha1
      kind: ControlPlaneConfig
    workers:
    - name: worker-xoluy
      machine:
        type: ecs.sn2ne.large
      minimum: 2
      maximum: 4
      volume:
        size: 50Gi
        type: cloud_efficiency
        # NOTE: Below comment is for the case when encrypted field of an existing shoot is updated from false to true.
        # It will cause affected nodes to be rolling updated. Users must trigger a MAINTAIN operation of the shoot.
        # Otherwise, the shoot will fail to reconcile.
        # You could do it either via Dashboard or annotating the shoot with gardener.cloud/operation=maintain
        encrypted: true
      zones:
      - eu-central-1a
      - eu-central-1b
  networking:
    nodes: 10.250.0.0/16
    type: calico
  kubernetes:
    version: 1.28.2
  maintenance:
    autoUpdate:
      kubernetesVersion: true
      machineImageVersion: true
  addons:
    kubernetesDashboard:
      enabled: true
    nginxIngress:
      enabled: true

Kubernetes Versions per Worker Pool

This extension supports gardener/gardener’s WorkerPoolKubernetesVersion feature gate, i.e., having worker pools with overridden Kubernetes versions since gardener-extension-provider-alicloud@v1.33.

Shoot CA Certificate and ServiceAccount Signing Key Rotation

This extension supports gardener/gardener’s ShootCARotation feature gate since gardener-extension-provider-alicloud@v1.36 and ShootSARotation feature gate since gardener-extension-provider-alicloud@v1.37.