Enhanced Internal Traffic Management: L7 Load Balancing for kube-apiservers in Gardener

Gardener continuously evolves to optimize performance and reliability. A recent improvement focuses on how internal control plane components communicate with kube-apiserver instances, introducing cluster-internal Layer 7 (L7) load balancing to ensure better resource distribution and system stability.

The Challenge: Unbalanced Internal Load on kube-apiservers

Previously, while external access to Gardener-managed kube-apiservers (for Shoots and the Virtual Garden) benefited from L7 load balancing via Istio, internal traffic took a more direct route. Components running within the seed cluster, such as gardener-resource-manager and gardener-controller-manager, would access the kube-apiserver’s internal Kubernetes service directly. This direct access bypassed the L7 load balancing capabilities of the Istio ingress gateway.

This could lead to situations where certain kube-apiserver instances might become overloaded, especially if a particular internal client generated a high volume of requests, potentially impacting the stability and performance of the control plane.

The Solution: Extending L7 Load Balancing Internally

To address this, Gardener now implements cluster-internal L7 load balancing for traffic destined for kube-apiservers from within the control plane. This enhancement ensures that requests from internal components are distributed efficiently across available kube-apiserver replicas, mirroring the sophisticated load balancing already in place for external traffic, but crucially, without routing this internal traffic externally.

Key aspects of this solution include:

  • Leveraging Existing Istio Ingress Gateway: The system utilizes the existing Istio ingress gateway, which already handles L7 load balancing for external traffic.
  • Dedicated Internal Service: A new, dedicated internal ClusterIP service is created for the Istio ingress gateway pods. This service provides an internal entry point for the load balancing.
  • Smart Kubeconfig Adjustments: The kubeconfig files used by internal components (specifically, the generic token kubeconfigs) are configured to point to the kube-apiserver’s public, resolvable DNS address.
  • Automated Configuration Injection: A new admission webhook, integrated into gardener-resource-manager and named pod-kube-apiserver-load-balancing, plays a crucial role. When control plane pods are created, this webhook automatically injects:
    • Host Aliases: It adds a host alias to the pod’s /etc/hosts file. This alias maps the kube-apiserver’s public DNS name to the IP address of the new internal ClusterIP service for the Istio ingress gateway.
    • Network Policy Labels: Necessary labels are added to ensure network policies permit this traffic flow.

With this setup, when an internal component attempts to connect to the kube-apiserver using its public DNS name, the host alias redirects the traffic to the internal Istio ingress gateway service. The ingress gateway then performs L7 load balancing, distributing the requests across the available kube-apiserver instances.

Benefits

This approach offers several advantages:

  • Improved Resource Distribution: Load from internal components is now evenly spread across kube-apiserver instances, preventing hotspots.
  • Enhanced Reliability: By avoiding overloading individual kube-apiserver pods, the overall stability and reliability of the control plane are improved.
  • Internalized Traffic: Despite using the kube-apiserver’s public DNS name in configurations, traffic remains within the cluster, avoiding potential costs or latency associated with external traffic routing.

This enhancement represents a significant step in refining Gardener’s internal traffic management, contributing to more robust and efficiently managed Kubernetes clusters.

Further Information

To dive deeper into the technical details, you can explore the following resources: