Gardener is committed to providing efficient and flexible Kubernetes cluster management. Traditionally, updates to worker pool configurations, such as machine image or Kubernetes minor version changes, trigger a rolling update. This process replaces existing nodes with new ones, which is a robust approach for many scenarios. However, in environments with physical or bare-metal nodes, with stateful workloads sensitive to node replacement, or where the required virtual machine type is scarce, it can introduce challenges such as extended update times and potential disruptions.
To address these needs, Gardener now introduces In-Place Node Updates. This new capability allows certain updates to be applied directly to existing worker nodes without requiring their replacement, significantly reducing disruption and speeding up update processes for compatible changes.
Gardener now supports three distinct update strategies for your worker pools, configurable via the updateStrategy field in the Shoot specification’s worker pool definition:
AutoRollingUpdate: This is the classic and default strategy. When updates occur, nodes are cordoned, drained, terminated, and replaced with new nodes incorporating the changes.
AutoInPlaceUpdate: With this strategy, compatible updates are applied directly to the existing nodes. The MachineControllerManager (MCM) automatically selects nodes, cordons and drains them, and then signals the Gardener Node Agent (GNA) to perform the update. Once GNA confirms success, MCM uncordons the node.
ManualInPlaceUpdate: This strategy also applies updates directly to existing nodes but gives operators fine-grained control. After an update is specified, MCM marks all nodes in the pool as candidates. Operators must then manually label individual nodes to select them for the in-place update process, which then proceeds similarly to the AutoInPlaceUpdate strategy.
The AutoInPlaceUpdate and ManualInPlaceUpdate strategies are available when the InPlaceNodeUpdates feature gate is enabled in the gardener-apiserver.
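The relationship between the three strategies and the feature gate can be sketched in a few lines of Python. This is an illustrative model only, not Gardener's actual validation code: the function name resolve_update_strategy and the plain-dict representation of a worker pool and of the feature gates are assumptions made for the example. Only the strategy names, the default, and the InPlaceNodeUpdates gate come from the text above.

```python
# Illustrative model only; not Gardener's real validation logic.
# Strategy names, the default, and the feature gate name come from the post.
# The function name and the dict shapes are hypothetical.

IN_PLACE_STRATEGIES = {"AutoInPlaceUpdate", "ManualInPlaceUpdate"}
ALL_STRATEGIES = {"AutoRollingUpdate"} | IN_PLACE_STRATEGIES
DEFAULT_STRATEGY = "AutoRollingUpdate"


def resolve_update_strategy(worker_pool: dict, feature_gates: dict) -> str:
    """Return the effective update strategy for a worker pool.

    Falls back to the default when updateStrategy is unset and rejects
    in-place strategies while the InPlaceNodeUpdates gate is disabled.
    """
    strategy = worker_pool.get("updateStrategy") or DEFAULT_STRATEGY
    if strategy not in ALL_STRATEGIES:
        raise ValueError(f"unknown updateStrategy: {strategy!r}")
    gate_enabled = feature_gates.get("InPlaceNodeUpdates", False)
    if strategy in IN_PLACE_STRATEGIES and not gate_enabled:
        raise ValueError(
            f"{strategy} requires the InPlaceNodeUpdates feature gate"
        )
    return strategy


def replaces_nodes(strategy: str) -> bool:
    """Only the rolling strategy terminates and replaces nodes."""
    return strategy not in IN_PLACE_STRATEGIES
```

For example, resolve_update_strategy({}, {}) yields the default AutoRollingUpdate, while requesting ManualInPlaceUpdate with the gate disabled raises a ValueError.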
In-place updates are designed to handle a variety of common operational tasks more efficiently:
However, some changes still necessitate a rolling update (node replacement):
Several Gardener components and APIs have been enhanced to support in-place updates:
The CloudProfile API now allows specifying inPlaceUpdates configuration within machineImage.versions. This includes a boolean supported field to indicate if a version supports in-place updates and an optional minVersionForUpdate string to define the minimum OS version from which an in-place update to the current version is permissible.
In the Shoot, the spec.provider.workers[].updateStrategy field allows selection of the desired update strategy. Additionally, spec.provider.workers[].machineControllerManagerSettings now includes machineInPlaceUpdateTimeout and disableHealthTimeout (which defaults to true for in-place strategies to prevent premature machine deletion during lengthy updates). For ManualInPlaceUpdate, maxSurge defaults to 0 and maxUnavailable to 1.
The OperatingSystemConfig (OSC) has a status.inPlaceUpdates.osUpdate field where extensions can specify the command and args for the Gardener Node Agent to execute for machine image (operating system) updates. The spec.inPlaceUpdates field in the OSC carries information such as the target operating system version, kubelet version, and credential rotation status to the node.
The Gardener Node Agent reacts to a node condition (InPlaceUpdate with reason ReadyForUpdate) set by MCM, performs the OS update, kubelet updates, or credentials rotation, restarts necessary pods (like DaemonSets), and then labels the node with the update outcome.
The status.inPlaceUpdates.pendingWorkerUpdates field in the Shoot now lists worker pools pending autoInPlaceUpdate or manualInPlaceUpdate. A new ShootManualInPlaceWorkersUpdated constraint is added if any manual in-place updates are pending, ensuring users are aware.
The Worker extension resource now includes status.inPlaceUpdates.workerPoolToHashMap to track the configuration hash of worker pools that have undergone in-place updates. This helps Gardener determine if a pool is up-to-date.
The gardener.cloud/operation=force-in-place-update annotation can be added to the Shoot to allow subsequent changes or retries.
In-place node updates represent a significant step forward in Gardener’s operational flexibility, offering a more nuanced and efficient approach to managing node lifecycles, especially in demanding or specialized environments.
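As a rough illustration of how the CloudProfile fields and the worker pool defaults described earlier fit together, the following Python sketch models both checks. It is a simplified assumption-laden model, not Gardener's implementation: the helper names, the dict layout, and the version numbers are hypothetical, and versions are assumed to be purely numeric dotted strings (real machine image versions may need a proper semver parser). The field names supported, minVersionForUpdate, disableHealthTimeout, maxSurge, and maxUnavailable are taken from the text.

```python
# Simplified sketch; helper names, dict layout, and version values are
# hypothetical. Assumes purely numeric dotted versions such as "1.2.3".

IN_PLACE_STRATEGIES = {"AutoInPlaceUpdate", "ManualInPlaceUpdate"}


def parse_version(version: str) -> tuple:
    """Turn "1.2.3" into (1, 2, 3) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def in_place_os_update_allowed(current_version: str, target_entry: dict) -> bool:
    """Check one machineImage.versions entry against the node's current OS.

    The target must declare inPlaceUpdates.supported, and when
    minVersionForUpdate is set, the current version must not be older.
    """
    config = target_entry.get("inPlaceUpdates") or {}
    if not config.get("supported", False):
        return False
    min_version = config.get("minVersionForUpdate")
    if min_version is None:
        return True
    return parse_version(current_version) >= parse_version(min_version)


def default_pool_settings(strategy: str) -> dict:
    """Defaults the post describes for in-place strategies."""
    settings = {}
    if strategy in IN_PLACE_STRATEGIES:
        # Avoids premature machine deletion during lengthy updates.
        settings["disableHealthTimeout"] = True
    if strategy == "ManualInPlaceUpdate":
        settings["maxSurge"] = 0
        settings["maxUnavailable"] = 1
    return settings


# Hypothetical CloudProfile entry for a target image version.
target = {
    "version": "2.1.0",
    "inPlaceUpdates": {"supported": True, "minVersionForUpdate": "2.0.0"},
}
```

With this entry, a node currently on 2.0.5 would qualify for an in-place OS update, while a node on 1.9.0 would fall below minVersionForUpdate and would not.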
To explore the technical details and contributions that made this feature possible, refer to the following resources: