그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그 그
7 minute read
Post-Create Initialization of Machine Instance
Background
Today the driver.Driver facade represents the boundary between the the machine-controller
and its various provider specific implementations.
We have abstract operations for creation/deletion and listing of machines (actually compute instances) but we do not correctly handle post-creation initialization logic. Nor do we provide an abstract operation to represent the hot update of an instance after creation.
We have found this to be necessary for several use cases. Today in the MCM AWS Provider, we already misuse driver.GetMachineStatus
which is supposed to be a read-only operation obtaining the status of an instance.
Each AWS EC2 instance performs source/destination checks by default. For EC2 NAT instances these should be disabled. This is done by issuing a ModifyInstanceAttribute request with the
SourceDestCheck
set tofalse
. The MCM AWS Provider, decodes the AWSProviderSpec, readsproviderSpec.SrcAndDstChecksEnabled
and correspondingly issues the call to modify the already launched instance. However, this should be done as an action after creating the instance and should not be part of the VM status retrieval.Similarly, there is a pending PR to add the
Ipv6AddessCount
andIpv6PrefixCount
to enable the assignment of an ipv6 address and an ipv6 prefix to instances. This requires constructing and issuing an AssignIpv6Addresses request after the EC2 instance is available.We have other uses-cases such as MCM Issue#750 where there is a requirement to provide a way for consumers to add tags which can be hot-updated onto instances. This requirement can be generalized to also offer a convenient way to specify tags which can be applied to VMs, NICs, Devices etc.
We have a need for “machine-instance-not-ready” taint as described in MCM#740 which should only get removed once the post creation updates are finished.
Objectives
We will split the fulfilment of this overall need into 2 stages of implementation.
Stage-A: Support post-VM creation initialization logic of the instance suing a proposed
Driver.InitializeMachine
by permitting provider implementors to add initialization logic after VM creation, return with special new error codecodes.Initialization
for initialization errors and correspondingly support a new machine operation stageInstanceInitialization
which will be updated in the machineLastOperation
. The triggerCreationFlow - a reconciliation sub-flow of the MCM responsible for orchestrating instance creation and updating machine status will be changed to support this behaviour.Stage-B: Introduction of
Driver.UpdateMachine
and enhancing the MCM, MCM providers and gardener extension providers to support hot update of instances throughDriver.UpdateMachine
. The MCM triggerUpdationFlow - a reconciliation sub-flow of the MCM which is supposed to be responsible for orchestrating instance update - but currently not used, will be updated to invoke the providerDriver.UpdateMachine
on hot-updates to to theMachine
object
Stage-A Proposal
Current MCM triggerCreationFlow
Today, reconcileClusterMachine which is the main routine for the Machine
object reconciliation invokes triggerCreationFlow at the end when the machine.Spec.ProviderID
is empty or if the machine.Status.CurrentStatus.Phase
is empty or in CrashLoopBackOff
%%{ init: { 'themeVariables': { 'fontSize': '12px'} } }%% flowchart LR other["..."] -->chk{"machine ProviderID empty OR Phase empty or CrashLoopBackOff ? "}--yes-->triggerCreationFlow chk--noo-->LongRetry["return machineutils.LongRetry"]
Today, the triggerCreationFlow
is illustrated below with some minor details omitted/compressed for brevity
NOTES
- The
lastop
below is an abbreviation formachine.Status.LastOperation
. This, along with the machine phase is generally updated on theMachine
object just before returning from the method. - regarding
phase=CrashLoopBackOff|Failed
. the machine phase may either beCrashLoopBackOff
or move toFailed
if the difference between current time and themachine.CreationTimestamp
has exceeded the configuredMachineCreationTimeout
.
%%{ init: { 'themeVariables': { 'fontSize': '12px'} } }%% flowchart TD end1(("end")) begin((" ")) medretry["return MediumRetry, err"] shortretry["return ShortRetry, err"] medretry-->end1 shortretry-->end1 begin-->AddBootstrapTokenToUserData -->gms["statusResp,statusErr=driver.GetMachineStatus(...)"] -->chkstatuserr{"Check statusErr"} chkstatuserr--notFound-->chknodelbl{"Chk Node Label"} chkstatuserr--else-->createFailed["lastop.Type=Create,lastop.state=Failed,phase=CrashLoopBackOff|Failed"]-->medretry chkstatuserr--nil-->initnodename["nodeName = statusResp.NodeName"]-->setnodename chknodelbl--notset-->createmachine["createResp, createErr=driver.CreateMachine(...)"]-->chkCreateErr{"Check createErr"} chkCreateErr--notnil-->createFailed chkCreateErr--nil-->getnodename["nodeName = createResp.NodeName"] -->chkstalenode{"nodeName != machine.Name\n//chk stale node"} chkstalenode--false-->setnodename["if unset machine.Labels['node']= nodeName"] -->machinepending["if empty/crashloopbackoff lastop.type=Create,lastop.State=Processing,phase=Pending"] -->shortretry chkstalenode--true-->delmachine["driver.DeleteMachine(...)"] -->permafail["lastop.type=Create,lastop.state=Failed,Phase=Failed"] -->shortretry subgraph noteA [" "] permafail -.- note1(["VM was referring to stale node obj"]) end style noteA opacity:0 subgraph noteB [" "] setnodename-.- note2(["Proposal: Introduce Driver.InitializeMachine after this"]) end
Enhancement of MCM triggerCreationFlow
Relevant Observations on Current Flow
- Observe that we always perform a call to
Driver.GetMachineStatus
and only then conditionally perform a call toDriver.CreateMachine
if there was was no machine found. - Observe that after the call to a successful
Driver.CreateMachine
, the machine phase is set toPending
, theLastOperation.Type
is currently set toCreate
and theLastOperation.State
set toProcessing
before returning with aShortRetry
. TheLastOperation.Description
is (unfortunately) set to the fixed message:Creating machine on cloud provider
. - Observe that after an erroneous call to
Driver.CreateMachine
, the machine phase is set toCrashLoopBackOff
orFailed
(in case of creation timeout).
The following changes are proposed with a view towards minimal impact on current code and no introduction of a new Machine Phase.
MCM Changes
- We propose introducing a new machine operation
Driver.InitializeMachine
with the following signaturetype Driver interface { // .. existing methods are omitted for brevity. // InitializeMachine call is responsible for post-create initialization of the provider instance. InitializeMachine(context.Context, *InitializeMachineRequest) error } // InitializeMachineRequest is the initialization request for machine instance initialization type InitializeMachineRequest struct { // Machine object whose VM instance should be initialized Machine *v1alpha1.Machine // MachineClass backing the machine object MachineClass *v1alpha1.MachineClass // Secret backing the machineClass object Secret *corev1.Secret }
- We propose introducing a new MC error code
codes.Initialization
indicating that the VM Instance was created but there was an error in initialization after VM creation. The implementor ofDriver.InitializeMachine
can return this error code, indicating thatInitializeMachine
needs to be called again. The Machine Controller will change the phase toCrashLoopBackOff
as usual when encountering acodes.Initialization
error. - We will introduce a new machine operation stage
InstanceInitialization
. In case of ancodes.Initialization
error- the
machine.Status.LastOperation.Description
will be set toInstanceInitialization
, machine.Status.LastOperation.ErrorCode
will be set tocodes.Initialization
- the
LastOperation.Type
will be set toCreate
- the
LastOperation.State
set toFailed
before returning with aShortRetry
- the
- The semantics of
Driver.GetMachineStatus
will be changed. If the instance associated with machine exists, but the instance was not initialized as expected, the provider implementations ofGetMachineStatus
should return an error:status.Error(codes.Initialization)
. - If
Driver.GetMachineStatus
returned an error encapsulatingcodes.Initialization
thenDriver.InitializeMachine
will be invoked again in thetriggerCreationFlow
. - As according to the usual logic, the main machine controller reconciliation loop will now re-invoke the
triggerCreationFlow
again if the machine phase isCrashLoopBackOff
.
Illustration
AWS Provider Changes
Driver.InitializeMachine
The implementation for the AWS Provider will look something like:
- After the VM instance is available, check
providerSpec.SrcAndDstChecksEnabled
, constructModifyInstanceAttributeInput
and callModifyInstanceAttribute
. In case of an error returncodes.Initialization
instead of the currentcodes.Internal
- Check
providerSpec.NetworkInterfaces
and ifIpv6PrefixCount
is notnil
, then constructAssignIpv6AddressesInput
and callAssignIpv6Addresses
. In case of an error returncodes.Initialization
. Don’t use the genericcodes.Internal
The existing Ipv6 PR will need modifications.
Driver.GetMachineStatus
- If
providerSpec.SrcAndDstChecksEnabled
isfalse
, checkec2.Instance.SourceDestCheck
. If it does not match then returnstatus.Error(codes.Initialization)
- Check
providerSpec.NetworkInterfaces
and ifIpv6PrefixCount
is notnil
, checkec2.Instance.NetworkInterfaces
and check ifInstanceNetworkInterface.Ipv6Addresses
has a non-nil slice. If this is not the case then returnstatus.Error(codes.Initialization)
Instance Not Ready Taint
- Due to the fact that creation flow for machines will now be enhanced to correctly support post-creation startup logic, we should not scheduled workload until this startup logic is complete. Even without this feature we have a need for such a taint as described in MCM#740
- We propose a new taint
node.machine.sapcloud.io/instance-not-ready
which will be added as a node startup taint in gardener core KubeletConfiguration.RegisterWithTaints - The will will then removed by MCM in health check reconciliation, once the machine becomes fully ready. (when moving to
Running
phase) - We will add this taint as part of
--ignore-taint
in CA - We will introduce a disclaimer / prerequisite in the MCM FAQ, to add this taint as part of kubelet config under
--register-with-taints
, otherwise workload could get scheduled , before machine beomesRunning
Stage-B Proposal
Enhancement of Driver Interface for Hot Updation
Kindly refer to the Hot-Update Instances design which provides elaborate detail.