Etcd Cluster Components
For every Etcd cluster that it provisions, etcd-druid deploys a set of resources. The following sections describe each such resource and give its code reference.
StatefulSet
StatefulSet is the primary Kubernetes resource that gets provisioned for an etcd cluster.
Replicas for the StatefulSet are derived from Etcd.Spec.Replicas in the custom resource.
Each pod comprises two containers:
- etcd-wrapper: This is the main container, which runs an etcd process.
- etcd-backup-restore: This is a side-car container which does the following:
  - Orchestrates the initialization of etcd. This includes validation of any existing etcd data directory and, for a single-member etcd cluster, restoration when the data directory files are corrupt.
  - Periodically renews the member lease.
  - Optionally takes schedule-based and threshold-based delta and full snapshots and pushes them to a configured object store.
  - Orchestrates scheduled etcd-db defragmentation.
NOTE: This is not a complete list of the functionality offered by etcd-backup-restore.
Code reference: StatefulSet-Component
For detailed information on each container, see the etcd-wrapper and etcd-backup-restore repositories.
ConfigMap
Every etcd member requires configuration with which it must be started. etcd-druid creates a ConfigMap which gets mounted onto the etcd-backup-restore container. The etcd-backup-restore container modifies the etcd configuration and serves it to the etcd-wrapper container upon request.
Code reference: ConfigMap-Component
PodDisruptionBudget
An etcd cluster requires quorum for all write operations. Clients can additionally configure quorum-based reads to ensure linearizable reads (kube-apiserver’s etcd client is configured for linearizable reads and writes). In a cluster of size 3, only 1 member failure is tolerated. Failure tolerance for an etcd cluster with n replicas is computed as (n-1)/2, rounded down.
To ensure that no more etcd pods are evicted than the cluster's failure tolerance allows, etcd-druid creates a PodDisruptionBudget.
NOTE: For a single-node etcd cluster a PodDisruptionBudget will still be created; however, pdb.spec.minavailable is set to 0, effectively disabling it.
Code reference: PodDisruptionBudget-Component
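The failure-tolerance arithmetic above can be sketched in Go as follows. This is a minimal illustration, not the etcd-druid implementation: the function names are hypothetical, and the choice of the quorum size (n/2 + 1) as the minimum-available value for multi-member clusters is an assumption. The text above only states the value for the single-node case.

```go
package main

import "fmt"

// failureTolerance returns how many members a cluster with the given number
// of replicas can lose while keeping quorum: (n-1)/2, rounded down.
func failureTolerance(replicas int) int {
	if replicas < 1 {
		return 0
	}
	return (replicas - 1) / 2
}

// quorum returns the number of members that must be up for writes to succeed.
func quorum(replicas int) int {
	return replicas/2 + 1
}

// minAvailable is a hypothetical helper. For a single-node cluster it returns
// 0 (the budget is effectively disabled, as noted above). For larger clusters
// it ASSUMES the budget should protect quorum.
func minAvailable(replicas int) int {
	if replicas <= 1 {
		return 0
	}
	return quorum(replicas)
}

func main() {
	for _, n := range []int{1, 2, 3, 5} {
		fmt.Printf("replicas=%d tolerance=%d minAvailable=%d\n",
			n, failureTolerance(n), minAvailable(n))
	}
}
```

For 3 replicas this gives a tolerance of 1 and a quorum of 2, matching the example in the text. Note that an even-sized cluster gains no extra tolerance over the next smaller odd size: 4 replicas still tolerate only 1 failure.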
ServiceAccount
The etcd-backup-restore container, running as a side-car in every etcd member, requires permissions to access resources like Lease, StatefulSet etc. A dedicated ServiceAccount is created per Etcd cluster for this purpose.
Code reference: ServiceAccount-Component
Role & RoleBinding
The etcd-backup-restore container, running as a side-car in every etcd member, requires permissions to access resources like Lease, StatefulSet etc. A dedicated Role and RoleBinding are created and linked to the ServiceAccount created per Etcd cluster.
Code reference: Role-Component & RoleBinding-Component
Client & Peer Service
To enable clients to connect to an etcd cluster, a ClusterIP Client Service is created. To enable etcd members to talk to each other (for discovery, leader-election, raft consensus etc.), etcd-druid also creates a Headless Service.
Code reference: Client-Service-Component & Peer-Service-Component
Member Lease
Every member in an Etcd cluster has a dedicated Lease, which signifies that the member is alive. It is the responsibility of the etcd-backup-restore side-car container to periodically renew the lease.
Today the lease object is also used to indicate the member-ID and the role of the member in an etcd cluster. Possible roles are Leader and Member (which denotes that this is a member but not a leader). This will change in the future with the EtcdMember resource.
Code reference: Member-Lease-Component
Delta & Full Snapshot Leases
One of the responsibilities of the etcd-backup-restore container is to take periodic or threshold-based snapshots (delta and full) of the etcd DB. Today etcd-backup-restore communicates the end-revision of the latest full/delta snapshots to the etcd-druid operator via leases.
etcd-druid creates two Lease resources, one for delta and another for full snapshots. This information is used by the operator to trigger snapshot-compaction jobs. Snapshot leases are also used to derive the health of backups, which gets updated in the Status subresource of every Etcd resource.
In the future these leases will be replaced by the EtcdMember resource.
Code reference: Snapshot-Lease-Component
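As a rough illustration of how the end-revisions carried by the two snapshot leases could drive a compaction decision, consider the Go sketch below. The rule it uses (trigger once the delta end-revision is ahead of the full end-revision by at least a threshold) is an assumption made for illustration; the snapshotRevisions type, the shouldCompact function and the eventsThreshold parameter are hypothetical names, and the operator's actual trigger condition is not described in this document.

```go
package main

import "fmt"

// snapshotRevisions holds the end-revisions reported through the full and
// delta snapshot leases. The type is hypothetical.
type snapshotRevisions struct {
	Full  int64 // end-revision of the latest full snapshot
	Delta int64 // end-revision of the latest delta snapshot
}

// shouldCompact ASSUMES a compaction job is worthwhile once the revisions
// accumulated in delta snapshots since the last full snapshot reach
// eventsThreshold.
func shouldCompact(r snapshotRevisions, eventsThreshold int64) bool {
	if r.Delta <= r.Full {
		// No delta snapshot newer than the latest full snapshot.
		return false
	}
	return r.Delta-r.Full >= eventsThreshold
}

func main() {
	r := snapshotRevisions{Full: 1000, Delta: 1800}
	fmt.Println("compact at threshold 500: ", shouldCompact(r, 500))
	fmt.Println("compact at threshold 1000:", shouldCompact(r, 1000))
}
```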