Kubernetes etcd snapshot failures: backup, restore, and verification
An etcd snapshot failure usually surfaces during an incident, not during backup. A snapshot from an unhealthy member, a corrupted transfer to object storage, or a restore that writes no data renders disaster recovery useless. This guide gives the checks, commands, and decision logic to verify snapshot integrity, fix backup failures, and perform clean restores.
What this means
An etcd snapshot captures the entire key-value store at a point in time. Because committed Raft log entries exist on a majority of members, a snapshot from any healthy member contains the full cluster state. The snapshot file includes a SHA-256 hash computed at save time. If that hash does not match after transfer, or if the snapshot is taken while etcd is under NOSPACE alarm or leader instability, the file may be inconsistent.
Restoring a snapshot rebuilds a new data directory offline. Every restored member must start with the same --initial-cluster-token and --initial-cluster topology. For Kubernetes, a restored etcd can serve stale revision data to API server watchers unless the revision counter is bumped. etcd 3.6 removed etcdctl snapshot restore; use etcdutl instead.
Common causes
| Cause | What it looks like | First thing to check |
|---|---|---|
| Snapshot corruption in transit | etcdutl snapshot status reports integrity or CRC errors | SHA-256 hash or re-run status after download |
| NOSPACE alarm blocking writes | etcdctl snapshot save fails or reflects inconsistent state | etcdctl alarm list for NOSPACE |
| etcd 3.6 tooling mismatch | etcdctl snapshot restore returns “command not found” or deprecation fatal | etcdctl version and whether etcdutl is present |
| Missing etcdutl in container image | Restore scripts fail inside the etcd static pod | Binary presence in the container image or host path |
| Single-member HA restore with mismatched flags | Cluster fails to start or enters split-brain after restore | --initial-cluster-token and --initial-cluster consistency across nodes |
| Raw db file copied without hash | etcdutl snapshot restore fails with hash mismatch | File provenance: was it from snapshot save or member/snap/ |
Quick checks
# Check local etcd member health
ETCDCTL_API=3 etcdctl --endpoints=https://127.0.0.1:2379 \
--cacert=/etc/kubernetes/pki/etcd/ca.crt \
--cert=/etc/kubernetes/pki/etcd/healthcheck-client.crt \
--key=/etc/kubernetes/pki/etcd/healthcheck-client.key \
endpoint health
# Check DB size and alarms across the cluster
ETCDCTL_API=3 etcdctl --endpoints=https://127.0.0.1:2379 \
--cacert=/etc/kubernetes/pki/etcd/ca.crt \
--cert=/etc/kubernetes/pki/etcd/healthcheck-client.crt \
--key=/etc/kubernetes/pki/etcd/healthcheck-client.key \
endpoint status --cluster --write-out=table
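ETCDCTL_API=3 etcdctl --endpoints=https://127.0.0.1:2379 \
--cacert=/etc/kubernetes/pki/etcd/ca.crt \
--cert=/etc/kubernetes/pki/etcd/healthcheck-client.crt \
--key=/etc/kubernetes/pki/etcd/healthcheck-client.key \
alarm list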
# Verify snapshot integrity after creation or download
etcdutl snapshot status /path/to/snapshot.db
# Expected: HASH, REVISION, TOTAL KEYS, TOTAL SIZE, STORAGE VERSION
# Check which etcd version and tools are available
etcdctl version
etcdutl version
# For kubeadm static pods, check if etcdutl is in the image
crictl exec $(crictl ps --name etcd -q) etcdutl version
# Compare local hash to source after off-host transfer
sha256sum /path/to/snapshot.db
# Defragment to free space (blocks writes briefly per member)
# WARNING: run during a maintenance window
ETCDCTL_API=3 etcdctl --endpoints=https://127.0.0.1:2379 \
--cacert=/etc/kubernetes/pki/etcd/ca.crt \
--cert=/etc/kubernetes/pki/etcd/healthcheck-client.crt \
--key=/etc/kubernetes/pki/etcd/healthcheck-client.key \
defrag --cluster
How to diagnose it
- Confirm the source member is healthy. A snapshot from a lagging or partitioned member may contain inconsistent data. Run etcdctl endpoint health against the member, then etcdctl endpoint status --cluster --write-out=table. The leader should be stable, and followers should show minimal lag in the RAFT TERM and RAFT INDEX columns. If the member is unhealthy, switch to another member.
- Check for active alarms. etcdctl alarm list must return nothing. If NOSPACE is active, etcd is read-only. No write-dependent operation, including a consistent snapshot, will succeed. Compact old revisions and defragment the cluster before retrying.
- Validate tooling against etcd version. If you are on etcd 3.5.x, etcdctl snapshot restore works but emits deprecation warnings. If you are on etcd 3.6.0+, the command is removed and etcdutl is required. Verify binary availability before an incident.
- Verify integrity after every transfer. etcdutl snapshot status performs a cryptographic integrity check. Run it immediately after creation and again after downloading from S3 or another store. If the hash changed, the file is corrupt. Do not proceed with a restore.
- Inspect restore behavior for silent data loss. If the restore reports success but the restored --data-dir lacks db files, the snapshot was not written. Prefer etcdutl snapshot restore directly to avoid delegation bugs in older etcdctl versions.
- Validate HA restore topology. For multi-member restores, every node must use the same --initial-cluster-token and the same --initial-cluster list. Each node uses its own --name and --initial-advertise-peer-urls, but the token and member list must be identical. Mismatches cause quorum loss or split-brain.
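The alarm check above can be turned into a gate that runs before every snapshot. The following is a minimal sketch, not a definitive implementation: the function names has_nospace_alarm and snapshot_preflight are hypothetical, and the pattern assumes alarm lines of the form memberID:<id> alarm:NOSPACE, which you should confirm against the output of your etcdctl version.

```shell
# Sketch: gate a snapshot on alarm state. Function names are hypothetical.

# Reads `etcdctl alarm list` output on stdin; exits 0 if any line
# reports a NOSPACE alarm (assumed format: "memberID:<id> alarm:NOSPACE").
has_nospace_alarm() {
  grep -q 'alarm:NOSPACE'
}

# Reads alarm output on stdin and prints a decision for the operator.
# Exits 0 only when no alarm is present at all.
snapshot_preflight() {
  local alarms
  alarms=$(cat)
  if [ -z "$alarms" ]; then
    echo "ok: no alarms, safe to snapshot"
  elif printf '%s\n' "$alarms" | has_nospace_alarm; then
    echo "blocked: NOSPACE active, compact and defrag first"
    return 1
  else
    echo "blocked: unexpected alarm, investigate"
    return 1
  fi
}

# Usage (endpoint and TLS flags omitted for brevity):
#   etcdctl alarm list | snapshot_preflight && etcdctl snapshot save /path/to/snapshot.db
```

Keeping the parsing separate from the etcdctl call makes the gate easy to exercise with canned output before wiring it into a backup job.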
flowchart TD
A[Snapshot fails or restore fails] --> B{etcdctl alarm list shows NOSPACE?}
B -->|Yes| C[Compact and defrag, then retry]
B -->|No| D{etcdutl snapshot status fails integrity?}
D -->|Yes| E[Re-take snapshot and verify transfer hash]
D -->|No| F{Restore fails or data missing?}
F -->|Yes| G{etcd version >= 3.6?}
G -->|Yes| H[Use etcdutl snapshot restore with --bump-revision and --mark-compacted]
G -->|No| I[Use etcdutl snapshot restore directly to avoid etcdctl bug]
F -->|No| J[Verify --initial-cluster and --initial-cluster-token match across all HA members]
Metrics and signals to monitor
| Signal | Why it matters | Warning sign |
|---|---|---|
| etcd_mvcc_db_total_size_in_bytes | Approaching quota causes NOSPACE and blocks snapshots | DB size > 75% of --quota-backend-bytes |
| etcd_disk_wal_fsync_duration_seconds | High fsync latency indicates disk pressure that can corrupt or slow snapshots | p99 > 100ms sustained |
| etcd_server_leader_changes_seen_total | Frequent leader changes mean the cluster is unstable; snapshots taken during churn may be inconsistent | > 0 per hour outside maintenance |
| etcd_server_has_leader | A member without a leader cannot guarantee a consistent snapshot | Gauge == 0 |
| API server 5xx rate | etcd distress propagates as API server write failures | Share of apiserver_request_total with code=~"5.." sustained above 0.1% |
| Snapshot file hash mismatch | Confirms corruption during transfer or storage | etcdutl snapshot status hash differs from source |
| etcdutl binary presence | etcd 3.6 requires etcdutl for restore and status | Binary missing from host or container image |
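The 75% DB size warning in the table is simple arithmetic on two byte counts. A minimal sketch follows, assuming you already have the current DB size (for example from the DB SIZE column of endpoint status or from etcd_mvcc_db_total_size_in_bytes) and the configured --quota-backend-bytes value. The function names and the 2 GiB quota in the usage comment are illustrative assumptions.

```shell
# Sketch: express DB size as a whole-number percentage of the backend quota.
# db_quota_pct SIZE_BYTES QUOTA_BYTES -> prints the percentage, rounded down
db_quota_pct() {
  local size=$1 quota=$2
  echo $(( size * 100 / quota ))
}

# db_over_threshold SIZE_BYTES QUOTA_BYTES [THRESHOLD_PCT]
# Exits 0 when usage is strictly above the threshold (default 75).
db_over_threshold() {
  local size=$1 quota=$2 threshold=${3:-75}
  [ "$(db_quota_pct "$size" "$quota")" -gt "$threshold" ]
}

# Usage, assuming a 2 GiB quota (2147483648 bytes):
#   db_over_threshold "$DB_SIZE_BYTES" 2147483648 && echo "warn: compact and defrag soon"
```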
Fixes
If the snapshot is corrupted or fails integrity checks
Re-take the snapshot from a healthy member. Do not attempt to restore a file that fails etcdutl snapshot status. If corruption occurs during transfer, verify network paths and compare SHA-256 sums at source and destination.
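Comparing digests at source and destination can be scripted in a few lines. This is a sketch under stated assumptions: file_sha256 and same_digest are hypothetical helper names, and sha256sum is assumed to be the GNU coreutils binary (on macOS, shasum -a 256 is the usual substitute).

```shell
# Sketch: compare SHA-256 digests of a snapshot and its transferred copy.

# file_sha256 PATH -> prints only the hex digest of the file
file_sha256() {
  sha256sum "$1" | awk '{print $1}'
}

# same_digest SRC DST -> exits 0 only if both files hash identically
same_digest() {
  [ "$(file_sha256 "$1")" = "$(file_sha256 "$2")" ]
}

# Usage:
#   same_digest /backup/snapshot.db /restore/snapshot.db || echo "corrupt transfer, do not restore"
```

A matching digest only proves the copy equals the source. Run etcdutl snapshot status as well, since the source itself may already be damaged.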
If NOSPACE blocks the backup
Run compaction and defragmentation during a low-traffic window.
REV=$(ETCDCTL_API=3 etcdctl --endpoints=https://127.0.0.1:2379 \
--cacert=/etc/kubernetes/pki/etcd/ca.crt \
--cert=/etc/kubernetes/pki/etcd/healthcheck-client.crt \
--key=/etc/kubernetes/pki/etcd/healthcheck-client.key \
endpoint status --write-out=json | jq -r '.[0].Status.header.revision')
# WARNING: compaction permanently removes all historical revisions older than $REV
ETCDCTL_API=3 etcdctl --endpoints=https://127.0.0.1:2379 \
--cacert=/etc/kubernetes/pki/etcd/ca.crt \
--cert=/etc/kubernetes/pki/etcd/healthcheck-client.crt \
--key=/etc/kubernetes/pki/etcd/healthcheck-client.key \
compact "$REV"
ETCDCTL_API=3 etcdctl --endpoints=https://127.0.0.1:2379 \
--cacert=/etc/kubernetes/pki/etcd/ca.crt \
--cert=/etc/kubernetes/pki/etcd/healthcheck-client.crt \
--key=/etc/kubernetes/pki/etcd/healthcheck-client.key \
defrag --cluster
After resolving the space issue, disarm the alarm explicitly:
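ETCDCTL_API=3 etcdctl --endpoints=https://127.0.0.1:2379 \
--cacert=/etc/kubernetes/pki/etcd/ca.crt \
--cert=/etc/kubernetes/pki/etcd/healthcheck-client.crt \
--key=/etc/kubernetes/pki/etcd/healthcheck-client.key \
alarm disarm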
Then retry the snapshot.
If etcd 3.6 tooling is missing
Migrate all restore and status scripts to etcdutl. If you run etcd as a kubeadm static pod and the container image does not include etcdutl, download the matching etcd release tarball on the host and run etcdutl from the host path.
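A runbook can check ahead of time whether a given etcd version falls on the etcdutl-only side of the 3.6 boundary. The helper below is a sketch: needs_etcdutl is a hypothetical name, and the usage comment assumes etcdctl version prints a line of the form "etcdctl version: X.Y.Z".

```shell
# Sketch: decide whether a version string is 3.6 or newer.
# needs_etcdutl VERSION -> exits 0 when restore and status require etcdutl
needs_etcdutl() {
  local version=${1#v}         # tolerate a leading "v", as in v3.6.0
  local major=${version%%.*}   # text before the first dot
  local rest=${version#*.}     # text after the first dot
  local minor=${rest%%.*}      # second numeric component
  [ "$major" -gt 3 ] || { [ "$major" -eq 3 ] && [ "$minor" -ge 6 ]; }
}

# Usage (the awk pattern is an assumption about the output format):
#   ver=$(etcdctl version | awk '/etcdctl version/ {print $3}')
#   if needs_etcdutl "$ver" && ! command -v etcdutl >/dev/null; then
#     echo "etcdutl missing: restore will not work on this host"
#   fi
```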
If restore fails silently or writes no data
Use etcdutl snapshot restore directly instead of delegating through etcdctl.
If HA restore causes split-brain or quorum loss
Restore each member independently with its own --name and --initial-advertise-peer-urls, but pass the exact same --initial-cluster and --initial-cluster-token on every node. For etcd 3.6, --initial-cluster-token is mandatory. Do not restore to one member and attempt to rejoin the others with the old topology.
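One way to keep the shared flags identical is to generate every member's restore command from a single definition. The sketch below only prints the commands so they can be reviewed before anything runs. The member names, peer URLs, and token value are hypothetical placeholders, and restore_cmd is an illustrative helper, not part of etcd.

```shell
# Sketch: build per-member restore commands from one shared topology.
# Hypothetical three-member cluster; replace names, URLs, and token with your own.
INITIAL_CLUSTER="cp1=https://10.0.0.1:2380,cp2=https://10.0.0.2:2380,cp3=https://10.0.0.3:2380"
CLUSTER_TOKEN="restored-cluster-1"

# restore_cmd NAME PEER_URL -> prints the restore command for one member.
# Only --name and --initial-advertise-peer-urls vary between members.
restore_cmd() {
  local name=$1 peer_url=$2
  printf 'etcdutl snapshot restore snapshot.db --name %s --initial-cluster %s --initial-cluster-token %s --initial-advertise-peer-urls %s --data-dir /var/lib/etcd\n' \
    "$name" "$INITIAL_CLUSTER" "$CLUSTER_TOKEN" "$peer_url"
}

# Dry run: print one command per member for review.
restore_cmd cp1 https://10.0.0.1:2380
restore_cmd cp2 https://10.0.0.2:2380
restore_cmd cp3 https://10.0.0.3:2380
```

Each printed command is meant to be run on its own node against the same snapshot file, so the topology and token cannot drift between members.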
If restoring to a Kubernetes cluster
Include revision bump flags to invalidate stale watcher caches:
etcdutl snapshot restore snapshot.db \
--bump-revision 1000000000 \
--mark-compacted \
--data-dir /var/lib/etcd
Without this, API server informers may serve pre-snapshot state.
If raw db files lack a hash
Snapshots taken with etcdctl snapshot save include a hash. Raw database files copied from member/snap/ do not. If you must restore from a raw file, pass --skip-hash-check to etcdutl snapshot restore. This bypasses integrity verification and is a last resort.
Prevention
- Automate snapshots and verify them. Schedule etcdctl snapshot save via a systemd timer or Kubernetes CronJob, then immediately run etcdutl snapshot status. Push the snapshot to object storage and re-verify after download.
- Monitor DB size trends. Track etcd_mvcc_db_total_size_in_bytes against your quota. Enable automatic compaction and schedule periodic defragmentation so the DB does not approach the alarm threshold.
- Keep tooling current. etcd 3.6 removed restore and status from etcdctl. Update runbooks and automation to use etcdutl before you need it.
- Test restores quarterly. A verified snapshot is only useful if the restore procedure works. Perform a full restore to a temporary environment or isolated nodes to confirm flag correctness and revision bump behavior.
- Protect static pod manifests. On kubeadm clusters, moving manifests out of /etc/kubernetes/manifests/ is required before stopping etcd for a restore. Document this step in your runbook so the kubelet does not restart etcd prematurely.
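The snapshot-then-verify step from the first item could look roughly like the function below, which a systemd timer or CronJob would call. It is a sketch under assumptions: backup_etcd is a hypothetical name, the file naming scheme is arbitrary, and endpoint and TLS settings are expected to come from the ETCDCTL_ENDPOINTS, ETCDCTL_CACERT, ETCDCTL_CERT, and ETCDCTL_KEY environment variables rather than flags. Upload to object storage is left out.

```shell
# Sketch: take a snapshot, verify it, and record its checksum.
# backup_etcd DIR -> prints the path of the verified snapshot on success
backup_etcd() {
  local dir=$1
  local snap="$dir/etcd-$(date +%Y%m%dT%H%M%S).db"

  # Send tool output to stderr so stdout carries only the final path.
  etcdctl snapshot save "$snap" >&2 || return 1

  # Refuse to keep a snapshot that fails the integrity check.
  if ! etcdutl snapshot status "$snap" >/dev/null; then
    rm -f "$snap"
    return 1
  fi

  # Store the digest next to the file for comparison after transfer.
  sha256sum "$snap" > "$snap.sha256"
  echo "$snap"
}

# Usage:
#   snap=$(backup_etcd /var/backups/etcd) || echo "backup failed" >&2
```

After downloading a copy elsewhere, sha256sum -c against the stored .sha256 file, followed by etcdutl snapshot status, repeats the verification on the destination side.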
How Netdata helps
Use Netdata to correlate the following before scheduling snapshots:
- etcd_mvcc_db_total_size_in_bytes against compaction schedules to predict the next NOSPACE alarm.
- etcd_disk_wal_fsync_duration_seconds spikes to avoid I/O-bound backup windows.
- etcd_server_leader_changes_seen_total to skip periods of Raft instability.
- API server 5xx rate and latency as downstream indicators of etcd distress that can degrade snapshot consistency.
Related guides
- For etcd latency cascades affecting the API server, see Kubernetes API server etcd latency: detection and cascading failures.
- For API server memory pressure during control plane recovery, see Kubernetes API server memory pressure: OOM cycle and tuning.
- For API server unresponsiveness that can complicate etcd recovery, see Kubernetes API server slow or unresponsive: causes and fixes.
- For certificate issues that can block etcd client authentication, see Kubernetes API server certificate rotation: detection and grace handling.






