Kubernetes imagePullSecrets: configuration, propagation, and rotation
A pod stuck in ImagePullBackOff is rarely the result of a missing or incorrect image tag. More often, the kubelet lacks valid registry credentials. Kubernetes uses imagePullSecrets, namespaced Secrets of type kubernetes.io/dockerconfigjson, to inject registry auth into a pod. These can be attached directly to the pod spec or propagated through a ServiceAccount.
This guide covers propagation from registry to kubelet, verification at each link, and rotation without forcing a rolling restart of every workload.
What this means
The kubelet pulls images on behalf of the pod. For private registries, it needs the equivalent of a Docker config file (~/.docker/config.json), stored as a Secret of type kubernetes.io/dockerconfigjson in the same namespace as the pod. The kubelet reads the Secrets listed in the pod spec; Secrets listed on the pod’s ServiceAccount reach the pod spec by being injected when the pod is created.
Attach the Secret to a ServiceAccount and every pod using that account inherits it automatically. This is the standard pattern for multi-pod authentication. Attach it directly to the pod spec and only that pod receives it. Namespace boundaries are strict: a Secret in namespace A cannot be used by a pod in namespace B. Static pods defined on the node filesystem cannot reference API server secrets; they must rely on kubelet credential provider plugins.
flowchart TD
A[Registry credentials] --> B[Secret type dockerconfigjson]
B --> C{Attached to ServiceAccount?}
C -->|Yes| D[ServiceAccount imagePullSecrets]
C -->|No| E[Pod spec imagePullSecrets]
D --> F[Pod inherits secrets]
E --> F
F --> G[Kubelet pulls image]
G --> H{Auth valid?}
H -->|Yes| I[Container starts]
H -->|No| J[ImagePullBackOff]
Common causes
| Cause | What it looks like | First thing to check |
|---|---|---|
| Secret name mismatch | ImagePullBackOff with secret not found in events | kubectl get secrets -n <ns> |
| Wrong registry hostname in secret | Pull fails with 401 even though the Secret exists | Decode secret and compare registry URL to image prefix |
| Expired credentials | Intermittent pulls failing after months of stability | Secret creation timestamp and registry token expiry |
| ServiceAccount not patched | New pods fail despite Secret existing | kubectl get sa <sa> -n <ns> -o yaml |
| Pod uses wrong ServiceAccount | Same as above, but default SA is unpatched | Pod spec serviceAccountName |
| Namespace isolation | Secret exists but is in a different namespace | Secret namespace vs pod namespace |
| Static pod limitation | Static pod in /etc/kubernetes/manifests/ cannot pull private image | Check if pod is static; use credential provider plugin instead |
Quick checks
# Check if the secret exists and has the right type
kubectl get secret regcred -n <namespace> -o jsonpath='{.type}'
# Inspect the decoded .dockerconfigjson
kubectl get secret regcred -n <namespace> -o jsonpath="{.data['.dockerconfigjson']}" | base64 -d
# Check which imagePullSecrets are attached to a ServiceAccount
kubectl get serviceaccount default -n <namespace> -o jsonpath='{.imagePullSecrets}'
# Check which ServiceAccount a pod is using
kubectl get pod <pod> -n <namespace> -o jsonpath='{.spec.serviceAccountName}'
# Check kubelet credential provider flags on the node
ps aux | grep kubelet | grep -E 'image-credential-provider'
# Check pod events for the specific pull error
kubectl get events --field-selector involvedObject.name=<pod> -n <namespace> --sort-by='.lastTimestamp'
How to diagnose it
- Read the event message. Failed to pull image with an authentication error means the problem is credentials, not the image tag or network.
- Verify the referenced Secret exists in the same namespace. If it does not, create it or correct the reference.
- Decode the Secret’s .dockerconfigjson and confirm the registry hostname matches the image prefix exactly. A secret for registry.example.com will not authenticate for registry.example.com:5000 unless the port is included.
- Check the ServiceAccount. If the pod uses the default ServiceAccount and that account has no imagePullSecrets, credentials will not be injected. Patch the ServiceAccount, then delete the pod to recreate it.
- Verify credential freshness. Old secrets may have expired tokens. Test the credential with docker login or the registry CLI.
- Check the node type. If the pod is a static pod, it cannot use imagePullSecrets. Look for kubelet credential provider plugin configuration instead.
- If multiple imagePullSecrets reference the same registry, the container runtime tries them in list order. Overlapping credentials between ServiceAccount and pod sources can create ambiguity. Reduce to one secret per registry.
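The hostname comparison in the third step can be sketched as a small shell helper. This is a simplified illustration and not the kubelet's own matching logic, which per the Kubernetes documentation also considers paths and glob patterns. The function name image_registry is hypothetical, and the rule it applies is the common Docker reference convention: a first path component that contains a dot or a colon, or equals localhost, is treated as the registry host, and anything else is assumed to default to docker.io.

```shell
# image_registry: print the registry host (including any port) implied by an
# image reference. Hypothetical helper; simplified Docker-style parsing.
image_registry() (
  ref=$1
  case $ref in
    */*) first=${ref%%/*} ;;            # text before the first slash
    *)   echo docker.io; exit 0 ;;      # no slash: bare name such as nginx:1.25
  esac
  case $first in
    *.*|*:*|localhost) echo "$first" ;; # looks like a hostname
    *)                 echo docker.io ;; # e.g. library/nginx
  esac
)
```

Running image_registry registry.example.com:5000/team/app:1.2 prints registry.example.com:5000. Compare that value with the keys under auths in the decoded Secret; a key of registry.example.com without the port would not match under an exact comparison.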
Metrics and signals to monitor
| Signal | Why it matters | Warning sign |
|---|---|---|
| Pod phase Pending with ImagePullBackOff | Direct indicator of pull failure | Any pod in this state longer than 5 minutes |
| kubelet_runtime_operations_errors_total{operation_type="pull_image"} | Kubelet is failing to pull images | Sustained increase on a node |
| kubelet_image_pull_duration_seconds | Registry auth or network latency is degrading | p99 latency trending upward |
| Pod startup latency | Image pull is blocking deployment velocity | p99 startup time > 30s during rollouts |
| Container restart count | CrashLoopBackOff from fallback to missing image | Restart count increasing after pull failures |
| Warning events Failed | Kubernetes is surfacing the exact error | Spike in events with reason Failed and message containing Failed to pull image |
Fixes
If the secret is missing or misnamed
Create a new kubernetes.io/dockerconfigjson Secret in the correct namespace with --from-file=.dockerconfigjson=<path> or by constructing the JSON manually. Update the pod spec or ServiceAccount to reference the exact Secret name.
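For the "constructing the JSON manually" route, the payload is a small JSON document keyed by registry host. The sketch below builds one for a single registry. The function name make_dockerconfigjson, the file name config.json, and the Secret name regcred are illustrative, and the helper assumes the values contain no double quotes or backslashes because it does no JSON escaping.

```shell
# make_dockerconfigjson: print a .dockerconfigjson payload for one registry.
# Hypothetical helper. Assumes the arguments contain no double quotes or
# backslashes (no JSON escaping is performed).
make_dockerconfigjson() (
  registry=$1 user=$2 pass=$3
  # "auth" is base64 of "user:password"; tr removes any line wrapping.
  auth=$(printf '%s:%s' "$user" "$pass" | base64 | tr -d '\n')
  printf '{"auths":{"%s":{"username":"%s","password":"%s","auth":"%s"}}}\n' \
    "$registry" "$user" "$pass" "$auth"
)

# Usage (placeholders in angle brackets):
#   make_dockerconfigjson registry.example.com:5000 <user> <password> > config.json
#   kubectl create secret generic regcred -n <namespace> \
#     --type=kubernetes.io/dockerconfigjson \
#     --from-file=.dockerconfigjson=config.json
```

Include the port in the registry argument whenever the image reference includes one, so the key matches the image prefix.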
If the ServiceAccount is not propagating the secret
Patch the ServiceAccount to include the secret in its imagePullSecrets array. Existing pods do not pick up the change automatically. Delete the pods to recreate them, or wait for natural churn.
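A minimal sketch of the patch step follows. The helper name set_sa_pull_secrets and the KUBECTL override are hypothetical conveniences. The sketch assumes the patch replaces the ServiceAccount's whole imagePullSecrets list rather than merging into it, so pass every Secret name the account should keep.

```shell
# set_sa_pull_secrets: set the imagePullSecrets list on a ServiceAccount.
# Hypothetical helper. Set KUBECTL=echo to preview the command instead of
# running it against a cluster.
set_sa_pull_secrets() (
  sa=$1 ns=$2
  shift 2
  items=
  for name in "$@"; do
    # Append {"name":"..."} entries, comma-separated.
    items="${items:+$items,}{\"name\":\"$name\"}"
  done
  "${KUBECTL:-kubectl}" patch serviceaccount "$sa" -n "$ns" \
    -p "{\"imagePullSecrets\":[$items]}"
)

# Usage (placeholders in angle brackets):
#   set_sa_pull_secrets default <namespace> regcred
#   kubectl delete pod <pod> -n <namespace>   # recreate so the pod picks it up
```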
If credentials expired
Rotate the password or token at the registry first. Then update the Secret. If the pod spec references the Secret directly, kubelet reads the updated Secret on the next pull attempt. If the image is already cached on the node, kubelet does not re-pull and will not verify the new credential against the registry until the image is evicted or a new node is used. Plan rotations before expiry or force a pod reschedule to a clean node to validate.
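One common way to do the "update the Secret" step is to regenerate the manifest client-side and apply it over the existing object. The sketch below assumes a username and password credential. The helper name update_pull_secret and the KUBECTL override are hypothetical, and passing the password as an argument exposes it to shell history and process listings, so treat this as an illustration rather than a production script.

```shell
# update_pull_secret: regenerate a docker-registry Secret and apply it in place.
# Hypothetical helper; KUBECTL can be overridden for previewing or testing.
update_pull_secret() (
  name=$1 ns=$2 server=$3 user=$4 pass=$5
  k=${KUBECTL:-kubectl}
  # --dry-run=client renders the manifest locally; apply then updates the
  # existing object (or creates it if it is missing).
  "$k" create secret docker-registry "$name" -n "$ns" \
    --docker-server="$server" \
    --docker-username="$user" \
    --docker-password="$pass" \
    --dry-run=client -o yaml |
    "$k" apply -n "$ns" -f -
)

# Usage (placeholders in angle brackets):
#   update_pull_secret regcred <namespace> registry.example.com:5000 <user> <new-password>
```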
If you are running static pods
Move registry authentication to a kubelet credential provider plugin configured via the kubelet flags --image-credential-provider-bin-dir and --image-credential-provider-config. Static pods cannot read API server secrets.
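The file passed to --image-credential-provider-config is a CredentialProviderConfig document. The sketch below writes a minimal one. The provider name example-credential-provider, the matchImages entry, and the cache duration are placeholders, and the v1 API versions are an assumption that fits recent Kubernetes releases; older clusters may need a beta version instead.

```shell
# write_credential_provider_config: write a minimal CredentialProviderConfig
# to the given path. Hypothetical helper; every value in the YAML is a placeholder.
write_credential_provider_config() (
  out=$1
  cat > "$out" <<'EOF'
apiVersion: kubelet.config.k8s.io/v1
kind: CredentialProviderConfig
providers:
  # "name" must match an executable in --image-credential-provider-bin-dir.
  - name: example-credential-provider
    apiVersion: credentialprovider.kubelet.k8s.io/v1
    matchImages:
      - "registry.example.com"
    defaultCacheDuration: "12h"
EOF
)

# Usage:
#   write_credential_provider_config <path-to-config>.yaml
```

The kubelet has to be restarted with both flags set before it invokes the plugin for matching images.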
If you cannot restart pods immediately during rotation
Create a second Secret with the new credentials. Patch the ServiceAccount to reference both secrets temporarily. Wait for natural pod churn to propagate the new credential, then remove the old secret from the ServiceAccount.
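Before removing the old secret, it helps to know which pods were created before the new one was attached. The following sketch lists pods whose spec does not reference a given Secret name. The helper name pods_missing_secret and the KUBECTL override are hypothetical, and the check assumes that pods created after the patch carry the new name in spec.imagePullSecrets.

```shell
# pods_missing_secret: print pods in a namespace whose spec.imagePullSecrets
# does not include the given Secret name. Hypothetical helper.
pods_missing_secret() (
  ns=$1 secret=$2
  # One line per pod: <name><TAB><space-separated secret names>
  "${KUBECTL:-kubectl}" get pods -n "$ns" \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.imagePullSecrets[*].name}{"\n"}{end}' |
    awk -F '\t' -v want="$secret" '
      {
        found = 0
        n = split($2, names, " ")
        for (i = 1; i <= n; i++) if (names[i] == want) found = 1
        if (!found) print $1
      }'
)

# Usage (placeholders in angle brackets):
#   pods_missing_secret <namespace> <new-secret-name>
```

An empty result suggests churn has replaced every older pod, at which point the old secret can be dropped from the ServiceAccount.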
Prevention
Attach imagePullSecrets to ServiceAccounts, not individual pods. This centralizes credential management and reduces drift. Use a namespace-scoped Secret per registry. Do not use the discouraged kubernetes.io/dockercfg type.
Prefer kubelet credential provider plugins for cloud registries instead of long-lived secrets where possible. This removes the Secret from the API server entirely and delegates authentication to the node.
Implement a rotation runbook that creates new secrets before invalidating old ones. Avoid in-place Secret updates that rely on kubelet re-reading the same object during runtime, because cached images may mask credential expiry until a new node is provisioned.
Monitor kubelet_runtime_operations_errors_total and pod ImagePullBackOff rates as leading indicators. Set alerts on sustained image pull failures rather than single pod restarts.
How Netdata helps
- Correlates kubelet image pull error rates with pod restart counts to distinguish registry auth failures from application crashes.
- Surfaces pod startup latency spikes during secret rotation events.
- Tracks node-level ImagePullBackOff pod phases alongside CRI operation latency to pinpoint whether the bottleneck is registry auth, network, or disk I/O.
- Alerts on sustained increases in kubelet_runtime_operations_errors_total for pull operations before workloads enter CrashLoopBackOff.
Related guides
- Kubernetes API server certificate rotation: detection and grace handling
- Kubernetes API server etcd latency: detection and cascading failures
- Kubernetes API server memory pressure: OOM cycle and tuning
- Kubernetes API server rate limiting: APF priority levels and starvation
- Kubernetes API server slow or unresponsive: causes and fixes
- Kubernetes API server watch storm: re-list cascades and connection floods
- Kubernetes conntrack exhaustion: dropped connections under load
- Kubernetes controller-manager leader election failures
- Kubernetes CSI driver failures: detection, recovery, and version skew
- Kubernetes DaemonSet pods Pending: scheduling and tolerations
- Kubernetes Deployment rollout stuck: stalled rollouts and ready replicas
- Kubernetes DNS resolution failures inside pods






