Kubernetes secrets mount failures: detection and recovery

Pods that reference a Secret volume can hang in ContainerCreating for minutes, crash on startup with missing files, or silently run with stale credentials. The kubelet mounts Secret data as a tmpfs-backed volume and validates the Secret object and its requested keys before any container starts. If the Secret does not exist, if a specific key is absent, or if the kubelet cannot reach the API server, the mount fails. Non-optional mounts block every container from starting. Optional mounts allow startup but leave the mount point empty, which can cause failures later.

What this means

A Secret volume mount is not a lazy file read. The kubelet fetches the Secret object from the API server during volume setup, decodes the base64-encoded values in its data field, and writes them into a tmpfs directory that is attached to the container’s filesystem. (stringData is a write-only convenience field; the API server merges it into data when the Secret is stored, so the kubelet never reads it.) Because tmpfs is RAM-backed, the data is never written to non-volatile node storage and is deleted when the Pod is removed. If the Pod references a specific key via items[], the kubelet validates that the key exists in the Secret. If the Pod uses subPath to mount an individual key, that mount never receives updates when the Secret changes, regardless of the kubelet’s change detection strategy. For projected volumes, secret sources use name, not secretName, and defaultMode belongs at the top level of the projected volume. Set individual file permissions inside a Secret projection with items[].mode.
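
The fields described above can be sketched as a small manifest. This is a minimal illustration, assuming a hypothetical Secret named db-credentials with a key named password; the Pod name, image, and paths are placeholders. The script only writes the manifest to a file and does not touch a cluster.

```shell
#!/usr/bin/env bash
# Writes an example Pod manifest that mounts one key of a Secret as a file.
# All names (secret-mount-demo, db-credentials, password) are hypothetical.
manifest="secret-mount-demo.yaml"

cat > "$manifest" <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: secret-mount-demo
spec:
  containers:
    - name: app
      image: busybox:1.36
      command: ["sh", "-c", "ls -l /etc/creds && sleep 3600"]
      volumeMounts:
        - name: creds
          mountPath: /etc/creds   # whole directory, no subPath, so updates propagate
          readOnly: true
  volumes:
    - name: creds
      secret:
        secretName: db-credentials   # plain secret volumes use secretName
        optional: false              # a missing Secret or key blocks startup
        defaultMode: 0400
        items:
          - key: password            # must exist in the Secret's data map
            path: db-password        # file name under /etc/creds
EOF

echo "wrote $manifest"
```

Running kubectl apply --dry-run=server -f secret-mount-demo.yaml checks the schema without creating the Pod. It does not confirm that the Secret itself exists.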

Common causes

| Cause | What it looks like | First thing to check |
|---|---|---|
| Secret does not exist | Pod stuck in ContainerCreating; events reference volume setup failures | kubectl get secret <name> -n <ns> |
| Requested key is absent from Secret | Pod stuck in ContainerCreating when the key is listed in items[]; otherwise init containers or apps exit with file-not-found errors | kubectl get secret <name> -n <ns> -o jsonpath='{.data}' |
| Secret created after Pod with optional: false | Pod remains in ContainerCreating until the kubelet's next mount retry after the Secret appears; creating the Secret does not trigger the mount by itself | Pod spec optional field and Secret creation timestamp |
| subPath mount of a Secret key | Credential updates never propagate to running containers; restart appears to be the only fix | Pod spec subPath usage |
| Projected volume misconfiguration | Field errors or silent permission issues after upgrades | secretName vs name, defaultMode placement |
| CSI driver token field rollout skew (v1.35) | Volume mounts fail on nodes with older driver after enabling serviceAccountTokenInSecrets | CSIDriver spec vs DaemonSet rollout order |
| Kubelet API server connectivity loss | Multiple volume mounts fail, not just Secrets; node may go NotReady | Kubelet logs and API server latency from the node |
| Windows runAsUser in securityContext | Pod stays permanently in ContainerCreating on Windows nodes | securityContext in Pod spec |

Quick checks

# Check Pod events for volume mount failures
kubectl describe pod <pod-name> -n <namespace> | grep -A 10 Events

# Verify the Secret exists in the correct namespace
kubectl get secret <secret-name> -n <namespace>

# Verify the requested keys exist inside the Secret
kubectl get secret <secret-name> -n <namespace> -o json | jq '.data | keys'

# Check if the Secret mount is optional
kubectl get pod <pod-name> -n <namespace> -o jsonpath='{.spec.volumes[*].secret.optional}'

# Inspect kubelet logs for storage operation errors
journalctl -u kubelet --since "10 minutes ago" | grep -iE "mount|volume|secret"

# Check kubelet API server request latency (replace <node-name>)
kubectl get --raw "/api/v1/nodes/<node-name>/proxy/metrics" | grep rest_client_request_duration_seconds

# Check if subPath is used for the Secret volume
kubectl get pod <pod-name> -n <namespace> -o jsonpath='{.spec.containers[*].volumeMounts[?(@.name=="<vol-name>")].subPath}'

# Check projected volume field names and defaultMode placement
kubectl get pod <pod-name> -n <namespace> -o jsonpath='{.spec.volumes[*].projected}' | jq .

How to diagnose it

  1. Check the Pod phase and events. If the Pod is stuck in ContainerCreating and events show FailedMount, the kubelet is failing to set up the volume. Read the event message to determine whether the failure is a missing Secret, a missing key, or a timeout talking to the API server.
  2. Verify Secret existence and namespace. A Secret is namespaced. If the Pod references a Secret in a different namespace from where it was created, the kubelet cannot find it. Run kubectl get secret in the Pod’s namespace.
  3. Verify key presence. If the Pod spec uses items[] to project specific keys, each key must exist in the Secret’s data map. A missing key causes a mount failure for non-optional Secrets.
  4. Check the optional flag. If optional: true is set, the kubelet silently ignores a missing Secret and the Pod starts with an empty mount directory. If your application expects a file and finds nothing, check whether the mount was optional.
  5. Inspect for subPath usage. When a Secret is mounted via subPath, the kubelet bind-mounts the file at container startup. If the Secret is updated later, the running container continues to see the old file contents. This is a documented limitation. If credentials rotated but the app sees stale data, subPath is the likely cause.
  6. Validate projected volume syntax. Projected volume secret sources use name; secretName is only valid for standard secret volumes. Ensure defaultMode is set at the projected volume level. Use items[].mode inside a Secret projection to override permissions for individual files.
  7. Check kubelet API connectivity. If multiple Pods on the same node fail to mount Secrets, ConfigMaps, or other API-backed volumes, the kubelet may be unable to reach the API server. Check rest_client_request_duration_seconds from the kubelet metrics endpoint and verify node certificate health.
  8. Review CSI driver rollout order if using v1.35 token fields. If you recently enabled serviceAccountTokenInSecrets: true on a CSIDriver, confirm the DaemonSet rollout completed on all nodes before the spec change. Nodes running the old driver may fail mounts because the old driver ignores the Secrets field.

The same sequence as a Mermaid flowchart:

flowchart TD
    A[Pod stuck in ContainerCreating] --> B{Pod events show FailedMount?}
    B -->|Yes| C[Check Secret exists in Pod namespace]
    B -->|No| D[Check container exit codes and logs]
    C --> E{Secret missing?}
    E -->|Yes| F[Create Secret or fix reference]
    E -->|No| G{Requested key missing?}
    G -->|Yes| H["Add key to Secret or fix items[]"]
    G -->|No| I{Multiple volume types failing?}
    I -->|Yes| J[Check kubelet API connectivity and certs]
    I -->|No| K{Using projected volume?}
    K -->|Yes| L[Check name vs secretName and defaultMode]
    K -->|No| M{Using subPath?}
    M -->|Yes| N[Restart Pod to pick up updates]
    M -->|No| O[Check CSI driver rollout order v1.35]
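
The first branches of this decision path can be sketched as a small helper that sorts FailedMount event messages into likely causes. The function name and the message substrings it matches are assumptions for illustration; kubelet wording varies by version, so adjust the patterns to what your cluster emits.

```shell
#!/usr/bin/env bash
# Maps a FailedMount event message to a likely cause.
# The matched substrings are assumptions, not a stable kubelet interface.
classify_mount_event() {
  local msg="$1"
  case "$msg" in
    *"references non-existent secret key"*)
      echo "missing-key" ;;
    *secret*"not found"*)
      echo "missing-secret" ;;
    *"failed to sync secret cache"*|*"connection refused"*|*"i/o timeout"*)
      echo "api-connectivity" ;;
    *)
      echo "unknown" ;;
  esac
}
```

One possible use is to feed it the messages from kubectl get events -n <namespace> --field-selector reason=FailedMount, one line at a time, and treat any "unknown" result as a prompt to read the event by hand.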

Metrics and signals to monitor

| Signal | Why it matters | Warning sign |
|---|---|---|
| Pod phase ContainerCreating duration | Direct indicator of volume mount stalls | Sustained > 60 seconds for Secrets-backed Pods |
| storage_operation_errors_total | Count of kubelet volume operation failures | Any sustained increase correlated with Secret mounts |
| storage_operation_duration_seconds | Latency of attach and mount operations | p99 > 30 seconds on nodes running API-backed volumes |
| Kubelet rest_client_request_duration_seconds | API server latency from the node’s perspective | p99 > 5 seconds; risk of Secret fetch timeouts |
| Container restart count / CrashLoopBackOff | Application crashing because expected Secret files are absent | Restarts increasing after Pod scheduling |
| Kubelet certificate TTL | Expired client certs prevent API server authentication | Client certificate TTL < 7 days |
| Node Ready condition | API connectivity loss surfaces as node-level degradation | Ready=Unknown or flapping Ready status |

Fixes

If the Secret or key is missing

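Create the Secret in the correct namespace, or correct the Secret name in the Pod template of the owning workload (for example a Deployment). The volumes of an existing Pod are immutable, so a bare Pod must be deleted and recreated with the corrected spec. If the Pod uses items[] to select specific keys, ensure every listed key exists in the Secret’s data map. The kubelet retries the mount periodically, so a Pod usually recovers on its own once the Secret and its keys exist; if it has been stuck for several minutes, delete and recreate it to force a fresh mount attempt.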

If the cause is a subPath stale mount

A subPath mount of a Secret key cannot be updated without container restart. If you need live credential rotation, remove subPath and mount the entire Secret directory. If the application requires a specific file path, use an init container to create a symlink or modify the application to read from the mounted directory. If removal is not possible, schedule a rolling restart of the workload after Secret updates.
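
Where subPath has to stay, the rolling restart mentioned above could be scripted roughly as follows. This is a sketch that assumes the workload is a Deployment; the function name is hypothetical and the Deployment name and namespace are supplied by the caller.

```shell
#!/usr/bin/env bash
# Restarts a Deployment so its Pods remount a Secret that is consumed via subPath.
# Assumes the workload is a Deployment; name and namespace come from the caller.
restart_after_secret_update() {
  local deployment="$1" namespace="$2"
  kubectl rollout restart "deployment/${deployment}" -n "$namespace" || return 1
  # Block until the replacement Pods are ready, or fail after the timeout.
  kubectl rollout status "deployment/${deployment}" -n "$namespace" --timeout=5m
}
```

Calling restart_after_secret_update <deployment> <namespace> right after the Secret is updated keeps the two steps together, so a rotation is not left half applied.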

If the cause is projected volume misconfiguration

Update the manifest so projected volume secret sources use name. Move defaultMode to the top-level projected volume spec. Use items[].mode inside a Secret projection to override permissions for individual files. Validate the manifest against the target cluster version before applying.
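
A corrected projected volume might look like the manifest below. It is a minimal sketch with hypothetical names (projected-secret-demo, db-credentials, password), and the script only writes the file so it can be reviewed or dry-run against the target cluster.

```shell
#!/usr/bin/env bash
# Writes an example Pod manifest that uses a projected volume with a Secret source.
# All object names, keys, and paths are hypothetical.
manifest="projected-secret-demo.yaml"

cat > "$manifest" <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: projected-secret-demo
spec:
  containers:
    - name: app
      image: busybox:1.36
      command: ["sleep", "3600"]
      volumeMounts:
        - name: creds
          mountPath: /etc/creds
          readOnly: true
  volumes:
    - name: creds
      projected:
        defaultMode: 0440            # set once, at the projected volume level
        sources:
          - secret:
              name: db-credentials   # projected sources identify the Secret with name
              optional: false
              items:
                - key: password
                  path: db-password
                  mode: 0400         # per-file override of defaultMode
EOF

echo "wrote $manifest"
```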

If the cause is kubelet API connectivity

If the kubelet cannot reach the API server, Secret fetches will fail alongside other API-backed volumes. Check the node’s client certificate expiration. The typical path is /var/lib/kubelet/pki/kubelet-client-current.pem:

openssl x509 -in /var/lib/kubelet/pki/kubelet-client-current.pem -noout -dates

Verify the API server is reachable from the node. If certificates are expired, approve any pending CertificateSigningRequests and restart the kubelet.
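
The certificate check can be turned into a yes/no test with the -checkend option of openssl x509. The helper below is a sketch; the function name and the exit code used for an unreadable file are choices made for this example.

```shell
#!/usr/bin/env bash
# Returns 0 if the certificate expires within the given number of days,
# 1 if it stays valid beyond that window, and 2 if the file cannot be read.
cert_expires_within_days() {
  local cert="$1" days="$2"
  [ -r "$cert" ] || return 2
  # -checkend exits 0 when the cert is still valid after N seconds, so invert it.
  ! openssl x509 -in "$cert" -noout -checkend "$(( days * 86400 ))" >/dev/null
}
```

For example, cert_expires_within_days /var/lib/kubelet/pki/kubelet-client-current.pem 7 succeeds only when the kubelet client certificate has less than seven days left.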

If the cause is CSI driver token field skew (v1.35)

If you enabled serviceAccountTokenInSecrets: true on a CSIDriver before the DaemonSet rolled out everywhere, nodes with the old driver will fail mounts. Revert the CSIDriver spec change, complete the DaemonSet rollout on all nodes, then re-apply the CSIDriver spec change. Do not attempt both changes in a single rolling update.
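
The ordering can be enforced in a script so the field is only changed after the driver rollout has finished. This is a sketch under assumptions: the function name is hypothetical, the DaemonSet, namespace, and CSIDriver names come from the caller, and the patch assumes serviceAccountTokenInSecrets sits under the CSIDriver spec and can be changed in place, as the text above implies.

```shell
#!/usr/bin/env bash
# Enables serviceAccountTokenInSecrets only after the driver DaemonSet has rolled out.
# Assumption: the field lives under .spec of the CSIDriver and is mutable.
enable_token_in_secrets() {
  local daemonset="$1" namespace="$2" csidriver="$3"
  # Step 1: stop unless every node is running the updated driver Pod.
  kubectl rollout status "daemonset/${daemonset}" -n "$namespace" --timeout=10m || {
    echo "driver rollout incomplete; CSIDriver left unchanged" >&2
    return 1
  }
  # Step 2: change the field on the CSIDriver object.
  kubectl patch csidriver "$csidriver" --type=merge \
    -p '{"spec":{"serviceAccountTokenInSecrets":true}}'
}
```

Keeping both steps in one function makes the dependency explicit: if the rollout check fails or times out, the CSIDriver object is never patched.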

Prevention

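  • Validate manifests in CI. Run server-side dry-run applies to catch invalid projected volume syntax before deployment. The API server does not check that a referenced Secret exists when a Pod is created, so add a separate check or admission policy for missing Secrets.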
  • Use optional: true only when the application handles empty mounts gracefully. Do not make Secrets optional if the container crashes when the file is absent.
  • Avoid subPath for rotating Secrets. Mount the entire Secret volume when credentials change frequently, so the kubelet can update files without a Pod restart.
  • Monitor kubelet certificate expiration. Track client certificate TTL and alert at 30 days to prevent API authentication failures that block Secret retrieval.
  • Test projected volume changes across versions. Projected sources use name while plain secret volumes use secretName, which is easy to get wrong when converting one to the other; validate manifests, including defaultMode placement, against your cluster’s Kubernetes version.
  • Maintain API server path health from nodes. A node that cannot reach the API server loses the ability to mount any API-backed volume; include API latency and certificate health in node readiness criteria.
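
A CI step for the missing-Secret case could look like the sketch below. The function name is hypothetical, and how the Secret names are extracted from manifests is left to the pipeline; the helper only takes a namespace and a list of names on stdin.

```shell
#!/usr/bin/env bash
# Reads Secret names from stdin (one per line) and reports any that are
# missing from the given namespace. Returns non-zero if at least one is missing.
check_secret_refs() {
  local namespace="$1" missing=0 name
  while IFS= read -r name; do
    [ -n "$name" ] || continue
    # Redirect stdin so the command cannot consume the remaining names.
    if ! kubectl get secret "$name" -n "$namespace" </dev/null >/dev/null 2>&1; then
      echo "missing Secret: ${namespace}/${name}" >&2
      missing=$((missing + 1))
    fi
  done
  [ "$missing" -eq 0 ]
}
```

A pipeline might run printf '%s\n' <secret-name> ... | check_secret_refs <namespace> before applying manifests and fail the job on a non-zero exit.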

How Netdata helps

  • Correlate ContainerCreating duration with kubelet storage_operation_duration_seconds to identify whether delays are volume-level or runtime-level.
  • Alert on container restart spikes and CrashLoopBackOff states that follow Secret mount failures.
  • Track kubelet API server request latency and certificate TTL to catch connectivity issues before they block volume mounts.
  • Monitor node disk and memory pressure that can indirectly slow kubelet volume manager reconciliation.
  • Surface PLEG relist latency to distinguish general runtime stalls from Secret-specific mount errors.