Kubernetes pod Evicted: detection, root cause, and prevention
Pods with status Evicted are not application crashes. They are the kubelet’s emergency response to node-level resource pressure. When memory, disk, inodes, or PIDs approach exhaustion, the kubelet terminates pods to reclaim resources and protect node availability. The pod phase changes to Failed with reason Evicted, and the node reports conditions such as MemoryPressure or DiskPressure.
This guide covers node-pressure eviction triggered by the kubelet, not voluntary disruption from kubectl drain or PodDisruptionBudget enforcement.
What this means
The kubelet continuously evaluates eviction signals against hard or soft thresholds. Hard thresholds trigger immediate pod termination with no graceful shutdown. Soft thresholds pair a threshold with evictionSoftGracePeriod; the kubelet acts only after the condition has persisted for that grace period. If evictionMaxPodGracePeriod is set, the kubelet caps the pod termination grace period used during soft eviction at that value.
The default hard thresholds on Linux nodes are:
- memory.available < 100Mi
- nodefs.available < 10%
- imagefs.available < 15%
- nodefs.inodesFree < 5%
- imagefs.inodesFree < 5%
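As a rough illustration of how a percentage-based signal compares against its threshold, the sketch below defines a hypothetical helper, below_threshold_pct, that is not part of any Kubernetes tooling. It approximates the check with numbers you supply (for example from df); the kubelet derives its own values from internal stats, so treat the result as an estimate.

```shell
# below_threshold_pct AVAILABLE TOTAL THRESHOLD_PCT
# Exits 0 (true) when AVAILABLE/TOTAL, as a percentage, is below THRESHOLD_PCT.
# Hypothetical helper for illustration only; the kubelet uses its own stats.
below_threshold_pct() {
  awk -v avail="$1" -v total="$2" -v pct="$3" \
    'BEGIN { exit !((avail / total) * 100 < pct) }'
}

# Example on a node (df -Pk prints 1K blocks; column 4 = available, column 2 = size):
# set -- $(df -Pk /var/lib/kubelet | awk 'NR==2 { print $4, $2 }')
# below_threshold_pct "$1" "$2" 10 && echo "nodefs.available is under 10%"
```

The same function works for inode signals if you feed it the free and total inode counts from df -i.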
The kubelet ranks pods for eviction in two groups. First, BestEffort pods and Burstable pods whose memory usage exceeds their request are evicted, sorted by Priority then by how much usage exceeds the request. Last, Guaranteed pods and Burstable pods whose usage is below their request are evicted, sorted by Priority. QoS class influences this outcome because BestEffort pods always fall into the first group and Guaranteed pods into the second, but the actual mechanism is usage-versus-request and Priority.
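The ordering described above can be sketched with a small sort pipeline. This is a simplified model, not the kubelet's implementation: rank_for_eviction is a hypothetical name, and the input is a made-up list of "name priority usage request" lines (usage and request in the same unit).

```shell
# rank_for_eviction: reads "name priority usage request" lines on stdin and
# prints pod names in the order this simplified model would evict them.
# Sort keys: 1) pods over their request first, 2) lower Priority first,
# 3) larger (usage - request) first.
rank_for_eviction() {
  awk '{ over = $3 - $4; group = (over > 0) ? 0 : 1; print group, $2, over, $1 }' |
    sort -k1,1n -k2,2n -k3,3nr |
    awk '{ print $4 }'
}

# Illustrative input (values are invented):
# printf '%s\n' 'web 1000 300 200' 'batch 0 150 100' 'cache 0 500 100' \
#   'db 1000 400 400' 'api 0 100 200' | rank_for_eviction
```

With the sample input in the comment, the order is cache, batch, web, api, db: the three pods over their request go first (lowest Priority, then largest excess), and the pods at or under their request follow by Priority.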
Once a node condition transitions to True, the kubelet holds it True for --eviction-pressure-transition-period, which defaults to 5 minutes, even after the signal recovers. This prevents rapid oscillation near the threshold.
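For reference, the threshold and timing settings discussed so far map to fields in the kubelet configuration file. The snippet below writes an illustrative fragment to a scratch file so you can compare it with your own config. The file path and the soft-threshold values are placeholders chosen for this example, not recommendations; the hard thresholds repeat the defaults listed earlier.

```shell
# Write an illustrative KubeletConfiguration fragment to a scratch file.
# The path and the soft-eviction values are placeholders, not recommendations.
kubelet_cfg="${TMPDIR:-/tmp}/kubelet-eviction-example.yaml"
cat > "$kubelet_cfg" <<'EOF'
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
evictionHard:
  memory.available: "100Mi"
  nodefs.available: "10%"
  imagefs.available: "15%"
  nodefs.inodesFree: "5%"
  imagefs.inodesFree: "5%"
evictionSoft:
  memory.available: "300Mi"
evictionSoftGracePeriod:
  memory.available: "1m30s"
evictionMaxPodGracePeriod: 60
evictionPressureTransitionPeriod: "5m"
EOF
echo "wrote $kubelet_cfg"
```

Every signal listed under evictionSoft needs a matching entry under evictionSoftGracePeriod. Listing all hard thresholds explicitly avoids surprises on versions where setting one signal replaces the defaults for the others.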
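Evicted pods remain in the API with phase Failed and reason Evicted. They are not deleted automatically when pressure clears. They stay until you delete them or until the pod garbage collector in kube-controller-manager removes them, which happens only once the number of terminated pods exceeds --terminated-pod-gc-threshold (12500 by default).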
flowchart TD
A[Node pressure rises] --> B{Hard threshold crossed?}
B -->|Yes| C[Immediate pod eviction]
B -->|No| D{Soft threshold crossed?}
D -->|Yes| E[Wait for grace period]
E -->|Still breached| C
D -->|No| F[Continue monitoring]
C --> G[Select pod by usage vs. request and Priority]
G --> H[Pod phase: Failed/Evicted]
Common causes
| Cause | What it looks like | First thing to check |
|---|---|---|
| Memory pressure | MemoryPressure=True, OOM kills, memory.available < 100Mi | kubectl describe node and free -m on the node |
| Disk pressure (nodefs) | DiskPressure=True, image pull failures, large container logs or emptyDir volumes | df -h on /var/lib/kubelet and container log sizes |
| Inode exhaustion | DiskPressure=True but disk space is available | df -i on the nodefs mount |
| Imagefs pressure | Container runtime cannot pull or extract images | df -h on the container runtime storage path |
| PID pressure | PIDPressure=True, fork failures in containers | Current PID count versus /proc/sys/kernel/pid_max |
| emptyDir without sizeLimit | Scratch volumes filling nodefs silently | Pod spec for emptyDir and kubelet nodefs metrics |
Quick checks
# List evicted pods in all namespaces
kubectl get pods --all-namespaces --field-selector=status.phase=Failed -o json | \
jq -r '.items[] | select(.status.reason=="Evicted") | "\(.metadata.namespace)/\(.metadata.name)"'
# Check node pressure conditions
kubectl get nodes -o json | \
jq '.items[] | {name: .metadata.name, conditions: [.status.conditions[] | select(.type | test("Pressure$|^Ready$"))]}'
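# Check kubelet eviction metrics from the node (port 10250 normally requires authentication;
# kubectl get --raw "/api/v1/nodes/<node>/proxy/metrics" is an alternative from outside the node)
curl -sk https://localhost:10250/metrics | grep kubelet_evictions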
# Check disk space on nodefs and imagefs
df -h /var/lib/kubelet
df -h /var/lib/containerd
# Check inode usage
df -i /var/lib/kubelet
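# Check PID usage on the node (the limit applies to threads as well as processes)
cat /proc/sys/kernel/pid_max
ls -d /proc/[0-9]* 2>/dev/null | wc -l   # processes
ps -eL --no-headers | wc -l              # threads (tasks)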
# Check for emptyDir volumes without sizeLimit
kubectl get pods --all-namespaces -o json | \
jq -r '.items[] | select(any(.spec.volumes[]?; .emptyDir != null and .emptyDir.sizeLimit == null)) | "\(.metadata.namespace)/\(.metadata.name)"'
How to diagnose it
1. Confirm the event is a node-pressure eviction. Check the pod status: status.phase should be Failed and status.reason should be Evicted. The status message typically names the starved resource, for example: The node was low on resource: memory.
2. Identify the node and its conditions. Run kubectl describe node <node>. Look for MemoryPressure, DiskPressure, or PIDPressure set to True. The lastTransitionTime tells you when pressure began.
3. Map the condition to the eviction signal. Check the kubelet logs on the node: journalctl -u kubelet --since "1 hour ago" | grep -i eviction. Look for eviction manager: attempting to reclaim followed by the signal, such as memory.available, nodefs.available, nodefs.inodesFree, or pid.available.
4. Inspect the actual resource on the node. SSH to the node or use a debug pod with host access. For disk, run df -h and df -i. For memory, check /proc/meminfo and free -m. For PID, count processes and threads. Cross-check whether the resource aligns with the signal.
5. Find the heaviest consumer. If memory pressure, identify which pods or system processes are using the most RSS. If disk pressure, check /var/log/pods/, the container image store, and emptyDir volumes. If inode pressure, look for directories with millions of small files, such as build caches or unrotated log fragments.
6. Check for emptyDir abuse. Disk-backed emptyDir volumes write directly to nodefs. Without a sizeLimit, large scratch directories can silently push the node across the nodefs.available threshold.
7. Check filesystem topology. If the node uses a single root filesystem, nodefs, imagefs, and container logs all reference the same device. Setting independent thresholds on each can produce unexpected behavior.
8. Clean up evicted pods. Evicted pods clutter output and consume etcd space at scale. Remove them after confirming they are not needed for debugging.

Warning: kubectl delete pods --field-selector=status.phase=Failed deletes every Failed pod in the current namespace. Target specific pods or labels when possible.
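One way to target only evicted pods during cleanup is to generate the delete commands first and review them before running anything. The sketch below assumes a hypothetical helper, evicted_delete_cmds, that reads "namespace name reason" lines and prints one kubectl delete command per pod whose reason is Evicted. It only prints commands; it does not delete anything by itself.

```shell
# evicted_delete_cmds: reads "namespace name reason" lines on stdin and prints
# a delete command for each pod whose reason is exactly "Evicted".
# Hypothetical helper; review the output before piping it to a shell.
evicted_delete_cmds() {
  awk '$3 == "Evicted" { printf "kubectl delete pod -n %s %s\n", $1, $2 }'
}

# Example against a live cluster:
# kubectl get pods --all-namespaces --field-selector=status.phase=Failed --no-headers \
#   -o custom-columns=NS:.metadata.namespace,NAME:.metadata.name,REASON:.status.reason |
#   evicted_delete_cmds
```

Failed pods with any other reason, such as failed Jobs you still want to inspect, are left out of the generated list.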
Metrics and signals to monitor
| Signal | Why it matters | Warning sign |
|---|---|---|
| kubelet_evictions_total | Direct count of pods evicted by signal | Any sustained increase |
| Node MemoryPressure condition | Precedes memory eviction and OOM kills | True for more than 1 minute |
| Node DiskPressure condition | Prevents scheduling and triggers eviction | True for more than 1 minute |
| Node PIDPressure condition | Indicates the node cannot create new processes | True for more than 1 minute |
| nodefs.available percentage | Disk headroom for logs, emptyDir, and kubelet state | Below 15% |
| nodefs.inodesFree percentage | Inode headroom; exhaustion blocks file creation even with space | Below 10% |
| imagefs.available percentage | Container runtime storage headroom | Below 15% |
| pid.available | Remaining process slots before kubelet eviction | Sustained decrease or approaching kubelet threshold |
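If you scrape the kubelet metrics endpoint by hand, a small filter can break the eviction counter down by signal. The sketch below is an assumption-laden example: evictions_by_signal is a hypothetical helper, it expects Prometheus text format with an eviction_signal label, and it accepts the counter name with or without the _total suffix because the exposed name can differ between setups.

```shell
# evictions_by_signal: reads Prometheus text-format metrics on stdin and prints
# "<eviction_signal> <count>" for each eviction counter sample it finds.
# Assumes a label of the form eviction_signal="...".
evictions_by_signal() {
  awk '/^kubelet_evictions(_total)?[{]/ {
    if (match($0, /eviction_signal="[^"]*"/)) {
      # Skip the 17 characters of eviction_signal=" and drop the closing quote.
      sig = substr($0, RSTART + 17, RLENGTH - 18)
      print sig, $NF
    }
  }'
}

# Example on a node, given credentials that the kubelet accepts:
# kubectl get --raw "/api/v1/nodes/<node>/proxy/metrics" | evictions_by_signal
```

A non-zero count for one signal, for example nodefs.available, tells you which of the fixes below to start with.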
Fixes
If the cause is memory pressure
Identify the pod or system process consuming the most memory. If a workload is unbounded, add or lower memory limits and requests. If limits are already set but OOM kills continue, the working set has grown; increase the limit or shard the workload. Move memory-bound workloads to larger nodes or add node capacity. If the kubelet itself is consuming excessive memory, check for known memory leaks in your Kubernetes version.
If the cause is disk or inode pressure
Rotate or truncate container logs. Clean orphaned container data and unused images. Set sizeLimit on every emptyDir volume, or use medium: Memory for temporary files (these volumes are backed by tmpfs and count against the container's memory limit, so they trade disk usage for memory usage). If nodefs and imagefs share a single filesystem, freeing image cache space also helps nodefs. Expand the underlying volume or migrate the runtime store to a dedicated device if they contend. For inode exhaustion, delete directories containing many small files, such as build artifact caches.
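To locate the directories behind inode exhaustion, counting files per subdirectory is usually enough to find the outlier. The function below is a minimal sketch with a hypothetical name, files_per_subdir. It counts regular files only, so directories and symlinks, which also consume inodes, are not included in the totals.

```shell
# files_per_subdir DIR
# Prints "<regular file count> <subdirectory>" for each immediate subdirectory
# of DIR, largest first. -xdev keeps find on a single filesystem.
files_per_subdir() {
  for d in "$1"/*/; do
    [ -d "$d" ] || continue
    printf '%s %s\n' "$(find "$d" -xdev -type f | wc -l | tr -d ' ')" "${d%/}"
  done | sort -k1,1nr
}

# Example on a node (run as root; may take a while on large trees):
# files_per_subdir /var/lib/kubelet/pods | head -n 10
```

Run it against the mount that df -i reports as nearly full, then descend into the largest subdirectory and repeat.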
If the cause is PID pressure
Identify pods spawning excessive threads or processes. Set --pod-max-pids on the kubelet to contain per-pod fork bombs. Increase pid_max at the OS level if the default is too low for the workload. Check for zombie processes accumulating on the node; ps aux | awk '$8 ~ /^Z/' lists zombie processes, whose parent processes may need intervention.
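To judge how close a node is to PID exhaustion, it helps to express the current task count as a percentage of pid_max. The helper below, pid_usage_pct, is a hypothetical name used for this sketch. It only does the arithmetic, and the kubelet's own accounting may differ slightly from what ps reports.

```shell
# pid_usage_pct USED MAX
# Prints USED as a percentage of MAX with one decimal place.
pid_usage_pct() {
  awk -v used="$1" -v max="$2" 'BEGIN { printf "%.1f\n", used / max * 100 }'
}

# Example on a node (ps -eL lists one line per thread):
# pid_usage_pct "$(ps -eL --no-headers | wc -l)" "$(cat /proc/sys/kernel/pid_max)"
```

For instance, 8192 tasks against a pid_max of 32768 prints 25.0.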
If the cause is imagefs pressure
Trigger image garbage collection by temporarily lowering the kubelet's imageGCHighThresholdPercent, or manually prune unused images with crictl rmi --prune. Ensure the container runtime storage path is sized for the number of distinct images in your workload.
If static pods are being evicted
Static pods without a high PriorityClass are evicted alongside other workloads. Assign system-node-critical or system-cluster-critical to static pods that must survive pressure events. The kubelet evicts by Priority, so critical pods are removed last.
Prevention
- Set accurate memory requests and limits so the scheduler does not overcommit nodes with Burstable or BestEffort pods that can grow unbounded.
- Configure sizeLimit on every emptyDir volume. The kubelet counts emptyDir usage toward nodefs consumption.
- Implement log rotation for all containers. Unrotated stdout logs in /var/log/pods/ are a common source of nodefs pressure.
- Alert on node pressure conditions before they reach eviction thresholds.
- Monitor inode usage separately from disk space. Small files from caches, sessions, or logs can exhaust inodes while gigabytes remain free.
- Tune kubelet eviction thresholds if the defaults are too aggressive for your node size and workload. Soft eviction gives a grace period; hard eviction is immediate.
- Use Guaranteed QoS and appropriate Priority classes for workloads that must survive pressure events.
- Watch for PID accumulation from applications with large thread pools or frequent short-lived processes.
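Several of these practices can be combined in a single pod spec. The snippet below writes an example manifest to a scratch file: requests equal limits for CPU and memory (which yields Guaranteed QoS), and the emptyDir volume carries a sizeLimit. The pod name, image, file path, and all sizes are placeholders invented for this sketch.

```shell
# Write an illustrative pod manifest to a scratch file.
# Names, image, and sizes are placeholders; adjust them for a real workload.
pod_manifest="${TMPDIR:-/tmp}/scratch-pod-example.yaml"
cat > "$pod_manifest" <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: scratch-example
spec:
  containers:
    - name: app
      image: registry.example.com/app:1.0
      resources:
        requests:
          cpu: "500m"
          memory: "256Mi"
        limits:
          cpu: "500m"
          memory: "256Mi"
      volumeMounts:
        - name: scratch
          mountPath: /scratch
  volumes:
    - name: scratch
      emptyDir:
        sizeLimit: 1Gi
EOF
echo "wrote $pod_manifest"
```

You can check the file without creating anything by running kubectl apply --dry-run=client -f "$pod_manifest".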
How Netdata helps
- Correlates kubelet_evictions_total with per-node memory, disk, inode, and PID charts to reveal which signal triggered the eviction.
- Surfaces node pressure conditions alongside pod-level resource usage to identify noisy neighbors before eviction.
- Tracks inode utilization separately from disk space, catching exhaustion that leaves capacity free.
- Long-term trends on container runtime disk usage and image cache growth support capacity planning.






