Kubernetes kubelet goroutine leaks: detection and bisection

A kubelet goroutine leak is a slow-burn failure. The process stays up, the node often remains Ready for hours, and then PLEG timeouts start, sync loops lag, and the kubelet is eventually OOM-killed or unresponsive. By the time the node flips to NotReady, the original leak signature is usually obscured by secondary symptoms.

What this means

The kubelet spawns goroutines for pod workers, probes, API server watches, volume operations, PLEG relists, and CRI calls. In a healthy node, goroutine count correlates with pod density and returns to a stable floor when churn stops. A leak means goroutines are created but never exit, causing three cascading effects:

Memory growth. Each goroutine consumes stack memory. At scale the aggregate RSS climb is material, and if the kubelet lacks headroom the kernel OOM killer terminates it.
Scheduler pressure. Tens of thousands of leaked goroutines increase Go runtime overhead. The sync loop and PLEG slow down, pushing the node toward NotReady.
Stalled subsystems. A leak tied to pod workers, probes, or watches can exhaust internal pools and block new operations even though the process is still running.
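To put a rough number on the memory effect, the arithmetic below estimates a lower bound on the stack memory held by leaked goroutines. It assumes 2 KiB per goroutine, which is the Go runtime's minimum stack size; real stacks grow with call depth, and blocked goroutines also pin heap objects, so observed RSS growth is usually higher. The function name is hypothetical.

```shell
# goroutine_stack_mib COUNT [STACK_KIB]
# Lower-bound stack memory for COUNT goroutines, in whole MiB.
# Assumption: 2 KiB per goroutine unless a larger per-goroutine size is given.
goroutine_stack_mib() {
  count=$1
  stack_kib=${2:-2}
  echo $(( count * stack_kib / 1024 ))
}

goroutine_stack_mib 50000      # minimum 2 KiB stacks
goroutine_stack_mib 50000 8    # stacks that have grown to 8 KiB
```

Treat the result as a floor for sizing headroom, not as a prediction of when the OOM killer will fire.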

Common causes

Cause	What it looks like	First thing to check
Stuck pod worker goroutines	Goroutine count grows with pod churn; containers stuck in Terminating or Unknown	`kubectl get pods -A --field-selector spec.nodeName=<node>` for stuck workloads
Probe goroutine accumulation	High density of exec or HTTP probes with short intervals; `prober` stacks dominate pprof	Probe configuration and `prober_probe_total` rate
Watch or informer leaks	Goroutines in `k8s.io/client-go/tools/cache` or transport layers after API server blips	API server connectivity and `rest_client_requests_total` errors
Cadvisor housekeeping accumulation	Goroutines in `housekeeping` paths correlated with transient container failures	Container failure events and `kubelet_pleg_relist_duration_seconds`
CRI streaming / SPDY leaks	Stacks in `spdystream` or `wsstream` after heavy `kubectl exec` or `kubectl logs` usage	Recent interactive sessions and kubectl/kubelet version skew

Quick checks

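Run these on the node. The read-only port (10255) serves /metrics but not /debug/pprof, and many distributions disable it entirely. For profiles, use the authenticated port (10250 by default) or kubectl get --raw through the API server node proxy.

# Check current goroutine count from kubelet metrics (the Go runtime exports this gauge as go_goroutines)
curl -s http://localhost:10255/metrics | grep -E '^(go|kubelet)_goroutines'

# Get a pprof summary of goroutines, grouped by identical stack
# $TOKEN is a placeholder for a bearer token authorized for nodes/proxy
curl -sk -H "Authorization: Bearer $TOKEN" "https://localhost:10250/debug/pprof/goroutine?debug=1" | head -100

# Dump full goroutine stacks for offline analysis; debug=2 lists every goroutine with its state and wait time
curl -sk -H "Authorization: Bearer $TOKEN" "https://localhost:10250/debug/pprof/goroutine?debug=2" > /tmp/kubelet-goroutines.txt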

# Alternative: fetch pprof via the Kubernetes API node proxy
kubectl get --raw "/api/v1/nodes/<nodename>/proxy/debug/pprof/goroutine?debug=2"

# Count exited containers that may be triggering housekeeping leaks (-q prints IDs only, so the header row is not counted)
crictl ps -a -q --state exited | wc -l

# Check probe load on the node
curl -s http://localhost:10255/metrics | grep prober_probe_total

# Check API client errors from kubelet to the API server
curl -s http://localhost:10255/metrics | grep rest_client_requests_total | grep -E 'code="(4|5)'

# Check kubelet resident memory (-x matches the exact process name, -o picks the oldest match so only one PID is returned)
grep VmRSS "/proc/$(pgrep -o -x kubelet)/status"

How to diagnose it

Establish a baseline. Record kubelet_goroutines at the current pod count. On a medium-density node, a floor of 100-500 goroutines is normal. If the count climbs while pod count is flat, or exceeds 1,000 without matching workload growth, treat it as a leak.
Capture a goroutine profile. Pull /debug/pprof/goroutine?debug=2 from the kubelet. The dump lists every goroutine individually with its state and how long it has been blocked; identical stacks are not collapsed, so count them yourself or cross-check with debug=1, which groups identical stacks and prints a count for each. Look for hundreds of identical stacks in prober, watch, pod_worker, or housekeeping. If the same function accounts for a large fraction of total goroutines, you have isolated the leak family. Save this file before restarting anything.
Correlate the dominant stack with recent events. A stack rooted in spdystream suggests a kubectl exec or kubectl port-forward session was not torn down cleanly. A stack in housekeeping usually follows a burst of container failures. Stacks in client-go cache or transport code point to watch reconnections after API server disruptions. Match the leak onset to deployments, control plane restarts, or interactive sessions. For example, a sudden jump after a control plane upgrade points to a watch leak, while a steady climb during a batch job with frequent exec probes points to prober accumulation.
Check for upstream bugs. Several cadvisor housekeeping and SPDY stream leaks have been documented and patched. Search the Kubernetes issue tracker for your kubelet version and the dominant stack signature. If you find a matching issue, confirm the fix version in the changelog before scheduling an upgrade. If the stack is novel, you are looking at a candidate for an upstream bug report.
Validate the fix or workaround. Restarting the kubelet clears leaked goroutines immediately, but the leak will recur if the trigger persists. Warning: restarting kubelet is disruptive; the node may briefly flip to NotReady and pods may be rescheduled.
After restart, watch kubelet_goroutines under the same workload pattern. If the count stabilizes, the trigger was external, such as a stuck pod or a brief API server partition. If it climbs again with the same stack signature, the trigger is intrinsic and requires a config change or an upstream fix.
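The profile analysis and the before-and-after comparison described above come down to counting goroutines per family and diffing two snapshots. The sketch below does both with awk. It assumes the dumps were saved with debug=2 and that grouping by the "created by" line (the function that spawned each goroutine) is a good enough proxy for the stack family. The function names are hypothetical.

```shell
# leak_families FILE [MIN_MINUTES]
# Count goroutines in a debug=2 dump by the function that created them.
# With MIN_MINUTES, count only goroutines blocked for at least that long.
leak_families() {
  awk -v min="${2:-0}" '
    /^goroutine [0-9]+ / {
      # Header such as "goroutine 42 [select, 17 minutes]:"; the wait time is optional.
      mins = 0
      if (match($0, /, [0-9]+ minutes/)) mins = substr($0, RSTART + 2, RLENGTH - 2) + 0
      next
    }
    /^created by / && mins >= min {
      sub(/^created by /, "")
      sub(/ in goroutine [0-9]+$/, "")   # newer Go releases append the parent goroutine ID
      counts[$0]++
    }
    END { for (f in counts) printf "%d %s\n", counts[f], f }
  ' "$1" | sort -rn
}

# family_growth BEFORE AFTER
# Print families whose goroutine count grew between two dumps, largest growth first.
family_growth() {
  awk '
    /^created by / {
      sub(/^created by /, "")
      sub(/ in goroutine [0-9]+$/, "")
      if (FILENAME == ARGV[1]) { before[$0]++ } else { after[$0]++ }
    }
    END {
      for (f in after) {
        delta = after[f] - before[f]
        if (delta > 0) printf "%d %s\n", delta, f
      }
    }
  ' "$1" "$2" | sort -rn
}
```

Running leak_families on a single dump with a threshold such as 10 surfaces long-blocked families. Running family_growth on a dump taken before a restart and one taken some hours after shows whether the same signature is climbing again.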

flowchart TD
    A[Goroutine count above baseline] --> B{Is pod count stable?}
    B -->|No| C[Expected scaling; monitor rate]
    B -->|Yes| D[Capture pprof debug=2]
    D --> E{Dominant stack?}
    E -->|prober| F[Reduce exec probes and interval]
    E -->|watch or cache| G[Check API server connectivity]
    E -->|housekeeping| H[Fix container crash loop]
    E -->|spdystream or wsstream| I[Check kubectl exec usage and version skew]
    E -->|pod_worker| J[Find stuck pods and orphaned containers]
    F --> K[Restart kubelet and verify count stabilizes]
    G --> K
    H --> K
    I --> K
    J --> K
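
The decision tree above can also be written as a small lookup for use in a triage script. This is a sketch: the match patterns are assumptions about how each family shows up in a stack or "created by" line, and next_step is a hypothetical name.

```shell
# next_step SIGNATURE
# Map a dominant stack signature to the first action from the flowchart.
next_step() {
  case "$1" in
    *spdystream*|*wsstream*)          echo "Check kubectl exec usage and version skew" ;;
    *housekeeping*)                   echo "Fix container crash loop" ;;
    *prober*)                         echo "Reduce exec probes and interval" ;;
    *pod_worker*|*podWorker*)         echo "Find stuck pods and orphaned containers" ;;
    *client-go/tools/cache*|*watch*)  echo "Check API server connectivity" ;;
    *)                                echo "No known family; save the dump and check upstream issues" ;;
  esac
}
```

Every matched branch still ends at the same place as the flowchart: restart the kubelet once the trigger is addressed and verify that the count stabilizes.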

Metrics and signals to monitor

Signal	Why it matters	Warning sign
`kubelet_goroutines`	Direct indicator of internal concurrency and leaks	Sustained growth of more than 100 over baseline, or a total above 500 on medium-density nodes
`kubelet_pleg_relist_duration_seconds`	Relist latency rises when leaked goroutines slow kubelet internals; a slow container runtime raises it too, so check both	p99 relist above 10 seconds and climbing
`kubelet_sync_loop_duration_seconds`	Leaked goroutines compete for scheduler time and can stall reconciliation	Sustained duration above 30 seconds
`prober_probe_total`	High probe density creates goroutine pressure that leaks amplify	Probe rate more than 2× baseline without corresponding pod growth
`rest_client_requests_total`	Errors and watch churn spawn orphaned goroutines	Sustained 4xx or 5xx responses from kubelet to the API server
`process_resident_memory_bytes` (kubelet)	Goroutines consume stack memory; RSS growth confirms material impact	RSS climbing in lockstep with goroutine count
Container restart count on node	Crash loops trigger housekeeping goroutine accumulation	More than 10 restarts per minute for the same container
`kubelet_runtime_operations_errors_total`	CRI errors can strand goroutines waiting on runtime responses	Sustained increase in container create or start errors
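
To track goroutine count and RSS together, a snapshot helper can pull both gauges out of the Prometheus text exposition and apply the goroutine warning rule from the table. This is a sketch with hypothetical function names. It accepts the goroutine gauge under either kubelet_goroutines or go_goroutines (the Go runtime collector's name), and it checks a single sample, so judging whether growth is sustained is left to the caller.

```shell
# metrics_snapshot: read Prometheus text on stdin, print goroutine count and RSS in MiB.
metrics_snapshot() {
  awk '
    $1 == "go_goroutines" || $1 == "kubelet_goroutines" { g = $2 + 0 }
    $1 == "process_resident_memory_bytes"               { rss = $2 / 1048576 }
    END { printf "goroutines=%d rss_mib=%.0f\n", g, rss }
  '
}

# goroutine_warning BASELINE CURRENT
# Table rule: more than 100 over baseline, or above 500 in total.
goroutine_warning() {
  if [ "$2" -gt $(( $1 + 100 )) ] || [ "$2" -gt 500 ]; then
    echo "warn"
  else
    echo "ok"
  fi
}

# Usage on a node (port and scheme depend on the kubelet configuration):
#   curl -s http://localhost:10255/metrics | metrics_snapshot
```

Logging the snapshot line at a fixed interval gives a simple record for checking whether RSS and goroutine count rise together.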

Fixes

If the cause is stuck pod workers

Look for pods stuck in Terminating or containers in an unknown state. Use crictl ps -a to identify orphaned containers, then run crictl inspect on those containers and crictl pods for their sandboxes to verify whether the runtime still holds them. If the runtime has already released the container but the API server still shows the pod, force-delete the pod object only as a last resort. Warning: force-deleting a pod while the kubelet is active can strand volumes and containers.

If the kubelet sync loop is blocked on the stuck worker, a kubelet restart is the fastest recovery, but capture the pprof dump first.

If the cause is probe overload

Reduce probe frequency and avoid sub-5-second intervals. Replace exec probes with httpGet where possible; each exec probe starts a process inside the container through the container runtime, which costs far more than an HTTP request. Increase initialDelaySeconds to prevent a startup storm of probe goroutines.

If the cause is watch or informer leaks

Stabilize API server connectivity first. Watches that reconnect in a loop spawn goroutines that may not be reclaimed if the connection is torn down uncleanly. Restart kubelet to clear the backlog. Warning: restarting kubelet is disruptive and may cause brief node NotReady status.

If the leak reproduces while the API server is healthy, look in the Kubernetes changelog for a client-go informer or watch fix that matches your stack signature, and upgrade kubelet to a release that includes it.

If the cause is cadvisor or housekeeping

This pattern is tied to transient container failures. Fix the underlying application crash loop, then restart kubelet to clear the accumulated housekeeping goroutines. There is no runtime command to flush them individually.

If the cause is CRI streaming leaks

Limit automated kubectl exec and kubectl logs usage. Heavy interactive streaming can strand SPDY or websocket goroutines if the client or server version is outside the supported skew window. Keep kubectl within one minor version of the API server, and keep the kubelet within the skew that your control plane release supports.
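
A quick way to spot skew in a script is to compare minor versions directly. The helper below is a sketch with a hypothetical name. It assumes version strings shaped like v1.29.3, which is how kubectl version and a node's kubeletVersion status field report them.

```shell
# minor_skew A B
# Absolute difference between the minor versions of A and B (for example v1.29.3 and v1.28.1).
minor_skew() {
  a=${1#v}; a=${a#*.}; a=${a%%.*}   # v1.29.3 -> 1.29.3 -> 29.3 -> 29
  b=${2#v}; b=${b#*.}; b=${b%%.*}
  d=$(( a - b ))
  echo "${d#-}"                     # drop the sign
}

# Usage against a live node (node name and client version are placeholders):
#   minor_skew "$(kubectl get node <node> -o jsonpath='{.status.nodeInfo.kubeletVersion}')" v1.30.1
```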

When to file an upstream bug

File a Kubernetes issue when the goroutine dump shows clear accumulation in kubelet-internal code, the leak reproduces on the latest supported patch release, and it is not explained by a stuck workload or a known CVE. Attach the pprof debug=2 output, the kubelet version, the container runtime version, and the trigger event.

Prevention

Baseline kubelet_goroutines per node density and alert on the rate of change rather than a static threshold.
Cap probe density. Avoid exec probes with short periods and limit the total number of active probes per node.
Monitor container restart rates. Crash loops are a leading trigger for housekeeping goroutine accumulation.
Keep kubelet, kubectl, and the control plane within supported version skew.
Scrape kubelet metrics continuously so a gradual 10 percent daily growth in goroutines surfaces before it becomes an outage. Review probe budgets during application deploys; a spike in prober_probe_total often precedes a leak by hours.
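
To make the rate-of-change idea concrete, the sketch below turns two goroutine samples into a simple, non-compounded daily growth percentage that can be compared with a threshold such as the 10 percent figure above. The function name and the linear extrapolation are assumptions; an alerting system would normally compute this from the full time series.

```shell
# daily_growth_pct FIRST LAST HOURS
# Linear growth between two samples taken HOURS apart, scaled to percent per 24 hours.
daily_growth_pct() {
  awk -v first="$1" -v last="$2" -v hours="$3" 'BEGIN {
    if (first <= 0 || hours <= 0) { print "n/a"; exit }
    printf "%.1f\n", (last - first) / first * 100 * 24 / hours
  }'
}

daily_growth_pct 1000 1050 12   # 5 percent in 12 hours
```

Comparing this figure across nodes with similar pod density helps separate a node-local trigger from a fleet-wide regression.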

How Netdata helps

Netdata collects kubelet_goroutines, kubelet_pleg_relist_duration_seconds, kubelet memory, and container restart counts, which places leak growth, PLEG latency, and RSS pressure on one timeline. Anomaly detection on goroutine count catches gradual leaks that static thresholds miss, and the Kubernetes node view lets you compare counts across the fleet to isolate node-local triggers from cluster-wide regressions.
