NAT and session-table exhaustion: catching it before connections fail
New connections fail while existing ones keep working. Applications report “connection refused” or timeouts. Open SSH sessions stay alive, but new SSH attempts hang. Your monitoring shows the firewall or NAT gateway is up, interfaces are healthy, and CPU is normal. The session or NAT translation table is full.
Session-table exhaustion is a cliff-edge failure. The table does not degrade gracefully: it works normally until it hits its limit, then every new connection is denied. Existing flows continue because their entries are already in the table. The symptom pattern is distinctive but easy to misdiagnose as application failure, DNS issues, or upstream provider problems, because the applications are the ones reporting errors.
The failure surfaces differently per platform but the mechanism is always the same: the translation table or port pool has run out of space. On Linux hosts running conntrack, you see “nf_conntrack: table full, dropping packet” in dmesg. On AWS NAT Gateway, the ErrorPortAllocation metric starts incrementing. On Azure NAT Gateway, FailedConnectionCount rises. On Cisco ASA, syslog event 202010 fires. On FortiGate, the log reads “NAT port is exhausted.”
What this means
A NAT or session table tracks every active connection flowing through a NAT device, firewall, or load balancer. Each entry maps a connection tuple (source IP, source port, destination IP, destination port, protocol) to its translated counterpart and consumes a small amount of memory. In Linux conntrack, each entry uses approximately 300 bytes of nonswappable kernel memory. On cloud NAT gateways and network appliances, the resource being exhausted is typically SNAT source ports rather than raw memory, but the effect is identical: new connections cannot be translated.
The critical behavior is asymmetry. Existing connections continue working because their entries are already in the table. Only new connections fail. This creates a confusing operational picture where some users report problems and others do not, depending on whether their connections were established before or after the table filled.
On Linux conntrack, there is an early-warning stage before the table is technically full. At approximately 87.5% of capacity, the kernel begins proactive eviction of existing entries, incrementing the early_drop counter. At 100%, the kernel stops inserting new flows entirely, increments insert_failed, and logs “nf_conntrack: table full, dropping packet.” Operators who monitor only the count-versus-max ratio miss the early_drop signal and first see the problem when connections start failing.
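The two stages described above can be expressed as a small classifier. This is a minimal sketch, not a kernel-accurate model: the function name conntrack_stage and its stage labels are hypothetical, and the 87.5% boundary is simply the approximate figure quoted above rather than a value read from the running kernel.

```shell
# Classify conntrack pressure from a count and a max (both integers).
# conntrack_stage is a hypothetical helper name. The 87.5% boundary is the
# approximate figure from the text, written as 875 per mille so the check
# stays in integer arithmetic.
conntrack_stage() {
    count=$1
    max=$2
    if [ "$max" -le 0 ]; then
        echo "unknown"
        return 1
    fi
    if [ "$count" -ge "$max" ]; then
        echo "full"
    elif [ $(( count * 1000 / max )) -ge 875 ]; then
        echo "early_drop"
    else
        echo "ok"
    fi
}

# Example on a live host (assumes the nf_conntrack module is loaded):
#   conntrack_stage "$(cat /proc/sys/net/netfilter/nf_conntrack_count)" \
#                   "$(cat /proc/sys/net/netfilter/nf_conntrack_max)"
```

Treating the stage as a label rather than a raw percentage makes it easier to alert on the transition into the eviction zone, which is the point at which the ratio alone stops telling the whole story.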
Common causes
| Cause | What it looks like | First thing to check |
|---|---|---|
| Session leak (application not closing connections) | Gradual growth in session count over hours or days; count never returns to baseline after traffic subsides | Compare session creation rate against teardown rate |
| DDoS or state-exhaustion attack | Sudden spike to 100%; many connections from few sources spreading across many destination ports | Inspect connection setup rate and source IP distribution |
| Undersized table for current traffic | Sustained utilization above 70% during normal operation; periodic exhaustion during predictable peak windows | Compare peak concurrent connections against configured maximum |
| Overly long timeout retaining stale sessions | Session count high but active connections low; entries persist far beyond expected session lifetime | Compare configured idle/established timeout against actual session duration |
| NAT port exhaustion (PAT-specific) | Exhaustion limited to outbound connections through port-address translation; per-IP port pool depleted | Check per-IP port utilization; AWS ErrorPortAllocation, FortiGate clash counter |
Quick checks
# Linux: check conntrack table utilization ratio
echo "count: $(cat /proc/sys/net/netfilter/nf_conntrack_count) max: $(cat /proc/sys/net/netfilter/nf_conntrack_max)"
# Linux: check conntrack statistics including early_drop and insert_failed
conntrack -S
# Linux: check kernel log for table-full events
dmesg | grep 'nf_conntrack: table full'
# Firewall: check session count via vendor CLI
ssh <fw> 'show session info'
# PAN-OS: check active session count via API
curl -sk "https://<fw>/api/?type=op&cmd=<show><session><info></info></session></show>&key=<apikey>"
# AWS: check NAT gateway for port allocation errors (last hour, 5-min buckets)
aws cloudwatch get-metric-statistics --namespace AWS/NATGateway \
--metric-name ErrorPortAllocation \
--dimensions Name=NatGatewayId,Value=<ngw-id> \
--start-time $(date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%S) \
--end-time $(date -u +%Y-%m-%dT%H:%M:%S) \
--period 300 --statistics Sum
How to diagnose it
flowchart TD
A["New connections failing"] --> B{"Existing connections OK?"}
B -- Yes --> C["Table exhaustion likely"]
B -- No --> D["Check device outage or partition"]
C --> E{"Platform?"}
E --> F["Linux conntrack: conntrack -S"]
E --> G["AWS NAT: ErrorPortAllocation"]
E --> H["Azure NAT: FailedConnectionCount"]
E --> I["FortiGate: clash counter"]
E --> J["Cisco ASA: syslog 202010"]
F --> K["Growth pattern?"]
G --> K
H --> K
I --> K
J --> K
K --> L["Gradual: session leak"]
K --> M["Sudden: DDoS or failover"]
K --> N["Chronic: undersized table"]
1. Confirm the symptom pattern. Verify that new connections fail while existing ones work. Try establishing a new TCP connection to a destination that should be reachable, then check whether an already-open connection to the same destination is still active. This asymmetry distinguishes table exhaustion from a network partition or device outage.
2. Check session or NAT table utilization. Use the platform-specific command from the Quick checks section. Compare current count against the configured or default maximum. On Linux conntrack, compute nf_conntrack_count / nf_conntrack_max. On cloud NAT gateways, check the SNAT port allocation metrics. On firewalls, check show session info or the PAN-OS API.
3. Identify the growth pattern. A gradual upward trend that never returns to baseline suggests a session leak. A sudden spike suggests a DDoS or state-exhaustion attack. Sustained high utilization during peak hours with recovery during off-peak suggests an undersized table.
4. Check the early-warning counters. On Linux conntrack, inspect early_drop and insert_failed from conntrack -S or /proc/net/stat/nf_conntrack. Rising early_drop means the kernel is already evicting entries proactively. Nonzero insert_failed means the table was full and connections were denied. On FortiGate, check the clash counter. On Cisco ASA, search syslog for event 202010.
5. Review timeout configuration. Compare the configured idle timeout and established timeout against the actual session lifetime distribution. On Linux conntrack, check nf_conntrack_tcp_timeout_established (default 432000 seconds, or 5 days) and nf_conntrack_udp_timeout (default 30 seconds). On Cisco ASA, check timeout conn (default 1 hour). A misconfigured long timeout causes sessions to linger long after the application has stopped using them.
6. Correlate with connection setup rate. A sudden increase in new connection rate without a corresponding increase in teardown rate will fill any table, regardless of size. Check whether a new application deployment, a misconfigured client with aggressive retry logic, or a scanning tool is driving the spike.
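For the early-warning counters, conntrack -S reports values per CPU, so the numbers usually need to be summed before they are compared or graphed. The sketch below assumes the output is one line per CPU made of key=value tokens, which is how recent conntrack-tools releases print it; the helper name sum_conntrack_stat is hypothetical. It does not parse /proc/net/stat/nf_conntrack, which uses a different, column-based layout.

```shell
# Sum one counter across all per-CPU lines of `conntrack -S` style output.
# Reads key=value tokens from stdin, so it does not depend on field order
# or on which extra counters a given conntrack-tools version prints.
# sum_conntrack_stat is a hypothetical helper name.
sum_conntrack_stat() {
    awk -v key="$1" '
        {
            for (i = 1; i <= NF; i++) {
                split($i, kv, "=")
                if (kv[1] == key) total += kv[2]
            }
        }
        END { print total + 0 }
    '
}

# Example on a live host (requires conntrack-tools and root):
#   conntrack -S | sum_conntrack_stat early_drop
#   conntrack -S | sum_conntrack_stat insert_failed
```

Because these are cumulative counters, it is the change between two samples that matters, not the absolute value.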
Metrics and signals to monitor
| Signal | Why it matters | Warning sign |
|---|---|---|
| Session/NAT table utilization percentage | Primary leading indicator before the cliff | Sustained above 70%, or trending upward faster than baseline |
| early_drop counter (Linux conntrack) | Kernel has begun proactive eviction; table is at 87.5%+ | Any nonzero value, especially if incrementing |
| insert_failed counter (Linux conntrack) | Table was full and a connection was denied | Any nonzero value means connections were lost |
| ErrorPortAllocation (AWS NAT Gateway) | SNAT ports exhausted on the gateway | Any nonzero value; alarm on greater than 0 for 3 consecutive 5-minute periods |
| FailedConnectionCount (Azure NAT Gateway) | SNAT port exhaustion or connection failure | Any nonzero value |
| clash counter (FortiGate) | No source ports remain for new sessions | Counter incrementing; may require log scraping or REST API polling on some firmware versions |
| Connection setup rate | Sustained high rate fills the table faster than teardown drains it | Creation rate exceeding teardown rate sustained over minutes |
| Firewall CPU and memory | High session count increases per-packet lookup cost | CPU rising in correlation with session count growth |
Fixes
Increase the table size or NAT pool
On Linux conntrack, raise nf_conntrack_max via sysctl. For a host observing 50,000 peak concurrent flows, set max to at least 200,000 (4x headroom) and buckets to one-quarter of max (approximately 50,000). The bucket count (nf_conntrack_buckets) determines hash table depth. If buckets are too small relative to entry count, each bucket chain grows long, per-chain lock contention increases, and search_restart in /proc/net/stat/nf_conntrack rises. CPU softirq spikes before the table is full.
# Increase conntrack max (runtime, non-disruptive). Adjust value to your workload.
sysctl -w net.netfilter.nf_conntrack_max=524288
Runtime sysctl changes do not persist across reboots. Add them to /etc/sysctl.d/ or equivalent to survive restarts.
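The sizing rule of thumb above (4x the observed peak for max, one-quarter of max for buckets) is simple enough to script. The following is a sketch under those assumptions only; conntrack_sizing is a hypothetical helper, and on some kernels nf_conntrack_buckets is read-only at runtime and has to be set through the nf_conntrack module's hashsize parameter instead.

```shell
# Print suggested conntrack sizing for an observed peak of concurrent flows.
# conntrack_sizing is a hypothetical helper name. The 4x headroom and the
# max/4 bucket ratio are the rules of thumb from the text, not kernel limits.
conntrack_sizing() {
    peak=$1
    max=$(( peak * 4 ))
    buckets=$(( max / 4 ))
    echo "net.netfilter.nf_conntrack_max=$max"
    echo "net.netfilter.nf_conntrack_buckets=$buckets"
}

# Example: the 50,000-flow host from the text.
#   conntrack_sizing 50000
```

The output uses key=value lines so it can be reviewed and then placed in a file under /etc/sysctl.d/ if the values suit your workload.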
On AWS NAT Gateway, each IP address supports up to 55,000 simultaneous connections to each unique destination, and a public NAT gateway accepts up to 8 IP addresses by default (1 primary plus 7 secondary); adding IPs increases the port pool. On Azure NAT Gateway, each public IP provides 64,512 SNAT ports, and up to 16 IPs are supported. On network appliances, add more IP addresses to the NAT pool or switch from multi-session PAT to per-session PAT.
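The port-pool arithmetic can be sketched as follows, using the 64,512 ports per IP and 16 IPs figure quoted above as the example input. Both function names are hypothetical, and the result is a conservative single-destination view: in practice a SNAT port can often be reused for connections to different destinations, so real capacity may be higher.

```shell
# Total SNAT ports for a pool: ports per IP times number of IPs.
# snat_pool_ports and snat_pool_used_pct are hypothetical helper names.
snat_pool_ports() {
    echo $(( $1 * $2 ))
}

# Integer percentage (rounded down) of the pool consumed by a given number
# of concurrent connections. Arguments: used, ports per IP, number of IPs.
snat_pool_used_pct() {
    used=$1
    pool=$(snat_pool_ports "$2" "$3")
    echo $(( used * 100 / pool ))
}

# Example with 64,512 ports per IP and 16 IPs:
#   snat_pool_ports 64512 16            # 1032192
#   snat_pool_used_pct 500000 64512 16  # 48
```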
Reduce timeout values
The default nf_conntrack_tcp_timeout_established of 432,000 seconds (5 days) means a single abandoned TCP connection occupies a table slot for 120 hours. Applications that open and abandon many short-lived connections, such as microservices or Kubernetes pods with aggressive health checks, can exhaust the table faster than the timeout clears it.
# Reduce established timeout (example: 1 hour instead of 5 days)
sysctl -w net.netfilter.nf_conntrack_tcp_timeout_established=3600
On Cisco ASA, reduce timeout conn from the default 1 hour if your traffic profile does not require it. On FortiGate, reduce the NAT session timeout. Tradeoff: shorter timeouts risk closing legitimate idle connections that the application expects to keep alive. Test with your specific workload before deploying.
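The effect of the established timeout on table occupancy can be estimated with Little's law: entries held is roughly the arrival rate multiplied by the time each entry stays. The sketch below assumes connections are abandoned at a constant rate without a FIN or RST, so each entry lingers for the full timeout; the rate of 2 per second is a made-up illustration and stale_entries is a hypothetical helper.

```shell
# Steady-state table entries held by abandoned connections:
#   entries = abandon rate (per second) * established timeout (seconds)
# stale_entries is a hypothetical helper name. It assumes a constant rate
# and that each abandoned entry survives for the whole timeout.
stale_entries() {
    rate_per_sec=$1
    timeout_sec=$2
    echo $(( rate_per_sec * timeout_sec ))
}

# Example with an illustrative 2 abandoned connections per second:
#   stale_entries 2 432000   # 864000 entries with the 5-day default
#   stale_entries 2 3600     # 7200 entries with a 1-hour timeout
```

Under these assumptions the same leak that would overflow a 524,288-entry table at the default timeout stays in the low thousands at one hour, which is why timeout tuning is often more effective than raising the maximum.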
Fix application connection leaks
If the growth pattern is a slow upward trend that never returns to baseline, the root cause is likely an application that opens connections without closing them. This is the most common cause of chronic session-table exhaustion in microservice environments. Identify the offending workload by inspecting flow data for long-lived connections from specific source IPs or pods. Use conntrack -L with filters on source IP and destination to correlate table entries with specific workloads.
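One way to find the offending workload is to group table entries by their original source address. The sketch below assumes the usual conntrack -L line layout, in which the first src= token on a line belongs to the original direction and the second to the reply direction; top_conntrack_sources is a hypothetical helper name.

```shell
# Count conntrack entries per original-direction source IP.
# Reads `conntrack -L` style lines from stdin and prints "count ip",
# highest count first. Only the first src= token on each line is used,
# because the second one describes the reply direction.
# top_conntrack_sources is a hypothetical helper name.
top_conntrack_sources() {
    awk '
        {
            for (i = 1; i <= NF; i++) {
                if (index($i, "src=") == 1) {
                    counts[substr($i, 5)]++
                    break
                }
            }
        }
        END { for (ip in counts) print counts[ip], ip }
    ' | sort -k1,1nr -k2,2
}

# Example on a live host (requires conntrack-tools and root):
#   conntrack -L 2>/dev/null | top_conntrack_sources | head
```

A source that dominates this list and keeps growing between runs is a reasonable first suspect for a leak; mapping the IP back to a pod or host is left to your inventory tooling.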
Implement connection rate limiting
For DDoS or state-exhaustion attacks, limit the rate of new connection creation before it reaches the NAT device. Options include upstream rate limiting, connection limits on the firewall, or per-source-IP connection caps. The goal is to shed or defer connections before they consume table entries, preserving capacity for legitimate traffic.
Prevention
Size the table to handle 4x your observed peak concurrent flows. This provides headroom for burst events, failover cascades, and traffic growth. Track the ratio over time and alarm before utilization reaches 70%.
Monitor the early-warning counters, not just the utilization ratio. On Linux conntrack, early_drop fires before insert_failed. On cloud NAT gateways, ErrorPortAllocation and FailedConnectionCount are the equivalent leading indicators. On FortiGate, monitor the clash counter.
Alert on rate of growth, not just absolute utilization. A session count growing more than 5% per minute indicates a DDoS, a failover event, or an application malfunction. Catching the slope is more actionable than waiting for the absolute threshold.
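A slope check of this kind needs only two samples of the session count and the time between them. The following is a minimal sketch using integer arithmetic and a linear (not compounded) growth estimate; the function names are hypothetical and the 5% default is the threshold suggested above.

```shell
# Percent growth per minute between two session-count samples.
# Arguments: previous count, current count, seconds between the samples.
# growth_pct_per_min and growth_alert are hypothetical helper names.
growth_pct_per_min() {
    prev=$1
    curr=$2
    interval_sec=$3
    if [ "$prev" -le 0 ] || [ "$interval_sec" -le 0 ]; then
        echo 0
        return 0
    fi
    echo $(( (curr - prev) * 100 * 60 / (prev * interval_sec) ))
}

# Succeeds (exit status 0) when growth exceeds the threshold in percent
# per minute. The threshold is the optional fourth argument, default 5.
growth_alert() {
    [ "$(growth_pct_per_min "$1" "$2" "$3")" -gt "${4:-5}" ]
}

# Example: 100000 -> 103000 sessions in 30 seconds is 6% per minute.
#   growth_alert 100000 103000 30 && echo "session count growing too fast"
```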
Review timeout configuration during capacity planning. A workload with 50,000 truly active flows keeps far more entries in the table under a 5-day timeout than under a 1-hour timeout, because stale entries accumulate until they expire. Tune timeouts to match your application’s actual connection lifetime distribution.
Track per-zone and per-namespace session counts separately. Linux conntrack limits are per network namespace, not global. Each container or pod gets its own table. On firewalls, session counts may be tracked per zone or per VSYS. Aggregate monitoring that ignores this structure misses exhaustion in one namespace or zone while others look healthy.
Distinguish TCP and UDP SNAT port inventories on cloud NAT gateways. An application exhausting TCP SNAT ports does not affect UDP SNAT availability on the same IP. Some dashboards conflate the two, masking exhaustion in one protocol. Monitor TCP and UDP port utilization independently.
How Netdata helps
- Linux conntrack monitoring. Netdata collects nf_conntrack_count and nf_conntrack_max automatically through its built-in conntrack collector, presenting utilization as a percentage with no manual sysctl polling.
- Per-namespace visibility. On hosts running containers, Netdata’s network namespace awareness helps surface conntrack pressure in individual namespaces rather than only the host aggregate.
- Correlation with system-level signals. Session-table exhaustion often correlates with rising CPU softirq time, memory pressure, and interface discard counters. Netdata’s co-located collection lets you correlate these signals on a single timeline without cross-tool lag.
- Connection-state context. Netdata’s network monitoring provides TCP connection-state breakdowns (established, time-wait, syn-sent) that help identify whether a session spike is driven by a burst of new connections or a failure to close existing ones.
- Configurable alerting. Set thresholds at 70% sustained utilization for investigation and 90% for paging, with rate-of-growth alerts to catch DDoS or failover-driven spikes before the table fills.
Related guides
- Correlating cloud VPC flow logs with on-prem NetFlow
- Collector CPU and TSDB write-queue saturation: the capacity signals
- NIC RSS misconfiguration: one CPU core silently dropping your telemetry
- BGP session Established but stale: detecting silent route loss
- ARP cache staleness: when IP-to-MAC mapping goes bad
- Asymmetric routing: why your path and latency measurements lie







