Monitoring overlay tunnels (IPsec/GRE/VXLAN): the signals that matter

Overlay tunnels share the underlay’s physical path but add their own failure modes: encapsulation overhead, separate control and data planes, and type-specific state machines. Monitoring only the tunnel interface’s link state is the fundamental trap. An interface reporting “UP” can still be forwarding into a black hole because the peer is unreachable, the SA has expired, or the underlay is dropping fragments.

Three layers of instrumentation must be checked together:

  1. The tunnel interface itself (ip -s link, LOWER_UP flag).
  2. The encapsulation and control-plane state that keeps the tunnel alive (ip xfrm state, bridge fdb, SA lifetimes).
  3. The underlay path carrying the encapsulated packets (interface errors, fragmentation counters, per-hop loss).

The three tunnel families on Linux have different state models. VXLAN is stateless UDP encapsulation with a forwarding database and a well-known fragmentation problem. IPsec (via XFRM) is stateful encryption with SAs that expire and rekey. GRE is stateless point-to-point encapsulation with no built-in liveness detection.

What makes overlay tunnels different

From the network monitoring playbook, the deployment variant that matters here is the SD-WAN overlay: “The tunnel is the unit of interest, not the interface. Underlay vs. overlay confusion is the #1 misdiagnosis.”

Three properties make overlay tunnels a distinct monitoring domain:

  • Encapsulation overhead reduces effective MTU. Over an IPv4 underlay, VXLAN adds 50 bytes per frame and GRE adds 24 bytes (without key or checksum options); IPsec ESP adds variable overhead depending on cipher and mode. If the underlay MTU does not account for this, fragmentation or silent drops follow.
  • The control plane and data plane are separate. A tunnel can be administratively up while the peer is unreachable, the SA has expired, or the underlay path is degraded. Link state alone is insufficient.
  • State machines vary by tunnel type. VXLAN and GRE are stateless on Linux; IPsec has a rich SA state machine with lifetimes, rekeying, and anti-replay windows.

flowchart TD
    L1["Layer 1: Tunnel interface<br/>ip -s link, LOWER_UP flag"]
    L2["Layer 2: Encapsulation state<br/>ip xfrm state, bridge fdb, dstport"]
    L3["Layer 3: Underlay path<br/>ifInErrors, mtr, IpReasmFails"]
    L1 --> L2
    L2 --> L3
    L3 -.->|"underlay failure propagates up"| L1
    L2 -.->|"SA expiry or FDB loss"| L1

A failure at any lower layer silently degrades the upper layers. The tunnel interface may report UP while the underlay drops fragments, or while an IPsec CHILD SA has expired and the IKE SA remains established with zero traffic flowing.
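
The overhead figures quoted earlier in this section reduce to a subtraction from the underlay MTU. The following is a minimal sketch, assuming an IPv4 underlay and the fixed overheads stated above (VXLAN 50 bytes, GRE 24 bytes); the function name overlay_mtu is hypothetical, and IPsec is deliberately rejected because its overhead depends on cipher and mode.

```shell
# Effective tunnel MTU for a given underlay MTU and tunnel type.
# Assumes an IPv4 underlay and the fixed overheads quoted in the text.
overlay_mtu() {
    underlay_mtu=$1
    tunnel_type=$2
    case "$tunnel_type" in
        vxlan) overhead=50 ;;   # outer IPv4 + UDP + VXLAN + inner Ethernet
        gre)   overhead=24 ;;   # outer IPv4 + basic GRE header
        *)
            echo "unsupported tunnel type: $tunnel_type" >&2
            return 1
            ;;
    esac
    echo $((underlay_mtu - overhead))
}
```

For example, overlay_mtu 1500 vxlan prints 1450 and overlay_mtu 1500 gre prints 1476, matching the values used later in this document.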

VXLAN signals

VXLAN encapsulates Ethernet frames in UDP. On Linux, the default UDP destination port is 8472, not the IANA-assigned 4789. This is a persistent interoperability trap in mixed-version or mixed-vendor clusters. Always set dstport 4789 explicitly and verify:

# Verify VXLAN configuration including dstport
ip -d link show vxlan0
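
The dstport check can be scripted rather than read by eye. This is a sketch that assumes the detailed link output contains a "dstport <port>" token pair, as current iproute2 versions print for VXLAN devices; the function names are hypothetical.

```shell
# Reads `ip -d link show <dev>` output on stdin and prints the VXLAN dstport.
vxlan_dstport() {
    awk '{
        for (i = 1; i < NF; i++)
            if ($i == "dstport") { print $(i + 1); exit }
    }'
}

# Exit 0 when the port is the IANA-assigned 4789, 1 otherwise.
vxlan_dstport_ok() {
    [ "$(vxlan_dstport)" = "4789" ]
}

# Usage on a live host (vxlan0 is the interface name used in this document):
#   ip -d link show vxlan0 | vxlan_dstport_ok || echo "vxlan0 is not using port 4789"
```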

Interface and counter state. Use ip -s link show vxlan0 for byte and packet counters. Counter rates matter more than absolute values.
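
Because rates matter more than totals, a small helper that turns two samples into a per-second rate is often enough for ad hoc checks. This is a sketch; counter_rate is a hypothetical name, and treating a backwards-moving counter as a reset is an assumption rather than kernel-defined behavior.

```shell
# Integer per-second rate between two samples of an increasing counter.
# Prints 0 if the counter went backwards (reset) or the interval is invalid.
counter_rate() {
    prev=$1
    cur=$2
    interval=$3
    if [ "$cur" -lt "$prev" ] || [ "$interval" -le 0 ]; then
        echo 0
        return 0
    fi
    echo $(( (cur - prev) / interval ))
}

# Usage on a live host, sampling RX bytes five seconds apart:
#   a=$(cat /sys/class/net/vxlan0/statistics/rx_bytes); sleep 5
#   b=$(cat /sys/class/net/vxlan0/statistics/rx_bytes)
#   counter_rate "$a" "$b" 5
```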

FDB entries. The VXLAN forwarding database maps remote MAC addresses to remote VTEP IPs:

# Show FDB entries for a VXLAN interface
bridge fdb show dev vxlan0

Entries with dst <IP> show remote VTEP mappings. The catch-all BUM entry appears as 00:00:00:00:00:00 dst <IP>, which floods all unknown unicast, broadcast, and multicast traffic to that remote VTEP. Linux kernel VXLAN does not implement IGMP snooping, so multicast traffic floods to all VTEPs unless specific multicast MAC entries are manually inserted into the FDB.
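
To track flood targets and FDB size over time, the FDB listing can be reduced to two numbers. The sketch below assumes each entry is one line that starts with the MAC address and carries the remote VTEP after a "dst" token; the function names are hypothetical.

```shell
# Reads `bridge fdb show dev <vxlan>` output on stdin and prints the remote
# VTEP IP of every all-zero (BUM flood) entry, one per line.
vxlan_flood_vteps() {
    awk '$1 == "00:00:00:00:00:00" {
        for (i = 2; i < NF; i++)
            if ($i == "dst") print $(i + 1)
    }'
}

# Counts entries that point at a remote VTEP (contain a "dst" token).
vxlan_remote_entries() {
    awk '{
        for (i = 2; i < NF; i++)
            if ($i == "dst") { n++; break }
    } END { print n + 0 }'
}

# Usage on a live host:
#   bridge fdb show dev vxlan0 | vxlan_flood_vteps
```

Sampling vxlan_remote_entries periodically gives the FDB entry count trend; a sudden jump is worth correlating with topology changes.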

Fragmentation. VXLAN adds 50 bytes of overhead over an IPv4 underlay. If the underlay MTU is 1500, the effective VXLAN MTU is 1450, which leaves a TCP MSS of 1410 for IPv4 traffic inside the tunnel. Inner packets larger than 1450 bytes are fragmented on the underlay or dropped. Monitor with:

# Check IP fragmentation counters
nstat -az | grep -iE 'IpFrag|IpReasm'

Relevant fields: IpFragCreates, IpReasmReqds, IpReasmFails. Rising IpReasmFails indicates fragments arriving without all pieces, which means drops on the underlay or at an intermediate device.
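
A single counter can be pulled out of the nstat listing for alerting or trend checks. This sketch assumes the usual nstat layout of counter name in the first column and absolute value in the second; nstat_value is a hypothetical helper name.

```shell
# Reads `nstat -az` output on stdin and prints the value of one counter.
# Prints 0 if the counter is not present in the input.
nstat_value() {
    awk -v name="$1" '$1 == name { v = $2 } END { print v + 0 }'
}

# Usage on a live host:
#   before=$(nstat -az | nstat_value IpReasmFails); sleep 10
#   after=$(nstat -az | nstat_value IpReasmFails)
#   [ "$after" -gt "$before" ] && echo "reassembly failures are rising"
```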

Offload state. VXLAN forwarding throughput depends on UDP tunnel offload:

# Check VXLAN offload capabilities
ethtool -k eth0 | grep udp_tnl

The rx-udp-gro-forwarding feature, which ethtool -k lists under its own name and which the udp_tnl filter above does not match, requires Linux 5.12+ and significantly improves VTEP forwarding throughput for UDP-tunneled traffic.

Kernel bug awareness. CVE-2025-39850 is a NULL pointer dereference in VXLAN {arp,neigh}_reduce() when handling FDB nexthop objects with the proxy option enabled. It triggers during arping on a VXLAN interface with nexthop-group FDB entries. CVE-2025-21790 is a related VXLAN vxlan_vnigroup_init() return-value bug.

If you use proxy mode on VXLAN with BGP EVPN or similar nexthop-group setups, verify your kernel is patched.

IPsec (XFRM) signals

Linux IPsec uses the XFRM framework. The key difference from VXLAN and GRE is that IPsec is stateful: security associations (SAs) have lifetimes, byte limits, sequence numbers, and anti-replay windows. An SA can expire silently if rekeying fails.

SA state and counters. Active SAs:

# List active IPsec SAs with byte/packet counters and lifetimes
ip -s xfrm state list

Each entry shows the SPI, protocol (ESP/AH), byte and packet counters, sequence numbers, anti-replay window, and lifetime (bytes and time remaining). If byte counters plateau near zero on an otherwise-established SA, the CHILD SA may have expired while the IKE SA remains up. No traffic flows, but the session looks healthy at the IKE layer.
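
The plateau check can be automated by extracting the current byte count per SPI. This is a sketch that assumes the iproute2 layout in which each SA has a "proto ... spi ..." line and a "lifetime current:" line followed by "<n>(bytes), <n>(packets)"; verify against your iproute2 version, since the exact layout is not guaranteed. The function names are hypothetical.

```shell
# Reads `ip -s xfrm state` output on stdin; prints "<spi> <bytes>" per SA.
xfrm_sa_bytes() {
    awk '
        $1 == "proto" {
            for (i = 2; i < NF; i++)
                if ($i == "spi") { spi = $(i + 1); sub(/\(.*/, "", spi) }
        }
        /lifetime current:/ { in_current = 1; next }
        in_current && /\(bytes\)/ {
            bytes = $1
            sub(/\(bytes\).*/, "", bytes)
            print spi, bytes
            in_current = 0
        }
    '
}

# Prints the SPIs whose current byte count is still zero.
xfrm_idle_spis() {
    xfrm_sa_bytes | awk '$2 == 0 { print $1 }'
}

# Usage on a live host (requires root):
#   ip -s xfrm state list | xfrm_idle_spis
```

A zero count on a freshly rekeyed SA is normal, so this is most useful when compared across two samples taken some time apart.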

Security Policy Database. Query the SPD with:

# List IPsec security policies
ip xfrm policy list

The SPD defines which traffic should be encrypted. Mismatches between policy and state are a common cause of “traffic not encrypted” incidents.

Real-time SA events. Stream creation, deletion, and expiration events:

# Monitor IPsec SA events in real time
ip xfrm monitor

This is the fastest way to detect SA rekeys, expirations, and peer-initiated deletions without polling.

Kernel error and drop counters. The XFRM subsystem exposes per-error counters in /proc/net/xfrm_stat:

# Check IPsec XFRM error counters
cat /proc/net/xfrm_stat

Each row is a counter name followed by its value: XfrmInError, XfrmInBufferError, XfrmInStateProtoError, XfrmInStateSeqError, XfrmOutNoStates, and others. Rising XfrmInStateProtoError indicates protocol-level transform failures such as a failed integrity check, which can mean a key mismatch or packet corruption on the underlay. XfrmInError counts input failures that match no more specific counter.
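
Most of these counters stay at zero on a healthy host, so filtering for non-zero rows is a quick first pass. A minimal sketch, assuming each row is a counter name followed by whitespace and a number; xfrm_nonzero is a hypothetical name.

```shell
# Reads /proc/net/xfrm_stat-style text on stdin ("<name> <value>" per line)
# and prints only the counters with a non-zero value.
xfrm_nonzero() {
    awk 'NF == 2 && $2 + 0 > 0 { print $1, $2 }'
}

# Usage on a live host:
#   xfrm_nonzero < /proc/net/xfrm_stat
```

The counters are cumulative since boot, so a non-zero value is a prompt to compare two samples, not proof of a current problem.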

strongSwan and VICI. strongSwan exposes SA state via the VICI socket at /var/run/charon.vici. Community Prometheus exporters translate VICI events into metrics. Typical metrics include ipsec_ike_sas, ipsec_ike_sa_state, ipsec_child_sa_bytes_in, ipsec_child_sa_bytes_out, ipsec_child_sa_packets_in, and ipsec_child_sa_packets_out. The older stroke socket interface is deprecated in favor of VICI. strongSwan’s swanctl --list-sas provides a CLI view of the same state.

Dead Peer Detection. DPD behavior varies by implementation. AWS Site-to-Site VPN sends DPD probes every 10 seconds and declares the peer dead after 3 missed responses, for a 30-second total timeout.

strongSwan uses dpdaction to control behavior on peer death: clear (delete SAs), hold (keep SAs, renegotiate when traffic returns), or restart (immediately renegotiate).

Cloud VPN gateways (AWS, OCI, Azure) send DPD probes at fixed intervals. If the on-prem peer does not respond, the cloud side tears down the tunnel. Misconfigured firewall rules that drop IKE (UDP 500) or ESP (IP protocol 50) cause silent tunnel drops.

Capturing ESP traffic. ESP packets traverse the underlay as raw IP protocol 50:

# Capture ESP traffic on the underlay interface (read-only)
tcpdump -i eth0 proto 50

Note: NAT-T encapsulates ESP in UDP 4500. Use tcpdump -i eth0 'udp port 4500' if NAT traversal is in play.

Dirty Frag mitigation interaction. The “Dirty Frag” exploit chain (CVE-2026-43284 / CVE-2026-43500) involves Linux kernel IP fragmentation.

If you run IPsec on an affected host, mitigations that disable esp4 or esp6 kernel modules may interfere with IPsec operation. Confirm module status before applying workarounds:

# Check IPsec ESP kernel module status
lsmod | grep -E '^(esp4|esp6)'

Note: on many modern kernels, ESP support is built-in rather than modular, so lsmod may show nothing even when IPsec is functioning normally.
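
When lsmod shows nothing, the kernel build configuration can tell built-in support apart from missing support. This sketch assumes a readable config file such as /boot/config-$(uname -r), whose location varies by distribution, and uses the CONFIG_INET_ESP (esp4) and CONFIG_INET6_ESP (esp6) symbols; kconfig_state is a hypothetical helper.

```shell
# Reads a kernel config on stdin and prints "builtin", "module", or "absent"
# for the given CONFIG_ symbol.
kconfig_state() {
    awk -F= -v sym="$1" '
        $1 == sym {
            state = ($2 == "y") ? "builtin" : (($2 == "m") ? "module" : "absent")
        }
        END { print (state == "" ? "absent" : state) }
    '
}

# Usage on a live host (config path is distribution-dependent):
#   kconfig_state CONFIG_INET_ESP < "/boot/config-$(uname -r)"
```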

GRE signals

GRE is the simplest of the three tunnel types on Linux: a stateless point-to-point encapsulation with no built-in control plane. This simplicity creates its own monitoring trap. GRE tunnels silently blackhole when the remote peer goes offline because there is no equivalent to Cisco-style GRE keepalives in the native Linux kernel. Traffic flows into the tunnel and disappears with no ICMP indication.

Interface state. GRE interfaces commonly report state UNKNOWN even when healthy:

# Check GRE tunnel interface state and flags
ip link show gre0

The operative signal is the LOWER_UP flag, not the state field. An interface in state UNKNOWN with LOWER_UP is functioning. An interface in state UNKNOWN without LOWER_UP has a carrier-level problem.
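
Since the state field is not informative for GRE, a check should look at the flag list instead. The sketch below assumes the flags appear between angle brackets on the first line of the ip link output; has_lower_up is a hypothetical name, and gre1 in the usage line is a placeholder interface.

```shell
# Reads `ip link show <dev>` output on stdin.
# Exit 0 if LOWER_UP is in the flag list on the first line, 1 otherwise.
has_lower_up() {
    awk 'NR == 1 {
        if (match($0, /<[^>]*>/)) {
            flags = "," substr($0, RSTART + 1, RLENGTH - 2) ","
            found = (index(flags, ",LOWER_UP,") > 0)
        }
    } END { exit(found ? 0 : 1) }'
}

# Usage on a live host:
#   ip link show gre1 | has_lower_up || echo "gre1 has no carrier"
```

Matching on the comma-delimited flag list avoids false positives from other text on the line.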

Counters. Byte and packet counters per tunnel:

# Check GRE tunnel byte and packet counters
ip -s link show gre0

Counter rates matter more than absolute values. Use sar, watch, or a metrics collector for continuous monitoring.

Liveness detection. Because Linux GRE is stateless, you must instrument external liveness detection:

  • ip monitor link detects local interface transitions, not remote peer failure.
  • Ping through the tunnel to a known address on the remote side detects remote peer failure.
  • BFD under the tunnel (via FRR or similar) provides sub-second failure detection.
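
The ping-based option from the list above can be wrapped in a retry loop so that one lost probe does not raise an alert. This is a sketch, not a replacement for BFD; probe_peer is a hypothetical helper that takes the probe command as arguments, and 192.0.2.1 in the usage line is a documentation placeholder for an address on the far side of the tunnel.

```shell
# Runs the probe command ($2 onward) up to $1 times, one second apart.
# Exit 0 on the first success, 1 if every attempt fails.
probe_peer() {
    attempts=$1
    shift
    n=0
    while [ "$n" -lt "$attempts" ]; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        n=$((n + 1))
        if [ "$n" -lt "$attempts" ]; then
            sleep 1
        fi
    done
    return 1
}

# Usage on a live host (iputils ping: one probe, one-second timeout):
#   probe_peer 3 ping -c 1 -W 1 192.0.2.1 || echo "remote tunnel peer unreachable"
```

Passing the probe as a command keeps the loop reusable for other checks, such as a TCP connect test to a service behind the tunnel.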

MTU and MSS clamping. GRE reduces the effective MTU by 24 bytes. For standard Ethernet (1500 bytes), the GRE MTU is 1476. Without MSS clamping, TCP connections hang on large transfers but work for small ones. This is a classic MTU blackhole symptom.

# Clamp TCP MSS to PMTU for GRE tunnel traffic
# WARNING: this modifies the live mangle table. Test on a staging host first.
iptables -t mangle -A FORWARD -p tcp --tcp-flags SYN,RST SYN -o gre0 -j TCPMSS --clamp-mss-to-pmtu
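
To sanity-check what clamping should converge on, the expected MSS can be derived from the tunnel MTU. A minimal sketch, assuming no IP or TCP options: the IP header is 20 bytes for IPv4 or 40 bytes for IPv6, and the TCP header is 20 bytes. The function name tcp_mss is hypothetical.

```shell
# TCP MSS that fits a given MTU, assuming headers without options.
# Args: <mtu> [ip_family], where ip_family is 4 (default) or 6.
tcp_mss() {
    mtu=$1
    family=${2:-4}
    case "$family" in
        4) echo $((mtu - 40)) ;;   # 20-byte IPv4 header + 20-byte TCP header
        6) echo $((mtu - 60)) ;;   # 40-byte IPv6 header + 20-byte TCP header
        *)
            echo "unknown IP family: $family" >&2
            return 1
            ;;
    esac
}
```

For the 1476-byte GRE MTU above, tcp_mss 1476 prints 1436; a SYN observed with a larger MSS option on the far side of the tunnel suggests clamping is not being applied.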

Prometheus metrics. The node_exporter exposes standard network interface metrics for GRE tunnels: node_network_up, node_network_receive_bytes_total, node_network_transmit_bytes_total, and node_network_receive_drop_total.

Underlay versus overlay confusion

This is the dominant misdiagnosis in overlay monitoring, captured in the pattern “SD-WAN Tunnel Control-Plane Up, Data-Plane Degraded.” The control plane reports the tunnel as UP while the data plane suffers high loss, high jitter, or packet reordering. Link state does not reveal the problem, because the tunnel interface stays up throughout.

To disambiguate:

  1. Check the underlay interface for errors and discards (ifInErrors, ifOutErrors, ifInDiscards, ifOutDiscards).
  2. Run mtr from a known-good source to the tunnel endpoint to measure per-hop loss and latency.
  3. Check BFD session state if BFD is used under the tunnel.
  4. Compare forward and reverse path probes separately to detect asymmetric routing. See asymmetric routing.
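
For step 1 on a Linux host, the rough local equivalents of the SNMP error and discard counters are the per-interface files under /sys/class/net/<iface>/statistics. The sketch below reads them from a directory passed as an argument; underlay_errors is a hypothetical name, and the mapping to ifInErrors and related objects is approximate rather than exact.

```shell
# Prints "<counter> <value>" for the error and drop counters found in a
# statistics directory such as /sys/class/net/eth0/statistics.
underlay_errors() {
    stats_dir=$1
    for counter in rx_errors tx_errors rx_dropped tx_dropped; do
        if [ -r "$stats_dir/$counter" ]; then
            echo "$counter $(cat "$stats_dir/$counter")"
        fi
    done
}

# Usage on a live host (eth0 is the underlay interface used in this document):
#   underlay_errors /sys/class/net/eth0/statistics
```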

For cloud VPN tunnels, the cloud provider’s own metrics are authoritative. Oracle Cloud Infrastructure exposes TunnelState, PacketsReceived, PacketsSent, BytesReceived, BytesSent, and PacketsError in the oci_vpn namespace.

These tell you what the cloud side sees, which may differ from what your on-prem device reports.

Signals to watch in production

| Signal | Why it matters | Warning sign |
| --- | --- | --- |
| GRE LOWER_UP flag | The only meaningful local state signal for stateless GRE tunnels | Flag present but no traffic flowing = remote peer blackholing |
| VXLAN dstport | Linux defaults to 8472, not IANA 4789 | Mixed-version clusters silently fail interop |
| IpReasmFails (VXLAN) | Fragments arriving incomplete indicate underlay drops | Counter rising during high-throughput windows |
| IPsec SA byte counters | Counters plateau near zero = CHILD SA expired, IKE SA still up | SA appears established but no traffic flows |
| /proc/net/xfrm_stat errors | Per-error counters for XFRM input and output failures | Rising XfrmInStateProtoError = key mismatch or corruption |
| ip xfrm monitor events | Real-time SA creation, deletion, and expiration | Unexpected SA deletion outside the rekey window |
| VXLAN FDB entry count | FDB exhaustion causes flooding to all VTEPs | Sudden spike = possible MAC flood or topology change |
| DPD timeout (cloud VPN) | Cloud tears down tunnel after missed DPD probes | Firewall drops on IKE (UDP 500) or ESP (proto 50) |
| GRE MTU and MSS clamping | Fragmentation causes silent blackholes for large packets | TCP connections hang on large transfers, work for small ones |
| Underlay interface errors | Physical-layer issues on the path carrying encapsulated traffic | Rising ifInErrors on underlay during tunnel degradation |

What Netdata collects

Netdata collects the signals overlay tunnel monitoring requires at per-second resolution:

  • Interface-level metrics for tunnel interfaces (VXLAN, GRE, IPsec XFRM interfaces): byte counters, packet counters, drop counters. Per-second collection catches microbursts and short-lived degradation that 5-minute SNMP polling misses.
  • Kernel SNMP counters including fragmentation statistics (IpFragCreates, IpReasmReqds, IpReasmFails), so you can see when VXLAN overhead triggers fragmentation on the underlay.
  • Per-CPU softirq and interrupt distribution metrics help diagnose single-core bottlenecks on VXLAN VTEPs processing high packet rates, a symptom of RSS misconfiguration.
  • Network interface state changes detected in real time, so GRE LOWER_UP flag transitions are caught immediately even though state remains UNKNOWN.
  • Cross-layer correlation: a tunnel interface counter drop is visible alongside a rising XfrmInStateProtoError or IpReasmFails on the same host, in the same dashboard, at the same second.