BGP flapping: why a peer keeps resetting and how to find the cause

A BGP peer cycling between Established and Idle is sending a specific signal. In most cases the session tears down because one side sent a NOTIFICATION message, and that message carries an error code and subcode that pinpoint the cause. Most monitoring watches only the FSM state (up or down) and ignores the NOTIFICATION payload, so the operator sees flapping without knowing why.

The second trap is treating the symptom. Clearing the session or increasing the hold timer does not fix the underlying cause. The session re-establishes briefly, then drops again for the same reason. The hold timer is not the problem; the hold timer is detecting the problem.

What this means

BGP flapping is repeated transitions out of Established. Each transition is triggered by one of three events: the local router sends a NOTIFICATION (error detected locally), the remote router sends a NOTIFICATION (error detected remotely), or the TCP session fails (transport loss). The NOTIFICATION message is the single most important diagnostic artifact, and its error code determines the entire investigative path.

The BGP finite state machine progresses: Idle, Connect (outbound TCP attempt in progress), Active (listening and retrying TCP), OpenSent, OpenConfirm, Established. A peer stuck in Connect or Active means TCP connectivity is failing (neighbor unreachable, firewall blocking port 179, or peer down). A peer that reaches Established and then drops back to Idle means the session was established and then torn down by a NOTIFICATION or transport failure.
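The distinction between "never reaches Established" and "reaches Established, then drops" can be sketched in a few lines. This is a minimal illustration, assuming you already collect a time-ordered list of polled states per peer; the numeric values follow bgpPeerState in BGP4-MIB (RFC 4273), and the function names are hypothetical.

```python
from enum import IntEnum


class BgpState(IntEnum):
    # Numeric values as defined for bgpPeerState in BGP4-MIB (RFC 4273).
    IDLE = 1
    CONNECT = 2
    ACTIVE = 3
    OPENSENT = 4
    OPENCONFIRM = 5
    ESTABLISHED = 6


def count_flaps(samples):
    """Count transitions out of Established in a time-ordered list of states."""
    flaps = 0
    for prev, cur in zip(samples, samples[1:]):
        if prev == BgpState.ESTABLISHED and cur != BgpState.ESTABLISHED:
            flaps += 1
    return flaps


def never_established(samples):
    """True if the peer did not reach Established anywhere in the window."""
    return BgpState.ESTABLISHED not in samples
```

Polling can miss a flap that starts and ends between two samples, so treat the count as a lower bound.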

flowchart TD
    A[BGP peer flapping] --> B{Read NOTIFICATION code}
    B -->|4 Hold Timer Expired| C{CPU high on device?}
    C -->|Yes| D[Control-plane saturation]
    C -->|No| E[MTU black hole or packet loss]
    B -->|6 Cease subcode| F["1 Max Prefixes<br/>2 Admin Shutdown<br/>5 Conn Rejected<br/>10 BFD Down"]
    B -->|2 Open Error| G[Wrong AS or MD5 mismatch]
    B -->|3 Update Error| H[Malformed route from peer]
    B -->|No NOTIFICATION| I[TCP or interface failure]

Key BGP NOTIFICATION error codes:

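  • Code 2 (Open Message Error): parameter mismatch during session setup. Most commonly a wrong remote AS (subcode 2, Bad Peer AS). An MD5 authentication mismatch usually fails at the TCP layer instead and may produce no NOTIFICATION at all.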
  • Code 3 (Update Message Error): malformed UPDATE from the peer. The peer sent a route the local router could not parse.
  • Code 4 (Hold Timer Expired): the router that sent the NOTIFICATION received no KEEPALIVE or UPDATE from its peer within the negotiated hold time. Hold time defaults vary by vendor: Cisco IOS defaults to a 180-second hold time with 60-second keepalives, Junos to 90 and 30 seconds. The session uses the lower of the two hold times proposed in the OPEN messages. An expired hold timer means the router heard nothing from the peer for the entire hold window.
  • Code 6 (Cease): the session was administratively or policy torn down. The subcode further specifies the reason.

Cease subcodes per RFC 4486 and extensions:

| Subcode | Meaning | RFC |
|---|---|---|
| 1 | Maximum Prefixes Reached | 4486 |
| 2 | Administrative Shutdown | 4486 |
| 3 | Peer De-configured | 4486 |
| 4 | Administrative Reset | 4486 |
| 5 | Connection Rejected | 4486 |
| 6 | Other Configuration Change | 4486 |
| 7 | Connection Collision Resolution | 4486 |
| 8 | Out of Resources | 4486 |
| 9 | Hard Reset | 8538 |
| 10 | BFD Down | 9384 |
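For alerting or log enrichment, the codes and subcodes above can be turned into a small lookup. The following is a sketch, not a complete decoder: it covers the RFC 4271 top-level codes and the Cease subcodes listed here, and the function name `describe_notification` is a hypothetical helper.

```python
# Top-level NOTIFICATION error codes from RFC 4271.
ERROR_CODES = {
    1: "Message Header Error",
    2: "OPEN Message Error",
    3: "UPDATE Message Error",
    4: "Hold Timer Expired",
    5: "Finite State Machine Error",
    6: "Cease",
}

# Cease subcodes from RFC 4486, RFC 8538 (9) and RFC 9384 (10).
CEASE_SUBCODES = {
    1: "Maximum Number of Prefixes Reached",
    2: "Administrative Shutdown",
    3: "Peer De-configured",
    4: "Administrative Reset",
    5: "Connection Rejected",
    6: "Other Configuration Change",
    7: "Connection Collision Resolution",
    8: "Out of Resources",
    9: "Hard Reset",
    10: "BFD Down",
}


def describe_notification(code, subcode=0):
    """Return a human-readable label for a NOTIFICATION code and subcode."""
    name = ERROR_CODES.get(code, f"Unknown code {code}")
    if code == 6 and subcode:
        detail = CEASE_SUBCODES.get(subcode, f"unknown subcode {subcode}")
        return f"{name}/{subcode} ({detail})"
    return name
```

Subcodes for the non-Cease codes are deliberately left out to keep the sketch short; extend the tables if you alert on them.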

Common causes

| Cause | What it looks like | First thing to check |
|---|---|---|
| MTU/PMTUD black hole | Session establishes, small keepalives pass, but hold timer expires when UPDATE messages flow. Large BGP UPDATEs are silently dropped. | Extended ping with DF bit set toward peer IP at 1400+ bytes |
| Control-plane CPU saturation | Hold timer expiry correlates with CPU spikes. Often during massive route table updates or SNMP polling storms. | Device CPU counters and process list |
| Maximum prefix limit (Cease/1) | Peer placed in Idle (PfxCt) state after exceeding configured ceiling. Session does not auto-recover. | Per-peer prefix count vs configured maximum |
| MD5 authentication mismatch | Session fails to establish or fails right after establishing. Open Message Error or silent TCP failure with %TCP-6-BADAUTH in logs. | Router log for auth messages |
| Interface flap with fast-external-fallover | Each BGP reset coincides with an interface transition. Cisco IOS enables fast-external-fallover by default for eBGP. | Interface operational status around reset time |
| BFD-triggered reset (Cease/10) | BFD detects path loss faster than BGP hold timer. Reset is immediate after BFD session goes down. | BFD session state and underlay path quality |
| Peer maintenance (Cease/2) | Expected reset during a scheduled window. RFC 8203 allows a free-form shutdown message. | Change ticket and peer notification |

Quick checks

# Check BGP summary for all peers - state and prefix counts
ssh <router> 'show ip bgp summary'

# Get the exact NOTIFICATION reason for a specific peer
ssh <router> 'show ip bgp neighbors <peer> | include notification|Last'

# Check BGP events in syslog
ssh <router> 'show logging | include BGP'

# Poll bgpPeerState via SNMP (RFC 1657 / RFC 4273)
snmpwalk -v2c -c <community> <router> .1.3.6.1.2.1.15.3.1.2

# Read the per-peer received UPDATE counter (bgpPeerInUpdates); sample twice to derive a rate
snmpwalk -v2c -c <community> <router> .1.3.6.1.2.1.15.3.1.10

# Check control-plane CPU on Cisco (cpmCPUTotal1minRev is a table column, so walk it rather than get it)
snmpwalk -v2c -c <community> <device> .1.3.6.1.4.1.9.9.109.1.1.1.1.7
ssh <device> 'show processes cpu sorted | include five sec'

# Check control-plane CPU on Juniper
snmpwalk -v2c -c <community> <device> .1.3.6.1.4.1.2636.3.1.13.1.8

# Check FRR (Linux) BGP state
vtysh -c 'show bgp summary'
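BGP4-MIB also exposes the last error per peer as bgpPeerLastError (OID .1.3.6.1.2.1.15.3.1.14), a two-byte value whose first byte is the error code and second byte the subcode. The sketch below parses one line of snmpwalk output into that pair. The `Hex-STRING: 06 02` line format is an assumption about how net-snmp prints this object; the exact rendering depends on loaded MIBs and output flags, so adjust the pattern to what your tooling emits.

```python
import re

# Matches the two hex bytes at the end of a line such as:
#   SNMPv2-SMI::mib-2.15.3.1.14.192.0.2.1 = Hex-STRING: 06 02
_HEX_PAIR = re.compile(r"Hex-STRING:\s*([0-9A-Fa-f]{2})\s+([0-9A-Fa-f]{2})\s*$")


def parse_last_error(line):
    """Return (code, subcode) from one snmpwalk line, or None if unparseable."""
    match = _HEX_PAIR.search(line)
    if match is None:
        return None
    return int(match.group(1), 16), int(match.group(2), 16)
```

A result of `(0, 0)` means the peer has recorded no error on the current connection.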

How to diagnose it

  1. Decode the NOTIFICATION. Run show ip bgp neighbors <peer> | include notification|Last (Cisco) or show bgp neighbor <peer> (Juniper). The output shows the last reset reason with the error code and subcode. This single line determines the entire investigative path.

  2. For hold timer expiry (code 4), check two things. First, check control-plane CPU on the device. If CPU was above 90% when the session dropped, the BGP process was starved and could not process incoming keepalives in time. Correlate the reset timestamp with CPU history. Second, if CPU was normal, test for an MTU black hole. Run an extended ping with the DF bit set toward the peer IP, stepping up payload sizes from 1400 bytes. If large packets fail while small ones succeed, a middlebox is dropping packets above a certain size and blocking the ICMP “fragmentation needed” response.

  3. For Cease/1 (Maximum Prefixes), check the peer’s prefix count. Compare the current count against the configured maximum-prefix limit. If the peer is sending far more prefixes than usual, investigate a route leak. See BGP route leak and hijack for detection signals.

  4. For Cease/2 (Administrative Shutdown) or Cease/4 (Administrative Reset), verify intent. Check for a change ticket and whether the peer sent an RFC 8203 shutdown message (RFC 9003 has since superseded RFC 8203 and allows a longer message). Modern IOS XE and Junos releases support this; older releases send the subcode without the human-readable text.

  5. For Cease/10 (BFD Down), check the BFD session state and the underlay path. BFD detects forwarding path failures faster than BGP hold timers. If BFD is triggering, the path is genuinely degraded; the fix is to address the underlay, not to weaken BFD.

  6. For Open Message Error (code 2), verify session parameters. Check the remote AS number and MD5 authentication key on both sides. MD5 mismatches are insidious because the failure happens at the TCP layer before BGP sees the message, so the BGP process may not emit a NOTIFICATION at all. Check the router log for %TCP-6-BADAUTH messages.

  7. For no NOTIFICATION, check the transport layer. Run show tcp brief or equivalent to verify the TCP session. Check the peer-facing interface operational status. If the interface is flapping and the device runs Cisco IOS with bgp fast-external-fallover enabled (the default for eBGP), any interface transition to an eBGP peer immediately resets the BGP session.

  8. Check for a stale Established. Sometimes the FSM reports Established but UPDATE exchange has stopped. Track the bgpPeerInUpdates counter (OID .1.3.6.1.2.1.15.3.1.10) and the timestamp of the last received prefix. If the counter stops incrementing while the session shows Established, a middlebox may be silently dropping UPDATE messages or the peer’s CPU may be too saturated to generate them. A peer with a stable table can legitimately be quiet, so compare against that peer’s normal update rate before concluding the session is stale.
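The branching in the steps above can be written down as a small triage function. This is a sketch under stated assumptions: the caller has already extracted the last NOTIFICATION code and subcode (or `None` if there was none), a CPU reading near the reset time, and whether the peer-facing interface flapped. The function name and return strings are hypothetical; the 90% CPU threshold is the one used in step 2.

```python
def next_check(code, subcode=0, cpu_percent=None, interface_flapped=False):
    """Map the last reset reason to the first thing to investigate.

    code is None when no NOTIFICATION was logged for the reset.
    """
    if code is None:
        if interface_flapped:
            return "interface flap (check fast-external-fallover)"
        return "transport: TCP session and peer-facing interface"
    if code == 4:  # Hold Timer Expired
        if cpu_percent is not None and cpu_percent > 90:
            return "control-plane CPU saturation"
        return "MTU black hole or packet loss (DF-bit ping)"
    if code == 6:  # Cease: the subcode decides
        return {
            1: "prefix count vs maximum-prefix limit",
            2: "change ticket / peer shutdown message",
            4: "change ticket / peer shutdown message",
            10: "BFD session state and underlay path",
        }.get(subcode, "Cease: look up subcode")
    if code == 2:  # Open Message Error
        return "remote AS and MD5 key on both sides"
    if code == 3:  # Update Message Error
        return "malformed UPDATE from peer"
    return "unclassified: read the full neighbor output"
```

For example, `next_check(4, cpu_percent=35)` points at the MTU test, while `next_check(None, interface_flapped=True)` points at the interface.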

Metrics and signals to monitor

| Signal | Why it matters | Warning sign |
|---|---|---|
| BGP session state (bgpPeerState) | Primary liveness indicator for the peering | Repeated Established-to-Idle transitions; peer in Active for extended time |
| BGP NOTIFICATION code and subcode | Pinpoints the exact cause of each reset | Any unexpected NOTIFICATION from a critical peer |
| bgpPeerInUpdates rate | Detects stale Established (session up but no data flowing) | Rate drops to zero while FSM shows Established |
| Per-peer prefix count | Detects route leaks and maximum-prefix triggers | Sudden increase (more than 20% in 5 min) or decrease from a peer |
| Control-plane CPU | High CPU causes hold timer expiry by starving the BGP process | Sustained above 90% correlating with session drops |
| Interface operational status | Interface flap triggers BGP reset if fast-external-fallover is enabled | Transitions on the peer-facing interface around reset time |
| ICMP reachability to peer | Basic transport health independent of BGP | Loss or latency spike preceding session drop |
| BFD session state | Faster path-loss detection than BGP hold timers | BFD session flapping or down |
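Two of these signals are derived rather than read directly: the UPDATE rate comes from successive samples of a wrapping Counter32, and the prefix-count warning is a relative change between samples. A minimal sketch follows, assuming two consecutive polls per peer; the function names are hypothetical and the 20% default mirrors the warning sign for per-peer prefix count.

```python
COUNTER32_MOD = 2 ** 32  # bgpPeerInUpdates is a Counter32 and wraps at 2^32


def counter_delta(prev, cur):
    """Difference between two Counter32 samples, allowing for one wrap."""
    return (cur - prev) % COUNTER32_MOD


def is_stale_established(state, prev_updates, cur_updates):
    """Established (bgpPeerState 6) with no UPDATEs received between samples."""
    return state == 6 and counter_delta(prev_updates, cur_updates) == 0


def prefix_change_alert(prev_count, cur_count, threshold=0.20):
    """True if the prefix count moved by more than threshold between samples."""
    if prev_count == 0:
        return cur_count > 0
    return abs(cur_count - prev_count) / prev_count > threshold
```

A single zero-delta interval is weak evidence on a quiet peer, so in practice you would require several consecutive stale intervals before alerting.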

Fixes

MTU/PMTUD black hole

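Limit the TCP MSS of the BGP session so that UPDATE segments never exceed the path MTU. On Cisco IOS, ip tcp adjust-mss <bytes> under an interface rewrites the MSS in TCP SYNs passing through the router; for sessions the router itself terminates, such as BGP, the global ip tcp mss <bytes> command is the relevant knob. On Juniper Junos, the tcp-mss statement under the BGP group or neighbor serves the same purpose. Confirm the exact syntax against the command reference for your release. The correct value depends on the smallest MTU along the path: subtract any tunnel encapsulation overhead from the physical MTU, then subtract 40 bytes for the IPv4 and TCP headers.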

Verify the fix by running the extended ping with DF bit again. Large packets should now succeed.
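The arithmetic behind the MSS value and the ping size is simple enough to sketch. The header sizes below are the standard minimums without options; the 24-byte overhead in the usage note is the usual figure for plain GRE over IPv4 and is only an illustration, so substitute the overhead of your own encapsulation.

```python
IPV4_HEADER = 20  # bytes, no IP options
IPV6_HEADER = 40  # bytes, no extension headers
TCP_HEADER = 20   # bytes, no TCP options
ICMP_HEADER = 8   # bytes, echo request


def clamp_mss(link_mtu, encap_overhead=0, ipv6=False):
    """Largest TCP MSS that fits the path without fragmentation."""
    ip_header = IPV6_HEADER if ipv6 else IPV4_HEADER
    return link_mtu - encap_overhead - ip_header - TCP_HEADER


def df_ping_payload(path_mtu):
    """Largest ICMP echo payload that crosses an IPv4 path with DF set."""
    return path_mtu - IPV4_HEADER - ICMP_HEADER
```

For a 1500-byte link, `clamp_mss(1500)` gives 1460, and with 24 bytes of GRE overhead `clamp_mss(1500, encap_overhead=24)` gives 1436. Whether a ping "size" argument counts only the payload or the whole IP datagram varies by platform, so check which one your ping expects before comparing against `df_ping_payload`.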

Control-plane CPU saturation

Identify the process consuming CPU. Common culprits during BGP flapping: massive route table convergence, SNMP polling storms from a collector over-walking the device, or a trap flood from a downstream link-flap cascade. Address the root cause of the CPU spike. If the cause is SNMP polling pressure from your monitoring system, reduce poller concurrency or exclude large MIB table walks from default polling.

Do not increase the BGP hold timer to mask CPU starvation. It delays detection without fixing the underlying problem, and it slows convergence when genuine failures occur.

Maximum prefix limit (Cease/1)

By default, the session does not recover automatically after hitting the maximum-prefix ceiling. An explicit clear is required: clear ip bgp <peer>. This is disruptive; only run it after confirming the peer should be sending that many prefixes.

Investigate why the peer is sending more prefixes than expected. If the count increased suddenly, suspect a route leak from the peer’s upstream. If the limit was set without headroom for organic growth, raise the ceiling. Configure a warning threshold (typically 75-90% of the limit) so you are alerted before the session is torn down.
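Choosing the ceiling and the warning point is a small calculation. The sketch below assumes a 50% headroom over the currently observed prefix count and an 80% warning threshold; both defaults are illustrative choices, not values from any vendor, and the function name is hypothetical.

```python
import math


def max_prefix_plan(observed_prefixes, headroom=0.5, warn_percent=80):
    """Suggest a hard limit and the prefix count at which the warning fires."""
    limit = math.ceil(observed_prefixes * (1 + headroom))
    warn_at = limit * warn_percent // 100
    return limit, warn_at
```

For a peer that normally sends 1000 prefixes, `max_prefix_plan(1000)` returns `(1500, 1200)`: a limit of 1500 with the warning firing at 1200 prefixes. On platforms that take the threshold as a percentage alongside the limit, you would configure the limit and `warn_percent` directly.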

MD5 authentication mismatch

Verify the MD5 key on both peers. Keys must match exactly. After correcting, the session should re-establish within seconds. Check the router log for %TCP-6-BADAUTH messages to confirm the mismatch was the cause.

Interface flap with fast-external-fallover

If the underlying link to an eBGP peer is unstable and each interface transition causes a BGP reset, consider disabling bgp fast-external-fallover with no bgp fast-external-fallover. This decouples BGP session state from interface state, allowing the BGP hold timer to ride through brief interface flaps. The tradeoff: detection of a genuine link failure is slower (hold timer must expire instead of immediate reset).

BFD-triggered reset (Cease/10)

BFD is doing its job: detecting path degradation faster than BGP would on its own. The fix is to address the underlay path quality, not to remove BFD. If the BFD timers are too aggressive for the path characteristics (lossy wireless, satellite, or VPN underlay), tune the timers rather than disabling BFD entirely.
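When deciding whether timers are too aggressive, it helps to compare the BFD detection time against the loss bursts the path actually produces. The sketch below uses the common simplification that detection time is the negotiated interval times the detect multiplier, assuming both sides settle on the same interval; the function names are hypothetical.

```python
def bfd_detection_time_ms(interval_ms, multiplier):
    """Approximate time to declare the path down once packets stop arriving."""
    return interval_ms * multiplier


def tolerates_outage(interval_ms, multiplier, outage_ms):
    """True if a loss burst of outage_ms is shorter than the detection time."""
    return outage_ms < bfd_detection_time_ms(interval_ms, multiplier)
```

With a 300 ms interval and a multiplier of 3, detection takes about 900 ms, so a 500 ms loss burst is ridden out; with 50 ms and 3, the same burst resets the session.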

Prevention

  • Monitor NOTIFICATION subcodes, not just FSM state. A dashboard that shows only “Established” or “Idle” misses the cause. Parse the error code and subcode from syslog, traps, or CLI output and alert on specific subcodes.
  • Track bgpPeerInUpdates rate and last-receive timestamp. This catches stale Established sessions where the FSM reports up but no data is flowing.
  • Proactively clamp TCP MSS on BGP sessions over tunnels. Encapsulation overhead shrinks effective MTU. Clamping prevents the most common silent cause of hold timer expiry.
  • Monitor control-plane CPU trends. CPU above 70% sustained is a leading indicator of hold timer expiry under load.
  • Configure maximum-prefix limits with warning thresholds. Set the hard limit with headroom and the warning at 75-90% so you can act before the session is torn down.
  • Verify MD5 keys as part of change management. Key rotation without coordinated updates on both peers is a common cause of unexpected session drops.
  • Consider BMP (RFC 7854) for deeper visibility. BMP provides Adj-RIB-In (pre-policy and post-policy routes), per-prefix real-time streaming, and withdrawal tracking that BGP4-MIB cannot. BGP4-MIB’s bgp4PathAttrTable only reflects best-path routes.

How Netdata helps

  • Correlates BGP session state transitions with control-plane CPU spikes on the same timeline, making it immediately clear whether hold timer expiry is CPU-induced or path-induced.
  • Parses BGP NOTIFICATION error codes and Cease subcodes from SNMP traps (bgpBackwardTransition, RFC 4273) and syslog, so you see the exact reset reason without manual CLI inspection.
  • Tracks bgpPeerInUpdates rate alongside FSM state to detect stale Established sessions that look healthy but carry no data.
  • Correlates interface operational status with BGP session drops, making fast-external-fallover and link-flap cascades visible in a single view.
  • Monitors per-peer prefix count trends, alerting on sudden increases that may indicate a route leak or an approaching maximum-prefix limit.
  • Integrates ICMP reachability probes to the peer alongside BGP-layer state, so transport-layer loss is visible in context.