SNMP counter discontinuity after reboot: bogus rate spikes explained

A 1-gigabit interface shows 40 terabits per second on your dashboard right after a switch reboot. The traffic never happened. The chart is lying because of how SNMP counters and rate calculations interact.

SNMP interface counters (ifInOctets, ifHCInOctets, ifOutOctets, and friends) are monotonically increasing integers. Your monitoring platform does not read current bandwidth from the device. It subtracts the previous counter value from the current one, divides by elapsed time, and reports the result as a rate. When a counter resets to zero after a reboot or wraps past its maximum, that subtraction produces a physically impossible number. If your alerting or billing pipeline acts on it, you have a problem.

Counter discontinuity: the three causes

Every SNMP-based bandwidth chart is a computed derivative, not a direct measurement:

rate = (current_counter - previous_counter) / elapsed_time

This formula assumes the counter only moves forward. Three things break that assumption:

  1. Counter wrap: a 32-bit counter rolls from 4,294,967,295 back to 0. On a 10G interface at line rate, ifInOctets wraps every ~3.4 seconds. On a 1G link, every ~34 seconds. The raw subtraction yields a large negative number, which a collector that stores deltas in a wider unsigned integer reports as a massive positive spike.

  2. Counter reset after reboot: when a device reloads, all interface counters start from zero. The subtraction yields a negative number whose magnitude is close to the previous counter value. The spike looks identical to a wrap, but the cause is different.

  3. Counter discontinuity from interface re-creation: deleting and re-creating a logical interface (loopback, port-channel, sub-interface) or performing an OIR (online insertion and removal of a line card) can reset that interface’s counters independently of a device reboot.

All three produce the same symptom: a bogus rate spike. The fix differs for each.
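To make the failure concrete, here is a minimal sketch of the naive calculation. The function name naive_rate_bps, the unsigned 64-bit storage, and the sample values are illustrative assumptions, not taken from any particular collector.

```python
MASK64 = 2**64 - 1  # assumption: the collector stores deltas as unsigned 64-bit


def naive_rate_bps(prev_octets: int, curr_octets: int, elapsed_s: float) -> float:
    """Naive rate in bits per second; assumes the counter only moved forward."""
    # Reducing modulo 2**64 mimics unsigned storage: a negative delta
    # turns into an enormous positive one.
    delta = (curr_octets - prev_octets) & MASK64
    return delta * 8 / elapsed_s


# Normal poll: 3,750,000,000 octets in 300 s.
normal = naive_rate_bps(1_000_000_000, 4_750_000_000, 300)

# Poll across a reboot: the counter restarted near zero.
spike = naive_rate_bps(4_750_000_000, 1_000_000, 300)

print(f"normal: {normal / 1e6:.0f} Mbps")
print(f"spike:  {spike / 1e12:.0f} Tbps")
```

The second call has no information that a reset happened, so it reports a rate far beyond any physical link. The checks described in the following sections exist to stop that delta from ever reaching the rate formula.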

How rate calculation anchors work

The rate calculation depends on two SNMP objects that tell the collector whether a counter delta is trustworthy.

sysUpTime: the agent-level anchor

sysUpTime (.1.3.6.1.2.1.1.3.0) reports hundredths of seconds since the SNMP agent last initialized. When a device reboots, sysUpTime drops to near zero. Any counter delta computed across a poll boundary where sysUpTime decreased must be discarded. The collector should treat the first post-reboot poll as a new baseline, not a rate source.

sysUpTime is itself a 32-bit TimeTicks value, measured in hundredths of a second. It wraps after approximately 497 days (4,294,967,295 centiseconds). A sysUpTime wrap looks like a reboot: the value drops to a small number. Distinguish them arithmetically: compute current_sysUpTime + (2^32 - previous_sysUpTime) and convert it from centiseconds to seconds. If the result is approximately equal to your poll interval, the counter wrapped and the device did not reboot.
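The arithmetic check can be sketched as follows. The function name classify_uptime_drop and the 50% tolerance are assumptions chosen for illustration; tune the tolerance to how much your real poll spacing jitters.

```python
TIMETICKS_MODULUS = 2**32  # sysUpTime is a 32-bit TimeTicks value (1/100 s)


def classify_uptime_drop(prev_ticks: int, curr_ticks: int,
                         poll_interval_s: float, tolerance: float = 0.5) -> str:
    """Return 'normal', 'wrap', or 'reboot' for two successive sysUpTime reads."""
    if curr_ticks >= prev_ticks:
        return "normal"
    # Elapsed time if the value wrapped exactly once, in seconds.
    wrapped_elapsed_s = (curr_ticks + TIMETICKS_MODULUS - prev_ticks) / 100
    if abs(wrapped_elapsed_s - poll_interval_s) <= tolerance * poll_interval_s:
        return "wrap"
    return "reboot"


# 100 s before the wrap point, then 200 s after it: a 300 s poll gap.
print(classify_uptime_drop(2**32 - 10_000, 20_000, 300))

# About 100 days of uptime, then 60 s: the device restarted.
print(classify_uptime_drop(864_000_000, 6_000, 300))
```

A device that reboots within one poll interval of its wrap point remains ambiguous under this check. That case is rare, and treating it as a reboot costs only one discarded sample.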

ifCounterDiscontinuityTime: the per-interface anchor

Per RFC 2863 (IF-MIB), ifCounterDiscontinuityTime (.1.3.6.1.2.1.31.1.1.1.19) records the value of sysUpTime at the most recent occasion when one or more of an interface’s counters suffered a discontinuity. If no discontinuity has occurred since the last agent re-initialization, the value is zero.

RFC 2863 requires a management application that calculates differences between counter values retrieved on successive polls to discard any difference for which ifCounterDiscontinuityTime differs between the two polls. This check is in addition to sysUpTime checking, not a replacement for it.

The two anchors catch different events. sysUpTime catches agent-level restarts. ifCounterDiscontinuityTime catches per-interface events (OIR, logical interface deletion and re-creation) that reset individual counters without resetting the agent.
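A minimal sketch of applying both anchors to a pair of samples is shown below. The Sample type and the function delta_is_trustworthy are hypothetical names; the sketch deliberately ignores sysUpTime wrap and treats any decrease as an agent restart.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    sys_uptime: int          # sysUpTime, TimeTicks (1/100 s)
    discontinuity_time: int  # ifCounterDiscontinuityTime for this interface
    octets: int              # counter value, e.g. ifHCInOctets


def delta_is_trustworthy(prev: Sample, curr: Sample) -> bool:
    """True only if neither anchor signals a discontinuity between the polls."""
    if curr.sys_uptime < prev.sys_uptime:
        return False  # agent-level anchor: the agent re-initialized
    if curr.discontinuity_time != prev.discontinuity_time:
        return False  # per-interface anchor: this interface's counters reset
    return True


before = Sample(sys_uptime=1_000_000, discontinuity_time=0, octets=5_000)
after_oir = Sample(sys_uptime=1_030_000, discontinuity_time=1_020_000, octets=40)
after_reboot = Sample(sys_uptime=6_000, discontinuity_time=0, octets=40)

print(delta_is_trustworthy(before, after_oir))     # line card re-inserted
print(delta_is_trustworthy(before, after_reboot))  # device reloaded
```

In the OIR case sysUpTime advances normally, so only the per-interface anchor catches the reset. In the reboot case the per-interface anchor is zero on both polls, so only sysUpTime catches it.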

The decision flow

The following diagram shows how a correctly implemented poller handles the three discontinuity cases:

flowchart TD
    A["Poll N: read counter + sysUpTime"] --> B["Poll N+1: read counter + sysUpTime"]
    B --> C{"sysUpTime decreased?"}
    C -->|"Yes: reboot or wrap"| D{"Wrap arithmetic check?"}
    C -->|"No"| E{"ifCounterDiscontinuityTime<br/>changed between polls?"}
    D -->|"Matches interval: wrap"| F["Handle as wrap:<br/>delta = current + 2^32 - previous"]
    D -->|"Does not match: reboot"| G["Discard delta<br/>Re-baseline counter"]
    E -->|"Yes"| G
    E -->|"No"| H["Compute normal rate:<br/>delta / elapsed_time"]
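The same flow can be written as one function. This is one reading of the diagram, not a reference implementation: it interprets the wrap branch as correcting the elapsed time and then continuing to the per-interface check, and it adds a defensive re-baseline when a 64-bit counter moves backwards. The name evaluate_poll and the dictionary keys are hypothetical.

```python
from typing import Optional, Tuple

UPTIME_MODULUS = 2**32  # sysUpTime is 32-bit TimeTicks (1/100 s)


def evaluate_poll(prev: dict, curr: dict, poll_interval_s: float,
                  tolerance: float = 0.5) -> Tuple[str, Optional[float]]:
    """Each poll is a dict with 'uptime', 'disc' and 'octets' (64-bit counter).

    Returns ("rate", bits_per_second) or ("rebaseline", None).
    """
    if curr["uptime"] < prev["uptime"]:
        elapsed_ticks = curr["uptime"] + UPTIME_MODULUS - prev["uptime"]
        if abs(elapsed_ticks / 100 - poll_interval_s) > tolerance * poll_interval_s:
            return ("rebaseline", None)  # does not match interval: reboot
        # Matches interval: sysUpTime wrapped, elapsed_ticks is already corrected.
    else:
        elapsed_ticks = curr["uptime"] - prev["uptime"]

    if curr["disc"] != prev["disc"]:
        return ("rebaseline", None)  # per-interface discontinuity
    if elapsed_ticks == 0 or curr["octets"] < prev["octets"]:
        return ("rebaseline", None)  # assumption: be defensive, never emit a rate

    delta_bits = (curr["octets"] - prev["octets"]) * 8
    return ("rate", delta_bits / (elapsed_ticks / 100))


p0 = {"uptime": 1_000_000, "disc": 0, "octets": 1_000_000_000}
p1 = {"uptime": 1_030_000, "disc": 0, "octets": 4_750_000_000}
p_reboot = {"uptime": 6_000, "disc": 0, "octets": 1_000_000}

print(evaluate_poll(p0, p1, 300))
print(evaluate_poll(p0, p_reboot, 300))
```

Returning an explicit "rebaseline" action instead of a zero rate keeps the gap visible downstream, so a chart shows a missing sample rather than a false dip.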

Where it shows up in production

After device reboots

The most common trigger. Every interface counter resets to zero when a device reloads. If your collector computes the rate across the reboot boundary, it produces a spike proportional to the previous counter value divided by the poll interval. On a busy interface, that spike can be orders of magnitude above line rate.

The first poll after a reboot must be treated as a re-baseline, not a rate source. sysUpTime detection is the primary mechanism for catching this.

On Cisco IOS: reload does not set ifCounterDiscontinuityTime

On Cisco IOS, a device reload does not populate ifCounterDiscontinuityTime for existing interfaces. The value remains zero because the management subsystem re-initialized cleanly. The discontinuity is signaled by sysUpTime resetting, not by ifCounterDiscontinuityTime changing. Empirical testing confirms that disabling an interface, removing cables, or causing a Counter32 wrap on Cisco IOS does not update ifCounterDiscontinuityTime.

What does populate ifCounterDiscontinuityTime on Cisco IOS is OIR (removing and re-inserting a line card) or deleting and re-creating a logical interface. These events reset individual interface counters independently of the agent restart, and ifCounterDiscontinuityTime is set to the current sysUpTime value.

clear counters does not reset SNMP counters

On Cisco IOS, the clear counters CLI command resets the counters shown in show interface output but does not affect SNMP counters. SNMP counters are independent of CLI counters. Operators sometimes expect the SNMP view to match the CLI view after a reload or after clear counters; the two views do not match on all platforms.

32-bit counters on high-speed interfaces

MIB-II (RFC 1213) defines the interface counters as 32-bit values. The ifXTable in the IF-MIB (RFC 2863) adds 64-bit HC variants. RFC 2863 requires 64-bit octet counters on interfaces faster than 20 Mbps, and 64-bit packet counters as well on interfaces at 650 Mbps and above. 64-bit counters require SNMPv2c or SNMPv3; SNMPv1 cannot retrieve them.

A 32-bit ifInOctets counter on a 1G interface wraps every ~34 seconds at line rate. With a 5-minute poll interval, a link running near line rate wraps the counter several times between polls. The collector cannot reconstruct the true delta from two points because it cannot tell how many wraps occurred. Use ifHCInOctets (.1.3.6.1.2.1.31.1.1.1.6) and ifHCOutOctets (.1.3.6.1.2.1.31.1.1.1.10) instead.
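The wrap times quoted in this article follow directly from the counter width and the link speed. A short worked calculation, with hypothetical helper names, assuming sustained line rate:

```python
def wrap_time_seconds(link_bps: float, counter_bits: int = 32) -> float:
    """Seconds for an octet counter to wrap at sustained line rate."""
    return (2**counter_bits * 8) / link_bps


def wraps_per_poll(link_bps: float, poll_interval_s: float,
                   counter_bits: int = 32) -> float:
    """How many times the counter can wrap inside one poll interval."""
    return poll_interval_s / wrap_time_seconds(link_bps, counter_bits)


for label, bps in [("100M", 1e8), ("1G", 1e9), ("10G", 1e10)]:
    print(f"{label}: 32-bit wrap every {wrap_time_seconds(bps):.1f} s, "
          f"{wraps_per_poll(bps, 300):.1f} wraps per 5-minute poll")

years = wrap_time_seconds(1e10, counter_bits=64) / (365 * 86400)
print(f"10G: 64-bit wrap after roughly {years:.0f} years")
```

Any result above one wrap per poll means the delta is unrecoverable from two samples, which is why the 64-bit HC counters are the only practical choice on fast links.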

Platform-dependent counter preservation

Counter behavior on interface deletion and re-creation is platform-specific. On Cisco 3560-X, deleting a port-channel zeros the SNMP counters. On the Cisco 4500 family, deleting a port-channel preserves SNMP counters but zeros CLI counters. Do not assume consistent behavior across chassis; validate per-platform.

Common pitfalls

Using 32-bit counters for any interface above 100 Mbps. Wrap time is shorter than most poll intervals. Use ifHCInOctets and ifHCOutOctets. If the device does not support 64-bit counters and you must use 32-bit, poll at intervals no longer than half the wrap time and implement wrap-aware differencing.
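For the 32-bit fallback, wrap-aware differencing is a modular subtraction. The sketch below uses hypothetical function names and is only valid when a reset has already been ruled out by the sysUpTime check and at most one wrap can have occurred between polls.

```python
COUNTER32_MODULUS = 2**32


def counter32_delta(prev: int, curr: int) -> int:
    """Delta between two Counter32 reads, assuming at most one wrap."""
    return (curr - prev) % COUNTER32_MODULUS


def max_safe_poll_interval_s(link_bps: float) -> float:
    """Half the line-rate wrap time of a 32-bit octet counter."""
    return (COUNTER32_MODULUS * 8 / link_bps) / 2


# The counter was 296 octets short of wrapping, then advanced 1,000 octets.
print(counter32_delta(4_294_967_000, 704))

# Longest poll interval that keeps a 100 Mbps link under one wrap, with margin.
print(round(max_safe_poll_interval_s(1e8), 1))
```

Python's % operator always returns a non-negative result for a positive modulus, so no explicit branch on the sign of the difference is needed. In languages where the remainder can be negative, add the modulus before reducing.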

Trusting ifCounterDiscontinuityTime to catch reboots. On Cisco IOS, it stays zero across a reload. sysUpTime is the only reliable reboot signal. ifCounterDiscontinuityTime catches per-interface discontinuities (OIR, logical interface re-creation), not agent-level restarts.

Conflating sysUpTime wrap with reboot. Both produce a small sysUpTime value. Use the arithmetic check described above. Without it, you will generate false reboot alerts every ~497 days on long-lived devices.

Ignoring vendor-specific sysUpTime quirks. Several platforms have non-standard uptime behavior:

  • MikroTik RouterOS (reported in 6.44.3) has a 32-bit uptime counter whose rollover behavior can trigger spurious device-down alerts if the collector does not handle wrap.
  • F5 BIG-IP exposes hrSystemUptime (.1.3.6.1.2.1.25.1.1.0) with non-standard time units on some platforms. Use sysUpTime (.1.3.6.1.2.1.1.3.0) for anchoring instead.
  • Windows SNMP agents may return sysUpTime with non-standard resolution. Verify the timebase on your specific Windows SNMP agent before relying on it for poll-interval math.

Assuming ifCounterDiscontinuityTime is available on all agents. It was introduced in RFC 2863 (June 2000). Older agents and some embedded SNMP stacks do not implement it. When it is absent or always zero, fall back to sysUpTime anchoring.

Signals to watch

| Signal | Why it matters | Warning sign |
| --- | --- | --- |
| sysUpTime (.1.3.6.1.2.1.1.3.0) | Primary reboot detection anchor. Any decrease means the agent re-initialized; discard all counter deltas across that boundary. | Value drops below previous poll’s value without a corresponding change ticket. |
| ifCounterDiscontinuityTime (.1.3.6.1.2.1.31.1.1.1.19) | Per-interface discontinuity timestamp. A change between polls means that interface’s counter was reset independently of a reboot. | Non-zero value on an interface that did not undergo OIR or re-creation. |
| Interface rate exceeding ifHighSpeed | Physically impossible rate. Indicates a wrap, a reset, or a 32-bit counter on a high-speed link. | Spike exceeding the interface’s nominal capacity by any factor. |
| 32-bit counter on interfaces above 100 Mbps | Wrap time is shorter than typical poll intervals. Rate data is unreliable. | ifInOctets polling on a >= 1 Gbps interface without a corresponding ifHCInOctets poll. |
| sysUpTime approaching 497 days | sysUpTime wrap produces a value indistinguishable from reboot without the arithmetic check. | Device uptime approaching the 497-day boundary. Prepare for false reboot alerts. |
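The ifHighSpeed signal lends itself to a last-line sanity check on every computed rate. A minimal sketch, assuming a hypothetical function name and a 5% headroom factor; ifHighSpeed itself is reported in units of 1,000,000 bits per second.

```python
def exceeds_line_rate(rate_bps: float, if_high_speed_mbps: int,
                      headroom: float = 1.05) -> bool:
    """True if a computed rate is physically impossible for the interface."""
    if if_high_speed_mbps <= 0:
        return False  # speed unknown: this check cannot decide
    return rate_bps > if_high_speed_mbps * 1_000_000 * headroom


print(exceeds_line_rate(40e12, 1000))  # 40 Tbps on a 1G interface
print(exceeds_line_rate(9.5e8, 1000))  # 950 Mbps on a 1G interface
```

This check catches a bogus spike after the fact but does not explain it. Use it as a backstop behind the sysUpTime and ifCounterDiscontinuityTime checks, not as a substitute for them.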

SNMP collection with Netdata

The Netdata SNMP collector can detect and suppress the bogus spikes described above at the collection layer:

  • When sysUpTime resets between polls, the collector discards the counter delta for that interval and establishes a new baseline instead of emitting a bogus rate.
  • Negative deltas on 32-bit counters are handled via modular arithmetic (delta = current + 2^32 - previous) when the result is consistent with the poll interval.
  • For interfaces above 100 Mbps, configure the collector to poll ifHCInOctets (.1.3.6.1.2.1.31.1.1.1.6) and ifHCOutOctets (.1.3.6.1.2.1.31.1.1.1.10) to eliminate 32-bit wrap entirely.
  • Counter resets appear alongside sysUpTime changes in the same dashboard, making the cause immediately visible without manual log correlation.

To verify your SNMP collector configuration is using 64-bit counters, check that data collection jobs reference the HC OIDs from ifXTable rather than their 32-bit counterparts from ifTable.
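One way to run that check is to scan the OIDs your jobs poll for the 32-bit octet counters. The sketch below is a generic helper, not part of Netdata and not tied to its configuration format: it assumes you have already extracted the polled OIDs into a plain list of strings, and the function name is hypothetical.

```python
# 32-bit octet counters in ifTable and their 64-bit ifXTable replacements.
HC_REPLACEMENTS = {
    "1.3.6.1.2.1.2.2.1.10": "1.3.6.1.2.1.31.1.1.1.6",   # ifInOctets -> ifHCInOctets
    "1.3.6.1.2.1.2.2.1.16": "1.3.6.1.2.1.31.1.1.1.10",  # ifOutOctets -> ifHCOutOctets
}


def find_32bit_octet_oids(oids):
    """Return (configured_oid, suggested_hc_oid) pairs for 32-bit octet counters."""
    findings = []
    for oid in oids:
        normalized = oid.strip().lstrip(".")
        for legacy, hc in HC_REPLACEMENTS.items():
            # Match the column itself or any instance below it (e.g. ifIndex 3).
            if normalized == legacy or normalized.startswith(legacy + "."):
                suffix = normalized[len(legacy):]
                findings.append((oid, "." + hc + suffix))
    return findings


polled = [".1.3.6.1.2.1.2.2.1.10.3", ".1.3.6.1.2.1.31.1.1.1.10"]
for configured, suggested in find_32bit_octet_oids(polled):
    print(f"{configured} is a 32-bit counter; consider {suggested}")
```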