Redis OOM command not allowed when used memory > 'maxmemory' - causes and fixes

Redis returns (error) OOM command not allowed when used memory > 'maxmemory'. The server stays online; reads succeed, writes fail. If the client library suppresses errors, the first symptom may be missing data or backend load spikes. This occurs when used_memory reaches maxmemory and the eviction policy cannot free space. Under noeviction, Redis rejects every write and keeps all keys. Under volatile-*, the same happens when no keys carry a TTL. Monitoring often misses this because evicted_keys stays at zero while used_memory sits just below the limit.

flowchart TD
    A[used_memory >= maxmemory] --> B{maxmemory-policy}
    B -->|noeviction| C[Reject writes with OOM error]
    B -->|volatile-*| D{Keys with TTL exist?}
    D -->|No| C
    D -->|Yes| E[Evict keys by policy]
    B -->|allkeys-*| F[Evict any key by policy]

What this means

Redis checks used_memory against maxmemory before executing write commands. When the limit is reached, maxmemory-policy decides what happens. Under noeviction, the server rejects writes with the OOM error quoted above. Read-only commands such as GET and EXISTS still succeed.

With volatile-lru, volatile-lfu, volatile-random, or volatile-ttl, Redis evicts only keys that have a TTL. If no keys have a TTL, behavior is identical to noeviction: writes are rejected and evicted_keys does not increment.
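The policy decision described above can be sketched as a small shell function. The name `oom_outcome` and its two arguments are hypothetical, and the sketch is a simplification: even when TTL keys exist, Redis can still reject a write if eviction does not free enough memory.

```shell
# Hypothetical helper: predict what Redis does once the memory limit is reached.
# $1 = maxmemory-policy value, $2 = total number of keys that carry a TTL
oom_outcome() {
    policy=$1
    keys_with_ttl=${2:-0}
    case "$policy" in
        noeviction)
            echo "reject-writes" ;;
        volatile-*)
            # volatile-* policies can only evict keys that have a TTL
            if [ "$keys_with_ttl" -gt 0 ]; then
                echo "evict-ttl-keys"
            else
                echo "reject-writes"
            fi ;;
        allkeys-*)
            echo "evict-any-key" ;;
        *)
            echo "unknown-policy" ;;
    esac
}
```

For example, `oom_outcome volatile-lru 0` prints `reject-writes`, the same result as `noeviction`.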

Since Redis 6.2, these rejections increment total_error_replies in INFO stats, and INFO errorstats exposes the count as errorstat_OOM:count=<n>. Because OOM rejections happen before execution, they increment rejected_calls in INFO commandstats, not failed_calls. Alerting only on evicted_keys misses this failure mode entirely.
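A minimal sketch of extracting the OOM counter for an alerting script, assuming the `errorstat_OOM:count=<n>` line format of Redis 6.2+. The function name `oom_count` is hypothetical. It reads INFO output on stdin and strips the carriage returns that redis-cli passes through.

```shell
# Hypothetical helper: print the OOM rejection count from `INFO errorstats`
# output on stdin. Prints 0 when no errorstat_OOM line is present.
oom_count() {
    tr -d '\r' | awk -F'count=' '
        /^errorstat_OOM:/ { n = $2 }
        END { print n + 0 }
    '
}
```

Typical use would be `redis-cli INFO errorstats | oom_count`, sampled twice to see whether the count is rising.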

The maxmemory count includes the dataset, client buffers, replication backlog, and Lua memory. It excludes some replication and AOF buffers, which INFO memory reports as mem_not_counted_for_evict. Because of this, used_memory can briefly exceed maxmemory during heavy replication or AOF rewrite without triggering eviction. Once the threshold is crossed under noeviction, writes fail.
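The accounting above can be approximated from INFO memory by subtracting mem_not_counted_for_evict from used_memory before comparing with maxmemory. This is a sketch, not the server's own code path. The function name `evict_accounting` and the verdict labels are hypothetical.

```shell
# Hypothetical helper: read `INFO memory` output on stdin and print
# "<memory counted for eviction> <maxmemory> <verdict>".
evict_accounting() {
    tr -d '\r' | awk -F: '
        $1 == "used_memory"               { used = $2 }
        $1 == "mem_not_counted_for_evict" { skip = $2 }
        $1 == "maxmemory"                 { max = $2 }
        END {
            counted = used - skip
            if (max + 0 == 0)       verdict = "no-limit"
            else if (counted > max) verdict = "over-limit"
            else                    verdict = "below-limit"
            printf "%.0f %.0f %s\n", counted, max, verdict
        }
    '
}
```

Run as `redis-cli INFO memory | evict_accounting`. A `below-limit` verdict while raw used_memory exceeds maxmemory corresponds to the brief overshoot described above.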

Common causes

| Cause | What it looks like | First thing to check |
| --- | --- | --- |
| `maxmemory-policy` is `noeviction` and dataset grew | Writes fail consistently; `evicted_keys` is 0; `errorstat_OOM` climbing | `CONFIG GET maxmemory-policy` |
| `volatile-*` policy with no TTLs set | Same OOM error as `noeviction`; `expires` count is 0 in `INFO keyspace` | `INFO keyspace` for `expires` |
| Client output buffers or replication backlog consuming memory budget | `used_memory` near limit but dataset seems small; large `omem` in `CLIENT LIST` | `CLIENT LIST` and `INFO memory` |
| `maxmemory` set too close to physical RAM without headroom | `used_memory_rss` near OS limit while `used_memory` is below `maxmemory`; fragmentation high | `INFO memory` for `used_memory_rss` and `mem_fragmentation_ratio` |
| AWS ElastiCache reserved memory reducing effective capacity | OOM errors on a node that appears undersized; parameter group shows `reserved-memory-percent` at 25% | ElastiCache parameter group |

Quick checks

redis-cli INFO memory | grep -E "used_memory:|maxmemory:"

redis-cli CONFIG GET maxmemory-policy

redis-cli INFO stats | grep -E "evicted_keys|total_error_replies"

redis-cli INFO errorstats | grep OOM

redis-cli INFO commandstats | grep rejected_calls

redis-cli CLIENT LIST | grep -o 'omem=[0-9]*' | cut -d= -f2 | sort -rn | head -10

redis-cli INFO memory | grep mem_not_counted_for_evict

How to diagnose it

Confirm the rejection source. Run INFO errorstats (Redis 6.2+) and check errorstat_OOM:count. Older versions do not expose these error counters, so issue a test write and read the reply directly. An OOM reply to the test write, or a count that rises with write attempts, confirms OOM rejections.
Check the eviction policy. Run CONFIG GET maxmemory-policy. If it returns noeviction, Redis is refusing writes by design. If it returns a volatile-* policy, run INFO keyspace and verify that expires is non-zero. If expires is 0, there are no candidate keys to evict.
Quantify memory composition. Run INFO memory. Compare used_memory to maxmemory. Review used_memory_dataset and used_memory_overhead. If overhead is high relative to the dataset, inspect CLIENT LIST for large omem values and check connected_slaves to see if replication buffers are consuming space.
Assess physical headroom. Check used_memory_rss against total system memory. If mem_fragmentation_ratio is above 1.5, fragmentation is wasting physical RAM. If used_memory_rss is within 10% of physical memory, the OS OOM killer is a near-term risk even if logical used_memory is below maxmemory.
Check for cloud-specific overhead. On AWS ElastiCache, review the reserved-memory-percent parameter. The effective data memory is the node size minus this reservation. If the reservation is 25% and maxmemory is set to the remaining 75%, you have no further headroom for buffers or bursts.
Correlate with application behavior. Check application logs for the exact error string. Some clients return errors as nil responses or log them at debug level. Identify which commands are failing (SET, HSET, LPUSH) and whether the application retries blindly, amplifying load.
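The TTL check in the eviction-policy step above can be scripted by summing the `keys` and `expires` fields across databases. This is a sketch. The function name `keyspace_ttl_totals` is hypothetical, and it assumes the `dbN:keys=...,expires=...` line format of INFO keyspace.

```shell
# Hypothetical helper: read `INFO keyspace` output on stdin and print
# "<total keys> <keys with TTL>" summed over all databases.
keyspace_ttl_totals() {
    tr -d '\r' | awk -F'[:,=]' '
        /^db[0-9]+:/ {
            # fields alternate name/value after the dbN prefix
            for (i = 2; i < NF; i += 2) {
                if ($i == "keys")    keys += $(i + 1)
                if ($i == "expires") ttl  += $(i + 1)
            }
        }
        END { printf "%.0f %.0f\n", keys, ttl }
    '
}
```

Run as `redis-cli INFO keyspace | keyspace_ttl_totals`. A second number of 0 under a volatile-* policy means there are no eviction candidates.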

Metrics and signals to monitor

| Signal | Why it matters | Warning sign |
| --- | --- | --- |
| `used_memory` / `maxmemory` ratio | Proximity to the hard limit | Ratio > 0.9 sustained |
| `evicted_keys` rate | Whether eviction is actually occurring | Stays at 0 while writes fail under `noeviction` |
| `errorstat_OOM` (Redis 6.2+) | Direct count of OOM rejections | Any sustained increase |
| `total_error_replies` rate | Aggregate errors including OOM | Rate increases correlated with write latency |
| `mem_fragmentation_ratio` | Physical memory pressure vs logical usage | > 1.5 (waste) or < 1.0 (swap) |
| `rejected_calls` in `INFO commandstats` | Per-command visibility into pre-execution rejections | Non-zero for write commands |
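As a rough illustration, the memory-ratio and fragmentation thresholds from the table can be evaluated against a single INFO memory sample. The function name `memory_warnings` is hypothetical. One sample cannot show whether a condition is sustained, so a real alert would need repeated samples.

```shell
# Hypothetical helper: read `INFO memory` output on stdin and print one line
# per threshold crossed. Prints nothing when no threshold is crossed.
memory_warnings() {
    tr -d '\r' | awk -F: '
        $1 == "used_memory"             { used = $2 }
        $1 == "maxmemory"               { max = $2 }
        $1 == "mem_fragmentation_ratio" { frag = $2; have_frag = 1 }
        END {
            if (max + 0 > 0 && used / max > 0.9)
                printf "used_memory at %.0f%% of maxmemory\n", 100 * used / max
            if (have_frag && frag + 0 > 1.5) print "fragmentation above 1.5"
            if (have_frag && frag + 0 < 1.0) print "fragmentation below 1.0"
        }
    '
}
```

Run as `redis-cli INFO memory | memory_warnings`.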

Fixes

Switch to an allkeys eviction policy

If Redis is a cache and data loss is acceptable, change the policy to evict keys based on access patterns. allkeys-lru removes least recently used keys; allkeys-lfu removes least frequently used. Apply live with CONFIG SET maxmemory-policy allkeys-lru, then update redis.conf or your configuration management to persist it.

Tradeoff: Evicted keys must be reconstructable from a persistent store. Do not use on queues, locks, or primary databases.
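A minimal sketch of applying the policy change live and persisting it. The function name and the `REDIS_CLI` override variable are assumptions for illustration. CONFIG REWRITE only works when the server was started with a configuration file, and managed services often disable CONFIG entirely, in which case the change belongs in the provider's parameter settings.

```shell
# Hypothetical helper: switch to an allkeys-* policy and persist the change.
# REDIS_CLI is an assumed override so connection flags can be supplied.
REDIS_CLI=${REDIS_CLI:-redis-cli}

set_eviction_policy() {
    policy=$1
    case "$policy" in
        allkeys-lru|allkeys-lfu|allkeys-random) ;;
        *) echo "refusing non-allkeys policy: $policy" >&2
           return 1 ;;
    esac
    # Apply live first, then ask the server to rewrite its own config file
    $REDIS_CLI CONFIG SET maxmemory-policy "$policy" &&
        $REDIS_CLI CONFIG REWRITE
}
```

Check the printed replies rather than relying only on exit status, then confirm with `CONFIG GET maxmemory-policy`.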

Add TTLs to keys

If you use a volatile-* policy, ensure every key that should be eligible for eviction has a TTL. Set TTLs at creation time or run a backfill script. Once TTLs are present, volatile-lru or volatile-ttl will evict keys instead of rejecting writes.

Tradeoff: Requires application changes. Keys without TTLs remain ineligible, so any omitted keys still trigger OOM if they grow unbounded.
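A backfill could look roughly like the sketch below, which scans for keys matching a pattern and sets a TTL only on keys that have none. The function name, the `REDIS_CLI` override variable, and the example pattern are hypothetical. The sketch assumes key names contain no newlines, and a key can change between the TTL check and the EXPIRE call.

```shell
# Hypothetical backfill: add a TTL to matching keys that currently have none.
# REDIS_CLI is an assumed override so connection flags can be supplied.
REDIS_CLI=${REDIS_CLI:-redis-cli}

backfill_ttl() {
    pattern=$1
    seconds=$2
    $REDIS_CLI --scan --pattern "$pattern" | while IFS= read -r key; do
        # TTL replies -1 for a key that exists but has no expiry
        if [ "$($REDIS_CLI TTL "$key" </dev/null)" = "-1" ]; then
            $REDIS_CLI EXPIRE "$key" "$seconds" </dev/null >/dev/null
            echo "set ttl on $key"
        fi
    done
}
```

For example, `backfill_ttl 'cache:*' 3600` would give every TTL-less key under the assumed `cache:` prefix a one-hour expiry. Try it on a small pattern first.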

Increase maxmemory or shard the dataset

If the dataset is expected to grow and no data can be lost, raise maxmemory with CONFIG SET maxmemory <bytes>. Ensure the host has enough physical RAM for the new limit plus headroom for the operating system, client buffers, and copy-on-write overhead if persistence is enabled. If the dataset exceeds what a single node can hold, shard across multiple Redis instances or cluster nodes.

Tradeoff: Larger instances cost more. Sharding adds operational complexity and client-side routing requirements.

Reduce memory overhead

Find clients with oversized output buffers in CLIENT LIST and disconnect them with CLIENT KILL. WARNING: CLIENT KILL drops connections and will interrupt affected applications or subscribers, so audit targets before killing. Stop forgotten MONITOR sessions, which inflate output buffers. Run MEMORY PURGE to ask jemalloc to release dirty pages. If fragmentation is chronic, enable active defragmentation with CONFIG SET activedefrag yes (Redis 4.0+) and persist the setting in redis.conf.

Tradeoff: Active defrag consumes main-thread CPU and can add latency.

Adjust ElastiCache reserved memory

On AWS ElastiCache, if reserved-memory-percent is consuming too much of the node, move to a larger node type or lower the reservation if your workload allows. Clusters created before March 16, 2017 default to reserved-memory=0, which leaves no margin for failover overhead; newer clusters default to a reserved-memory-percent of 25%.

Tradeoff: Lowering reserved memory increases failover risk. Changing the parameter group typically requires a reboot.

Prevention

Set maxmemory explicitly. A value of 0 means no limit, so unchecked growth ends with the OS OOM killer. Every production instance needs a hard ceiling.
Monitor errorstat_OOM or total_error_replies, not just evicted_keys. Relying on eviction counters alone hides noeviction failures.
Match policy to data model. Use allkeys-* for pure caches. Use volatile-* only if you rigorously set TTLs. Use noeviction only when every write must succeed and you have proactive capacity alerts.
Size for overhead. Keep used_memory well below physical RAM. Cache-only nodes should stay below roughly 75% of physical RAM; persistent nodes should stay below roughly 50% to survive fork copy-on-write.
Audit client buffers and big keys. Run CLIENT LIST periodically and use redis-cli --bigkeys or MEMORY USAGE sampling to find outliers before they push you over the limit.
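The sizing guidance above reduces to simple arithmetic. The sketch below turns the rough 75% and 50% figures into a suggested maxmemory value in bytes. The function name and the `cache`/`persistent` labels are hypothetical, and the percentages are the article's rules of thumb, not hard limits.

```shell
# Hypothetical helper: suggest a maxmemory ceiling in bytes.
# $1 = physical RAM in bytes, $2 = "cache" or "persistent"
suggest_maxmemory() {
    ram_bytes=$1
    case "$2" in
        cache)      pct=75 ;;   # cache-only node
        persistent) pct=50 ;;   # leaves room for fork copy-on-write
        *) echo "usage: suggest_maxmemory <ram_bytes> cache|persistent" >&2
           return 1 ;;
    esac
    echo $(( $ram_bytes * $pct / 100 ))
}
```

For a 16 GiB host, `suggest_maxmemory 17179869184 cache` prints `12884901888`, which is 12 GiB.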

How Netdata helps

Netdata collects used_memory, maxmemory, evicted_keys, and total_error_replies from INFO on the same timeline. This makes it easy to spot write rejections while eviction stays flat.
It tracks mem_fragmentation_ratio and used_memory_rss alongside logical memory usage, distinguishing dataset overflow from overhead bloat.
Per-command rejected_calls from INFO commandstats are collected automatically, showing exactly which write operations are rejected.
Alerts on memory usage ratio and error reply rates surface the failure mode before application errors escalate.
