ClickHouse background pool saturation: when merges and mutations starve
Insert latency is climbing and DelayedInserts is ticking up. system.merges shows every slot occupied, yet system.parts keeps growing. When the background merge and mutation pool saturates, new merges queue instead of starting, parts accumulate, and the distance to insert rejections shrinks fast. Distinguish true thread starvation from I/O-bound stalls, identify when mutations are the culprit, and relieve pressure before inserts fail.
What this means
ClickHouse consolidates immutable parts using a dedicated thread pool tracked in system.metrics as BackgroundMergesAndMutationsPoolTask. Concurrency is controlled by background_merges_mutations_concurrency_ratio (default 2); the older background_pool_size is deprecated. Merges and mutations share this pool. Separate pools handle fetches, moves, and distributed sends.
Pool saturation means every thread is occupied. The effects are:
- New merges queue.
- Merge completion rate drops below part creation rate.
- Part count per partition rises.
- Inserts delay once
parts_to_delay_insertis crossed, then reject atparts_to_throw_insert.
Causes include too much work (mutations, many small inserts), too little concurrency, or threads stuck on disk I/O. The symptoms look identical in pool metrics, but the fixes differ.
flowchart TD
A[Part count growing] --> B{Pool saturated?}
B -->|Yes| C{Mutations running?}
B -->|No| D[Check insert batch size and partition key]
C -->|Yes| E[Kill or wait for mutations]
C -->|No| F{I/O await high?}
F -->|Yes| G[Reduce competing I/O or add capacity]
F -->|No| H[Increase concurrency ratio or throttle inserts]Common causes
| Cause | What it looks like | First thing to check |
|---|---|---|
| Mutation backlog | system.merges shows is_mutation = 1 occupying slots; system.mutations has is_done = 0 with slow or flat parts_to_do | SELECT count(*) FROM system.mutations WHERE is_done = 0 |
| Insert rate exceeds merge throughput | Steady part-count growth, pool at 100%, no mutations, merges running but not fast enough | SELECT event, value FROM system.events WHERE event IN ('InsertQuery','InsertedRows') |
| I/O saturation masquerading as thread starvation | Pool threads are active but merge progress is frozen or crawling; disk await is high | iostat -xz 1 5 and SELECT progress, elapsed FROM system.merges |
| Single large merge hogging slots | One entry in system.merges with very large num_parts, elapsed over an hour, consuming most memory | SELECT elapsed, num_parts, memory_usage FROM system.merges ORDER BY elapsed DESC |
| Many small inserts creating unmergeable overhead | High InsertQuery count but low InsertedRows per query; part count spikes faster than usual | Rows per insert from system.events deltas |
Quick checks
-- Check background pool utilization
SELECT metric, value
FROM system.metrics
WHERE metric LIKE 'Background%Pool%'
ORDER BY metric;
-- Check active merges and whether they are mutations
SELECT
database,
table,
elapsed,
progress,
num_parts,
is_mutation,
formatReadableSize(memory_usage) AS mem
FROM system.merges
ORDER BY elapsed DESC;
-- Check for incomplete mutations
SELECT
database,
table,
mutation_id,
parts_to_do,
is_done,
latest_fail_reason
FROM system.mutations
WHERE is_done = 0
ORDER BY create_time;
-- Check active parts per partition (worst offenders)
SELECT
database,
table,
partition_id,
count(*) AS parts
FROM system.parts
WHERE active = 1
GROUP BY database, table, partition_id
ORDER BY parts DESC
LIMIT 10;
-- Check insert pressure and early warnings
SELECT event, value
FROM system.events
WHERE event IN ('DelayedInserts', 'RejectedInserts', 'InsertQuery', 'InsertedRows');
# Check disk I/O latency during the incident
iostat -xz 1 5
How to diagnose it
- Confirm pool saturation. Query
system.metricsforBackgroundMergesAndMutationsPoolTaskandBackgroundMergesAndMutationsPoolSize. A sustained ratio at or near 1.0 for more than 10 minutes during active inserts means saturation. - See what occupies the slots. Query
system.merges. Ifis_mutation = 1, mutations are competing directly with merges for threads. - Check mutation backlog. Query
system.mutationsforis_done = 0. Ifparts_to_dois flat or failing to decrease, a mutation is stuck or very large. - Correlate with part-count growth. Query
system.partsgrouped bypartition_id. If the max per partition rises while the pool is full, merge throughput is the bottleneck. - Distinguish I/O-bound from CPU/thread-bound stalls. In
system.merges, sampleprogresstwice over 60 seconds. If it barely moves andiostatshows highawait, the threads are waiting on disk, not compute. - Check for insert-pattern changes. Compare
InsertQuerytoInsertedRowsinsystem.events. A sudden jump in query count with flat row count means smaller batches, which creates more parts per row and overwhelms the pool.
Metrics and signals to monitor
| Signal | Why it matters | Warning sign |
|---|---|---|
Background pool utilization (BackgroundMergesAndMutationsPoolTask / PoolSize) | Headroom for merge and mutation work | Sustained > 0.9 for > 10 minutes |
| Active parts per partition | Distance to insert throttling and rejection | Within 50% of configured parts_to_delay_insert |
| DelayedInserts | Earliest warning that merges are falling behind | Counter increasing over successive samples |
| Merge progress velocity | Whether active merges are completing or stuck | progress flat for > 5 minutes |
| Disk I/O await | Distinguishes thread starvation from I/O wait | > 20 ms on SSD, or trending sharply up |
Mutation parts_to_do | Mutation backlog consuming pool capacity | Not decreasing over a 30-minute window |
Fixes
Kill or pause heavy mutations
If system.merges shows is_mutation = 1 and system.mutations has stalled work, cancel non-critical mutations to free pool capacity.
-- Identify the mutation
SELECT database, table, mutation_id
FROM system.mutations
WHERE is_done = 0
ORDER BY create_time
LIMIT 1;
-- Kill it. Destructive: partial work is discarded.
KILL MUTATION WHERE database = 'db' AND table = 'table' AND mutation_id = '0000000000';
Tradeoff: Killed mutations must be reissued later. On large tables, even valid mutations may take hours; schedule them during low-traffic windows.
Throttle or batch inserts
If the pool is saturated because inserts arrive faster than merges can consolidate, reduce the part creation rate. Increase batch size so each INSERT creates fewer parts, or temporarily slow the ingest pipeline.
Tradeoff: Upstream systems may need to buffer data or tolerate higher latency.
Increase merge concurrency
If the node has idle CPU and disk headroom, raise background_merges_mutations_concurrency_ratio above its default of 2. This increases the number of concurrent merge and mutation tasks. Configuration changes may not affect work already queued.
Tradeoff: More concurrent merges increase I/O and CPU consumption. Do not raise this if disk await is already elevated.
Address I/O saturation
If iostat shows high await and merge progress is frozen, the pool is not starved for threads; it is starved for disk. Reduce competing load:
- Temporarily route heavy analytical queries away from the node.
- Pause backups or large exports touching the same volumes.
- If you use tiered storage, verify that hot-tier disk bandwidth is not the bottleneck.
Tradeoff: Reduced query capacity or added infrastructure cost.
Reclaim space to unblock merges
If disk space is tight, merges cannot allocate temporary output files and will stall. See ClickHouse disk space collapse: why merges need free space and how the spiral starts for emergency space recovery.
Prevention
- Monitor the merge-to-insert ratio. Track part creation rate against merge completion rate. If the ratio stays above 1.0 for more than a few minutes, the pool is losing ground.
- Avoid mutation storms. Do not issue
ALTER UPDATE/DELETEduring peak ingest. Mutations are full part rewrites, not row updates, and they monopolize pool threads. See ClickHouse ALTER UPDATE/DELETE overuse: why mutations are not row updates. - Batch inserts. Target thousands of rows per INSERT. Many small inserts create parts faster than any pool configuration can merge.
- Size pools to I/O capacity, not just core count. A concurrency ratio of 2 is a starting point. If your storage cannot sustain random I/O from more concurrent merges, raising the ratio will only turn thread starvation into I/O saturation.
How Netdata helps
- Correlate
BackgroundMergesAndMutationsPoolTaskwith active part count andDelayedInsertson the same timeline to spot saturation. - Alert when pool utilization crosses 90% alongside positive part-count growth, catching the bottleneck before inserts delay.
- Surface disk I/O latency next to merge activity to distinguish thread starvation from I/O waits without manual
iostat. - Track mutation backlog and merge duration trends without polling
system.mutations. - Visualize insert latency degradation as an early warning, often hours before
RejectedInsertsappears.
Related guides
- ClickHouse active part count growing: reading MaxPartCountForPartition before it pages
- ClickHouse ALTER UPDATE/DELETE overuse: why mutations are not row updates
- ClickHouse async inserts: when async_insert fixes too-many-parts and when it hides it
- ClickHouse mark cache and uncompressed cache: reading low hit rates
- ClickHouse DelayedInserts climbing: the warning before too-many-parts
- ClickHouse detached parts piling up: reading system.detached_parts and reclaiming space
- ClickHouse disk space collapse: why merges need free space and how the spiral starts
- ClickHouse disk space monitoring: free_space, unreserved_space, and the 80% target
- ClickHouse distributed DDL stuck: ON CLUSTER queries that never finish
- ClickHouse distributed query amplification: one coordinator, many shard subqueries
- ClickHouse full table scan: partition pruning failures and the primary key
- ClickHouse insert latency rising: the leading indicator of write-pipeline trouble







