ClickHouse disk space collapse: why merges need free space and how the spiral starts
Disk usage climbs from 75% to 85% over a week, then hits 90%. A few hours later ClickHouse rejects inserts, part counts spike, and the volume is at 98%. The system did not simply run out of space. It entered a self-reinforcing spiral where background merges, the mechanism that reclaims space, stalled because they needed free space to work.
This guide covers the mechanics, early signals, and recovery steps.
What this means
ClickHouse stores data in immutable parts. Every INSERT creates one or more parts on disk. Background merges combine smaller parts into larger ones to maintain query performance and keep file counts low. A merge reads all source parts and writes a new merged part before deleting the sources, so it temporarily needs space for both the old and new data.
This write-before-delete behavior means merges need roughly 2x the size of the largest active part as temporary free space. Once disk passes 85-90% full, ClickHouse often cannot reserve enough space for the merged output of its larger parts, and those merges stall. New inserts keep creating small parts that cannot be consolidated. Each unmerged part carries metadata overhead and multiple files per column, so disk fills faster than raw data volume suggests. TTL deletes also stop working because TTL cleanup runs during merges, so old data that should expire persists and accelerates consumption. The result is a spiral: less free space blocks merges, blocked merges cause part accumulation, and accumulation consumes the remaining space faster.
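As a rough worked example of the headroom rule above, the following Python sketch checks whether free space covers the 2x margin. The function name, the `factor` parameter, and the sample volume are illustrative assumptions; the actual reservation logic lives inside the ClickHouse server and is not reproduced here.

```python
GIB = 1024 ** 3

def merge_fits(largest_part_bytes: int, unreserved_bytes: int, factor: float = 2.0) -> bool:
    """Return True if unreserved space covers `factor` times the largest active part.

    `factor` defaults to the rough 2x rule of thumb from the text (an assumption,
    not a ClickHouse setting).
    """
    return unreserved_bytes >= factor * largest_part_bytes

# Hypothetical volume: 2 TiB total, largest active part 120 GiB (needs ~240 GiB).
total = 2048 * GIB
largest = 120 * GIB

for used_fraction in (0.85, 0.90):
    unreserved = int(total * (1 - used_fraction))
    print(f"{used_fraction:.0%} used -> merge fits: {merge_fits(largest, unreserved)}")
```

With these sample numbers the margin is still met at 85% used (about 307 GiB free) and lost at 90% used (about 205 GiB free), which is why the cliff tends to appear in that range.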
flowchart TD
A[Disk passes 85% full] -->|blocks| B[Merges cannot reserve output space]
B -->|halts| C[Background merges stall]
C -->|allows| D[Small parts accumulate]
D -->|adds| E[Metadata and file overhead]
E -->|stops| F[TTL cleanup]
F -->|retains| G[Old data persists]
G -->|drives| H[Inserts delayed then rejected]
ClickHouse does not handle disk-full gracefully. Recovery usually requires manual intervention: detaching partitions, adding capacity, or deleting data.
Common causes
| Cause | What it looks like | First thing to check |
|---|---|---|
| Under-provisioned disk relative to ingest rate | Steady disk growth; merges declining as fullness increases | Daily ingest rate versus TTL or drop rate; system.disks.free_space trend over 24 hours |
| TTL deletes not executing because merges are stalled | Disk growing despite configured TTL; old parts remain in system.parts | Whether merges are running in system.merges and whether old partitions are still active |
| Large merge attempting to write a temporary result | Sudden disk spike during what should be routine compaction | Size of the largest active part in system.parts versus unreserved_space |
| Unlimited retention with growing data | Linear disk growth with no deletion policy | Retention settings and the age distribution of partitions |
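The first row's comparison of ingest rate against TTL or drop rate can be turned into a rough runway estimate. This is a minimal sketch that assumes steady daily rates; the function name, the 85% threshold default, and the sample figures are hypothetical and should be replaced with values observed on your own cluster.

```python
from typing import Optional

GIB = 1024 ** 3

def days_until_threshold(
    used_bytes: float,
    total_bytes: float,
    ingest_per_day: float,
    expired_per_day: float,
    threshold: float = 0.85,
) -> Optional[float]:
    """Estimate days until disk usage reaches `threshold` of total capacity.

    Returns 0.0 if usage is already at or above the threshold, and None if
    data expires at least as fast as it arrives (no net growth).
    """
    limit = threshold * total_bytes
    if used_bytes >= limit:
        return 0.0
    net_growth = ingest_per_day - expired_per_day
    if net_growth <= 0:
        return None
    return (limit - used_bytes) / net_growth

# Hypothetical disk: 2000 GiB total, 1500 GiB used (75%),
# 40 GiB/day ingested on disk, 15 GiB/day removed by TTL or drops.
runway = days_until_threshold(1500 * GIB, 2000 * GIB, 40 * GIB, 15 * GIB)
print(f"about {runway:.1f} days until 85% used")
```

If TTL stops executing, `expired_per_day` drops toward zero and the runway shrinks accordingly, so it is worth recomputing with the expiry rate actually observed rather than the configured one.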
Quick checks
Run these read-only checks to assess how close you are to the merge cliff and whether the spiral has started.
-- Check disk space and ClickHouse reservations
SELECT
name,
path,
formatReadableSize(free_space) AS free,
formatReadableSize(total_space) AS total,
round(100 * (1 - free_space / total_space), 1) AS used_pct,
formatReadableSize(unreserved_space) AS unreserved,
formatReadableSize(keep_free_space) AS keep_free
FROM system.disks;
-- Check active merges and progress
SELECT
database,
table,
elapsed,
progress,
num_parts,
is_mutation,
formatReadableSize(total_size_bytes_compressed) AS total_size
FROM system.merges
ORDER BY elapsed DESC;
-- Count active parts per partition to find accumulation hotspots
SELECT
database,
table,
partition_id,
count(*) AS active_parts,
formatReadableSize(sum(bytes_on_disk)) AS size
FROM system.parts
WHERE active = 1
GROUP BY database, table, partition_id
ORDER BY active_parts DESC
LIMIT 20;
-- Check for insert throttling and hard rejections
SELECT event, value
FROM system.events
WHERE event IN ('DelayedInserts', 'RejectedInserts');
-- Identify largest tables by on-disk footprint
SELECT
database,
table,
formatReadableSize(sum(bytes_on_disk)) AS disk_size
FROM system.parts
WHERE active = 1
GROUP BY database, table
ORDER BY sum(bytes_on_disk) DESC
LIMIT 10;
# OS-level disk check on the data path
df -h /var/lib/clickhouse
How to diagnose it
- Confirm actual usable headroom. Query system.disks and compare used_pct to unreserved_space. If unreserved_space is near zero, ClickHouse already considers the disk exhausted for its own operations, even if the OS reports a few percent remaining.
- Verify merge activity. Query system.merges. On a system with active inserts, zero running merges while part counts are elevated means merges are stuck. Sample progress twice over a 60-second window; if it does not advance, the merge is blocked.
- Correlate part growth with disk trend. Query system.parts grouped by partition_id. If active parts are growing while disk is above 85%, merges are likely blocked by insufficient space.
- Check for insert backpressure. Query system.events for DelayedInserts and RejectedInserts. Sustained increases while disk is full confirm the spiral is active and the write pipeline is choking.
- Size the largest active part against free space. Query system.parts for the largest bytes_on_disk among active parts. If free space is less than roughly 2x that size, a merge targeting that part cannot complete.
- Validate TTL execution. If TTL is configured but old parts remain in system.parts, merges are too stalled to enforce expiration. Retention policy is not helping you reclaim space.
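The "sample progress twice" step can be sketched as a small comparison of two snapshots. This assumes you have collected the progress column from system.merges into dictionaries keyed by a merge identifier such as the result part name; the function name, the `min_delta` tolerance, and the sample part names are hypothetical.

```python
from typing import Dict, List

def stalled_merges(
    first: Dict[str, float],
    second: Dict[str, float],
    min_delta: float = 0.001,
) -> List[str]:
    """Return merges present in both samples whose progress barely moved.

    `first` and `second` map a merge identifier to its progress (0.0 to 1.0),
    taken roughly 60 seconds apart. Merges that finished or started between
    the samples appear in only one of them and are ignored.
    """
    return sorted(
        name
        for name, progress in first.items()
        if name in second and second[name] - progress < min_delta
    )

# Hypothetical samples taken 60 seconds apart.
sample_t0 = {"all_1_50_3": 0.42, "all_51_60_1": 0.10}
sample_t60 = {"all_1_50_3": 0.42, "all_51_60_1": 0.35}

print(stalled_merges(sample_t0, sample_t60))  # ['all_1_50_3']
```

A single flat interval is not conclusive for a very large merge, so it is reasonable to repeat the comparison over a few windows before treating a merge as blocked.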
Metrics and signals to monitor
| Signal | Why it matters | Warning sign |
|---|---|---|
| Disk used percentage (system.disks) | Proximity to the merge cliff | > 80-85% used |
| unreserved_space vs free_space | Actual headroom after ClickHouse reservations | unreserved_space approaching zero |
| Active parts per partition (system.parts) | Accumulation when merges stall | Growing steadily while disk > 85% |
| Merge activity (system.merges) | Whether compaction is happening | Zero merges for > 10 minutes with active inserts |
| DelayedInserts / RejectedInserts | Write pipeline backpressure | Any sustained increase |
| Largest active part size | Determines minimum merge headroom | Free space < 2x largest active part |
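The DelayedInserts / RejectedInserts row is phrased as an increase because system.events counters are cumulative since server start, so a single reading says little on its own. A minimal sketch of turning two readings into a rate follows; the function name and the sample values are hypothetical.

```python
def counter_rate_per_min(prev: int, curr: int, interval_s: float) -> float:
    """Per-minute increase of a cumulative counter between two samples."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    # A smaller current value suggests the server restarted and the counter
    # reset, so count everything since the reset as the increase.
    delta = curr - prev if curr >= prev else curr
    return delta * 60.0 / interval_s

# Hypothetical RejectedInserts readings taken 120 seconds apart.
print(counter_rate_per_min(1200, 1260, 120))  # 30.0
```

A rate that stays above zero across consecutive samples is the "sustained increase" the table refers to; a large absolute value with a zero rate usually reflects an older incident.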
Fixes
Immediate: reclaim space without dropping active data
Detach old partitions. This is the fastest way to take old data out of service. Detaching moves a partition's parts out of the active set and into the table's detached/ directory, so queries no longer see them. The files stay on the same disk, so the space is not available to merges until you move the detached directories to another volume or delete them.
-- Replace with the target partition expression for your table
ALTER TABLE database.table DETACH PARTITION 'partition-id';
Warning: The detached data is no longer queryable. Removing the files from the detached/ directory reclaims the space permanently; keep a copy elsewhere if you may need to reattach them later.
Verify TTL can execute after space is freed. Once merges resume, monitor system.merges to confirm background activity is running, then check system.parts to see that expired partitions are being removed according to your TTL rules.
If a single large merge is blocked
Sometimes one massive merge is attempting to combine large parts and cannot finish due to insufficient temporary space. Check system.merges for the largest total_size_bytes_compressed. If this merge is stuck, freeing additional space or temporarily reducing insert rate may allow it to complete and break the jam.
Add capacity or tier storage
If disk is chronically under-provisioned, add storage or implement tiered storage to move older partitions to slower volumes. In tiered storage setups, remember to check all volumes. Data may reside on remote object storage while the local cache disk continues to fill and block local operations.
Reduce ingestion temporarily
Throttle or pause upstream inserts to stop creating new parts. This gives existing merges runway to complete, consolidate small parts, and reclaim space. This is a temporary bridge while you add capacity or detach old data.
Prevention
Maintain merge headroom. Keep disk usage below 80-85%. The safety margin must be at least 2x the size of the largest active part to allow a full merge to complete. In practice, keep free space above 3x the largest active part.
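The two rules above (a usage ceiling and a multiple of the largest part) can be combined into one sizing check. The sketch below is a back-of-the-envelope calculation, not a ClickHouse feature; the function name, defaults, and sample sizes are assumptions chosen to mirror the 80% and 3x figures in the text.

```python
GIB = 1024 ** 3

def min_disk_size(
    data_bytes: float,
    largest_part_bytes: float,
    max_used_fraction: float = 0.80,
    headroom_factor: float = 3.0,
) -> float:
    """Smallest disk size that satisfies both headroom rules.

    Rule 1: stored data stays at or below `max_used_fraction` of the disk.
    Rule 2: free space is at least `headroom_factor` times the largest part.
    """
    by_usage_ceiling = data_bytes / max_used_fraction
    by_part_headroom = data_bytes + headroom_factor * largest_part_bytes
    return max(by_usage_ceiling, by_part_headroom)

# Hypothetical table set: 1600 GiB stored, largest active part 150 GiB.
needed = min_disk_size(1600 * GIB, 150 * GIB)
print(f"provision at least {needed / GIB:.0f} GiB")
```

Here the 80% ceiling alone would suggest 2000 GiB, but the 3x rule asks for 450 GiB free and pushes the target to 2050 GiB, which shows why percentage thresholds are not sufficient when individual parts are large.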
Monitor the merge-to-insert ratio. If parts are created faster than merges complete, debt accumulates even before disk is full. Track part count growth from system.parts against merge throughput from system.merges.
Ensure TTL policies have room to execute. TTL deletes happen during merges. If disk is always near full, TTL cannot run and old data accumulates, which paradoxically makes the disk fill faster.
Size storage for temporary duplication. Capacity planning must account for merge artifacts, mutations, and replication fetches, all of which temporarily duplicate data. Plan for more than the raw data size.
Watch unreserved_space, not just OS free space. The keep_free_space_bytes setting reduces what ClickHouse will use. unreserved_space is the real operational limit and can reach zero before the OS reports 100% utilization.
How Netdata helps
- Correlates disk utilization on ClickHouse data volumes with RejectedInserts and DelayedInserts rates, exposing the spiral before inserts fail.
- Tracks active part count per partition alongside merge activity, revealing stalled merges while parts accumulate.
- Alerts on system.disks metrics, including unreserved_space approaching zero. This precedes OS-level disk-full conditions because ClickHouse stops merges before the OS reports 100% utilization.
- Surfaces ClickHouse-specific signals alongside OS disk I/O latency and throughput to distinguish merge starvation from query I/O saturation.
Related guides
- ClickHouse active part count growing: reading MaxPartCountForPartition before it pages
- ClickHouse ALTER UPDATE/DELETE overuse: why mutations are not row updates
- ClickHouse async inserts: when async_insert fixes too-many-parts and when it hides it
- ClickHouse DelayedInserts climbing: the warning before too-many-parts
- ClickHouse distributed DDL stuck: ON CLUSTER queries that never finish
- ClickHouse insert latency rising: the leading indicator of write-pipeline trouble
- ClickHouse cannot connect to ZooKeeper/Keeper: diagnosing the coordination layer
- ClickHouse Keeper latency high: the early warning before sessions expire
- ClickHouse Keeper saturation spiral: too many tables, DDL storms, and cluster freeze
- ClickHouse Memory limit (for query) exceeded: per-query limits and GROUP BY/JOIN blowups
- ClickHouse Memory limit (total) exceeded - server-wide memory pressure and fixes
- ClickHouse memory pressure death spiral: runaway queries, retries, and OOM







