$ guides / clickhouse / clickhouse-ttl-not-deleting-data ▌

Operations Guides

ClickHouse TTL not deleting data: why expired rows survive and the disk fills

A MergeTree table with a configured TTL can still accumulate disk usage when expired rows remain visible to SELECT. This is not a syntax or timezone error. ClickHouse enforces TTL only as a side effect of background merges. When merges stop or never start, TTL stops, and storage grows without warning.

What this means

ClickHouse evaluates TTL expressions only during merge operations. When the merge scheduler combines parts, it checks whether rows exceeded their TTL interval. Depending on table settings, it either drops the entire part if every row expired, or rewrites the part to exclude expired rows. Both paths require a merge. If the background merge pool is saturated, disk space is too low for temporary merge output, or a partition has no new inserts to trigger merge selection, expired rows remain indefinitely. The system does not log a warning for TTL skips; the only visible symptom is growing storage.

flowchart TD
    A[TTL rule expires rows] --> B{Merge scheduled?}
    B -->|No new inserts| C[Partition cold merges idle]
    B -->|Mutations or disk full| D[Merge starved]
    C --> E[Expired rows persist]
    D --> E
    E --> F[Disk fills silently]

Common causes

Cause	What it looks like	First thing to check
Merge starvation from part explosion	`system.merges` is empty or sluggish, `DelayedInserts` rises, and part counts climb	`SELECT count() FROM system.merges` and per-partition part counts
Mutations monopolizing the merge pool	`system.mutations` shows `is_done = 0` and `parts_to_do` is flat	`SELECT database, table, parts_to_do FROM system.mutations WHERE is_done = 0`
Disk too full for temporary merge output	`system.disks.unreserved_space` near zero and no merges run	`SELECT name, unreserved_space FROM system.disks`
Cold partition with no insert traffic	Old partition has many parts, no recent inserts, and TTL is never evaluated	`SELECT partition_id, count() FROM system.parts WHERE active = 1 GROUP BY partition_id`
`ttl_only_drop_parts` enabled on mixed data	Parts contain both expired and live rows, so whole-part drop is blocked	`SHOW CREATE TABLE` to verify the setting
TTL altered after data ingestion	Historical parts keep the old TTL rule; the new rule applies only to new merges	Check whether the TTL expression was added after initial data load

Quick checks

-- Active merges and mutations
SELECT count() AS active_merges,
       countIf(is_mutation = 1) AS active_mutations
FROM system.merges;

-- Mutation backlog blocking merges
SELECT database, table, mutation_id, parts_to_do
FROM system.mutations
WHERE is_done = 0
ORDER BY parts_to_do DESC;

-- Disk headroom for merge output
SELECT name,
       formatReadableSize(free_space) AS free,
       formatReadableSize(unreserved_space) AS unreserved
FROM system.disks;

-- Part counts by partition
SELECT database, table, partition_id,
       count() AS active_parts,
       formatReadableSize(sum(bytes_on_disk)) AS size
FROM system.parts
WHERE active = 1
GROUP BY database, table, partition_id
ORDER BY active_parts DESC
LIMIT 20;

-- Throttled or rejected inserts
SELECT event, value
FROM system.events
WHERE event IN ('DelayedInserts', 'RejectedInserts');

-- Worst-case part concentration
SELECT value AS max_parts_per_partition
FROM system.asynchronous_metrics
WHERE metric = 'MaxPartCountForPartition';

# Recent merge and TTL entries in server logs
grep -Ei 'ttl|merge' /var/log/clickhouse-server/*.log | tail -50

How to diagnose it

Verify the TTL expression. Run SHOW CREATE TABLE db.table. Confirm the TTL clause references the correct column and interval. A typo means rows never expire.
Confirm merges are executing. Query system.merges. An empty result while part counts are high indicates merge starvation. If only is_mutation = 1 entries appear, mutations have crowded out regular merges.
Check for stalled mutations. Query system.mutations WHERE is_done = 0. If parts_to_do is not decreasing, the mutation is stuck and monopolizing a merge thread.
Inspect disk headroom. Query system.disks. If unreserved_space is near zero, ClickHouse halts merges to avoid running out of room for temporary output. Without merges, TTL cannot run.
Identify cold partitions. Query system.parts grouped by partition_id. If old partitions have many active parts but receive no new inserts, the merge selector has no reason to schedule work there. Expired data persists until something triggers a merge.
Review ttl_only_drop_parts. If enabled, ClickHouse drops an entire part only when every row inside it expired. Parts with mixed expired and live rows are left untouched. Verify whether your retention pattern matches whole-part expiry.
Check for post-load TTL changes. If the TTL rule was added or modified after ingestion, existing parts may keep the old expiration logic. They adopt the new rule only after a merge rewrites them.
Measure background pool saturation. Query system.metrics for BackgroundMergesAndMutationsPoolTask. If active tasks are near the pool size, merges are queued but not executing, which delays TTL.

Metrics and signals to monitor

Signal	Why it matters	Warning sign
Active parts per partition	High part counts signal merge backlog, which blocks TTL	Any partition with parts steadily growing toward the insert-delay threshold
Merge activity (`system.merges`)	Direct proof that TTL has an opportunity to run	Zero regular merges for more than 10 minutes while parts remain above baseline
Mutation queue (`system.mutations`)	Mutations consume the same pool as merges	`is_done = 0` with `parts_to_do` flat for over 30 minutes
Disk `unreserved_space`	Merges need temporary space	`unreserved_space` below twice the size of the largest active part
Background pool utilization	Saturated pools delay merge scheduling	`BackgroundMergesAndMutationsPoolTask` near the pool size for over 15 minutes
`DelayedInserts` / `RejectedInserts`	Part pressure that precedes merge collapse	`RejectedInserts` increasing, or `DelayedInserts` trending up
`MaxPartCountForPartition`	Server-wide worst-case concentration	Value climbing steadily over a 30-minute window

Fixes

Clear mutations blocking the merge pool

If system.mutations shows stuck mutations, kill non-critical ones to free merge threads.

KILL MUTATION WHERE database = 'db' AND table = 'table' AND mutation_id = '0000000000';

Tradeoff: Killed mutations leave partially rewritten parts in place. If you still need the change, reissue the mutation after recovery.

Reclaim disk space to resume merges

When unreserved_space nears zero, merges halt. Detach old or unneeded partitions to reclaim space immediately, or add capacity.

ALTER TABLE db.table DETACH PARTITION '2023-01-01';

Tradeoff: Detached partitions are invisible to queries until reattached. Use only for data you can temporarily exclude.

Trigger activity on cold partitions

If a partition has no new inserts, the merge scheduler may ignore it. Insert a small, valid batch into the partition to give the merge selector a reason to schedule work and evaluate TTL.

Tradeoff: This generates a new part. Ensure the system has enough merge capacity to handle it without worsening part count pressure.

Adjust `ttl_only_drop_parts` to match retention needs

If enabled and parts contain interleaved expired and live rows, ClickHouse cannot drop those parts. Disable it only if you need row-level TTL precision, but expect higher I/O from rewrite merges.

Tradeoff: Row-level TTL rewrites parts on disk, consuming CPU and I/O. Whole-part drops are cheap but require partitioning and TTL alignment.

Re-materialize after TTL rule changes

If the TTL was modified after ingestion, plan a window for background merges to rewrite historical parts, or accept that existing data expires under the old rule. Do not repeatedly ALTER TTL; each change queues new background work.

Prevention

Monitor part count per partition, not just per table. Merge and TTL crises happen at the partition level.
Keep disk usage below 80% and ensure unreserved_space stays above twice the size of the largest active part so merges have room to write output.
Avoid heavy ALTER UPDATE/DELETE during peak hours. Mutations starve merges and freeze TTL.
Batch inserts to keep part creation rate low. High part counts raise merge cost and delay TTL.
Align partition granularity with TTL intervals. If you TTL by month but partition by day, ttl_only_drop_parts may rarely fire because each part spans a shorter interval than the TTL.

How Netdata helps

Correlates MaxPartCountForPartition with OS-level disk I/O latency to expose merge starvation before TTL stalls.
Tracks MemoryTracking and system memory to identify memory pressure that pushes background merges into swap or cancellation.
Surfaces disk space from system.disks alongside OS volume metrics, highlighting when unreserved_space drops below safe merge headroom.
Charts insert latency and RejectedInserts spikes so you catch the merge debt spiral before parts accumulate enough to block TTL.
Distinguishes coordination problems (ZooKeeper/Keeper latency) from local merge problems by cross-referencing replication queue depth with local merge activity.

The Netdata solution

ClickHouse monitoring with Netdata

Netdata monitors ClickHouse with per-second metrics and ML anomaly detection. Track merge debt, memory usage, replication lag, Keeper/ZooKeeper saturation, and disk headroom against the host signals that drive them.

See ClickHouse monitoring → Start monitoring free

ClickHouse TTL not deleting data: why expired rows survive and the disk fills

ClickHouse TTL not deleting data: why expired rows survive and the disk fills

What this means

Common causes

Quick checks

How to diagnose it

Metrics and signals to monitor

Fixes

Clear mutations blocking the merge pool

Reclaim disk space to resume merges

Trigger activity on cold partitions

Adjust ttl_only_drop_parts to match retention needs

Re-materialize after TTL rule changes

Prevention

How Netdata helps

Related guides

ClickHouse monitoring with Netdata

Adjust `ttl_only_drop_parts` to match retention needs