ClickHouse system log tables eating disk: query_log, part_log, and TTL

You are investigating disk growth on a ClickHouse node. User tables look reasonable, but system.query_log or system.part_log are consuming tens or hundreds of gigabytes. These tables use MergeTree-family engines and are subject to the same part-count and merge pressure limits as production tables. Without a TTL rule, data accumulates forever. During incidents with high query error rates or retry storms, query_log can expand fast enough to threaten disk capacity and stall logging.

ClickHouse flushes telemetry in batches to system.query_log, system.part_log, system.trace_log, and system.text_log. Each flush creates a data part. Without TTL, old parts are never dropped. On high-QPS clusters, frequent flushes create many parts, and every failed query is logged, so error spikes accelerate growth. Eventually the system tables become the largest disk consumers. If part counts cross parts_to_delay_insert or parts_to_throw_insert, new log inserts are delayed or rejected.

What this means

System log tables behave exactly like MergeTree tables: inserts create active parts, background merges consolidate them, and part-count limits apply. No TTL means unbounded disk growth.

The risk is not just space. When system.query_log accumulates merge debt, it competes with production tables for background pool threads and disk I/O. If it crosses its part limit, new log entries are rejected. You lose observability exactly when you need it.

Common causes

CauseWhat it looks likeFirst thing to check
No TTL on system log tablessystem.query_log or system.part_log rank among top disk consumers; oldest parts span monthsSELECT ... FROM system.parts WHERE database = 'system' for size and part counts
High-error incidentquery_log disk usage spikes after an outage; error rate correlates with growthFailedQuery in system.events around the incident window
High merge or insert activitypart_log is large on write-heavy clusters with frequent merges or mutationsMerge throughput and part_log event volume
Merge starvation on system tablessystem.query_log part count climbs toward parts_to_throw_insertActive parts per partition for system tables

Quick checks

-- System tables consuming disk
SELECT
    database,
    table,
    formatReadableSize(sum(bytes_on_disk)) AS size,
    count() AS active_parts
FROM system.parts
WHERE active = 1 AND database = 'system'
GROUP BY database, table
ORDER BY size DESC;
-- Per-partition part counts for system tables
SELECT
    table,
    partition_id,
    count() AS parts_in_partition
FROM system.parts
WHERE active = 1 AND database = 'system'
GROUP BY table, partition_id
ORDER BY parts_in_partition DESC
LIMIT 10;
-- Query log volume over the last 24 hours
SELECT
    toStartOfHour(event_time) AS hour,
    count() AS queries,
    countIf(type = 'ExceptionWhileProcessing') AS errors
FROM system.query_log
WHERE event_time > now() - INTERVAL 24 HOUR
GROUP BY hour
ORDER BY hour DESC;
-- Inspect engine definition for TTL and partitioning
SELECT engine_full FROM system.tables WHERE database = 'system' AND name = 'query_log';
-- Insert limits hit on any table
SELECT event, value
FROM system.events
WHERE event IN ('RejectedInserts', 'DelayedInserts');
-- Active merges on system tables
SELECT
    database,
    table,
    elapsed,
    progress
FROM system.merges
WHERE database = 'system';

How to diagnose it

  1. Measure system table footprint. Run the system.parts size query. If system.query_log, system.part_log, trace_log, or text_log exceed your retention budget, they are the primary consumers.
  2. Check part density. High active part counts in system.query_log indicate merge pressure.
  3. Correlate with workload. A spike in ExceptionWhileProcessing or overall query volume explains sudden query_log growth.
  4. Verify TTL. SELECT engine_full FROM system.tables or SHOW CREATE TABLE system.query_log reveals whether a TTL clause exists. Absence confirms unbounded growth.
  5. Assess merge health. Query system.merges for the system database. If merges are scarce while parts grow, the table will hit limits regardless of TTL.

Metrics and signals to monitor

SignalWhy it mattersWarning sign
System table disk size (system.parts.bytes_on_disk)Direct measure of unbounded growthAny system table larger than your largest production table
Active parts in system.query_logPredicts too-many-parts on the log table itselfPart count trending toward your parts_to_delay_insert threshold
Failed query rateFailed queries are still logged; errors accelerate query_log growthSudden spike in ExceptionWhileProcessing entries
system.part_log sizeHigh merge activity generates many part eventsSteady growth on write-heavy clusters
DelayedInserts / RejectedInsertsSystem tables can trigger these limits internallyNonzero deltas with system tables near part limits

Fixes

Configure TTL on system tables

Set a TTL expression on system.query_log, system.part_log, system.trace_log, and system.text_log. Default installations usually partition these tables by toYYYYMMDD(event_date), so a TTL on event_date drops entire partitions efficiently during merges.

For an existing table, apply TTL with ALTER TABLE:

ALTER TABLE system.query_log MODIFY TTL event_date + INTERVAL 7 DAY;

For tables created from server configuration, set the <ttl> element inside the log table’s configuration block:

<query_log>
    <ttl>event_date + INTERVAL 30 DAY</ttl>
</query_log>

Configuration changes do not alter tables that already exist. Apply ALTER TABLE to existing system tables, or drop and recreate the table after changing its configuration.

Reclaim space from an overgrown system table

TTL runs during merges. If a system table already has hundreds of active parts and merges are starved, TTL alone may not reclaim space quickly.

For immediate relief, drop old partitions directly after verifying partition IDs in system.parts:

ALTER TABLE system.query_log DROP PARTITION ID '20240601';

Warning: this permanently deletes data for that partition. Verify the partition ID and backup any critical forensic data first.

If merges are proceeding but too slowly, reduce incoming query volume or error rates to slow part creation while TTL catches up.

Balance retention against forensics

Shorter TTL saves disk and reduces merge debt. Longer TTL aids incident investigation. query_log is usually the most valuable retention target. trace_log and text_log are often the largest relative to their utility, so consider shorter retention for verbose logs.

Ensure time-based partitioning

Default system log tables are partitioned by toYYYYMMDD(event_date). If the engine definition shows a non-time partition key, TTL drops become expensive because they must delete row ranges inside parts instead of dropping whole partitions. Keep time-based partitioning when customizing system log engine definitions.

Prevention

  • Set TTL on all system MergeTree tables before taking production traffic.
  • Include system.query_log and system.part_log in disk capacity planning.
  • Monitor the system database size with the same urgency as production databases.
  • During incident response, expect query_log to expand rapidly and verify it has disk and part-count headroom.

How Netdata helps

  • Correlates ClickHouse disk usage with table-level part counts to surface when system.query_log or system.part_log become top consumers.
  • Tracks RejectedInserts and DelayedInserts to catch system tables approaching part limits.
  • Monitors query error rates to warn that query_log is about to balloon.
  • Visualizes active part counts per table, including system tables, to spot merge debt before it blocks inserts.
  • Alerts on disk space utilization with awareness of merge headroom requirements.