ClickHouse projections and hidden parts: the part count you can’t see

Inserts fail with TOO_MANY_PARTS while system.parts on the base table looks comfortable. If you monitor only the source table, you are missing projection sub-parts inside base directories and independent parts in materialized view target tables. A single insert can spawn parts across projection subdirectories and multiple downstream tables. The number in system.parts is a floor, not a ceiling. The gap between visible and real part count is where merge crises begin.

MergeTree tables enforce per-partition part limits through parts_to_delay_insert and parts_to_throw_insert. The MaxPartCountForPartition metric tracks the worst partition across every MergeTree table on the server. Most operators query system.parts to gauge headroom, but that table counts only base table parts. Projection parts are tracked separately, in system.projection_parts. Materialized view target parts appear in system.parts for those target tables, which operators often overlook. When base, projection, and MV parts accumulate together, a partition can breach its limit while the source table appears healthy.

What it is and why it matters

ClickHouse performance degrades as part count increases. More parts mean more file descriptors, more index lookups, and heavier merge overhead. The standard monitoring query looks like this:

SELECT table, partition, count() AS parts
FROM system.parts
WHERE active AND database = 'mydb'
GROUP BY table, partition
HAVING parts > 10
ORDER BY parts DESC;

This query misses projection sub-parts stored inside base table directories and ignores materialized view target tables entirely. A table with three projections adds roughly 3x the hidden part count to the global merge load. A source table with three materialized views means one insert produces parts across four independent tables. Each has its own per-partition limit, and all compete for background merge threads.

Sizing merge capacity, disk headroom, and alert thresholds on base table part count alone under-provisions for the real workload.

How it works

Projections are materialized secondary structures stored inside the base table’s part directories. Each physical base part contains subdirectories for every projection defined on the table. In system.parts, the projections column lists projection names as an Array(String), but it does not expose the projection part’s size, row count, or activity status. Actual projection parts are tracked in system.projection_parts, joined to the base part through the parent_name column.

Projection parts merge synchronously with their parent base parts. When two base parts merge, their projection sub-parts merge simultaneously. A merge that looks routine in system.merges runs additional merge passes behind the scenes for every projection.

Materialized views are insert triggers, not background pollers. Each insert to the source table fires every attached MV in sequence, producing independent writes to target tables. A single insert touching N MVs generates N additional part-creation events across N target tables. If MVs cascade, the multiplication continues downstream. Unlike projections, MV target tables are independent MergeTree tables with their own merge schedules, TTL rules, and part counts.

flowchart LR
    Insert[INSERT to source table] --> Base[Base table part]
    Insert --> MV1[MV1 target part]
    Insert --> MV2[MV2 target part]
    Base --> Proj1[Projection sub-part]
    Base --> Proj2[Projection sub-part]
    Base --> MaxCount[MaxPartCountForPartition]
    Proj1 --> MaxCount
    MV1 --> MaxCount
    MV2 --> MaxCount
    MaxCount --> Limit[Per-partition limit]
    Limit --> Reject[TOO_MANY_PARTS]

To see projection parts alongside base parts, join system.parts with system.projection_parts:

SELECT
    p.database,
    p.table,
    p.name AS base_part_name,
    p.partition_id,
    pp.name AS projection_part_name,
    pp.active
FROM system.parts AS p
LEFT JOIN system.projection_parts AS pp
    ON p.database = pp.database
    AND p.table = pp.table
    AND p.name = pp.parent_name
WHERE p.database = 'mydb'
  AND p.table = 'mytable'
  AND p.active
ORDER BY p.database, p.table, p.name;

This reveals how many projection sub-parts exist per base part. Because projection parts merge synchronously with the base, heavy projection usage produces longer merge durations and higher background pool utilization than the base part count alone suggests.

Where it shows up in production

Merge death spiral starts earlier than expected. Multiple projections add hidden parts that count toward MaxPartCountForPartition. The base table may show 80 active parts while the true per-partition count is far higher. By the time the base count looks alarming, the real count may already be critical and merges may be falling behind.

Materialized view targets overwhelm shared merge pools. Each MV target table needs independent background merge capacity. Operators often size the merge pool for the source table ingest rate, ignoring the combined load of all targets. A stable source table can coexist with MV targets racing toward their part limits.

Backfills leave projection gaps. After ALTER TABLE ... MATERIALIZE PROJECTION, old base parts lack projection data until they are rewritten by a merge or mutation. system.projection_parts then shows fewer rows than system.parts for the same parent, and queries that would hit the projection instead fall back to reading the full base table until the next merge rewrites those parts.

Small insert batches amplify the fixed cost. ClickHouse prefers batches of at least a few thousand rows. When small inserts hit a table with multiple MVs or projections, the per-part overhead multiplies across every derived structure. Each batch creates one base part plus one part per projection and one part per MV target, rapidly exhausting the part budget.

Tradeoffs and common misuses

  • Projections speed up reads but inflate write and merge cost. Every base merge must also merge projection sub-parts synchronously. On write-heavy workloads with unpredictable query patterns, merge overhead can outweigh read benefits. If query patterns are stable and the projection eliminates large table scans, the tradeoff is usually worth it.
  • Materialized views offer independent target schemas and TTLs, but they multiply part pressure. Use them when target tables need different sorting keys or retention policies. Do not use them as a filtered view of the same data; projections avoid independent merge queues.
  • Monitoring only the source table misses the real limit. If you have MVs, monitor part counts and merge activity on every target table. The source table can remain stable while a target table approaches TOO_MANY_PARTS.
  • Using many projections on high-frequency insert tables. Each projection adds a synchronous merge passenger to every base merge. On tables with dozens of projections, a single large merge can monopolize a background thread for hours and block other merges.

Signals to watch in production

SignalWhy it mattersWarning sign
MaxPartCountForPartitionReflects the worst partition across base tables, projections, and MV targets combinedValue sustained above 50% of parts_to_delay_insert
Per-table active parts for MV targetsEach MV target is an independent MergeTree table with its own limitTarget table part count growing while source table is stable
system.merges elapsed timeBase merges carry synchronous projection merge overheadMerge duration increasing out of proportion to base part size
system.parts vs system.projection_partssystem.parts misses projection sub-partsBase table has many active parts but projection parts are missing or stale after a backfill
RejectedInserts / DelayedInsertsHard evidence that a limit has been breachedAny sustained increase in rejections

How Netdata helps

  • Correlate MaxPartCountForPartition with per-table part counts across base tables and MV targets to spot hidden amplification before limits are breached.
  • Alert on part count growth rate and merge duration trends when projection or MV overhead is exhausting background capacity.
  • Track RejectedInserts and DelayedInserts counters to catch active throttling from invisible part accumulation.
  • Cross-reference disk I/O and background pool saturation to determine whether merge slowdown is from projection overhead or MV target overload.
  • Monitor replication queue depth when projections or MVs on replicated tables increase fetch and merge pressure across replicas.