ClickHouse checksum mismatch and broken parts: detecting data corruption
ClickHouse logs showing Checksum doesn't match, Broken part, or similar errors indicate data corruption. Affected parts are moved to the table's detached/ directory and listed in system.detached_parts. Queries may throw exceptions or return partial results. On replicated clusters, a replica with corrupt parts may lag because it cannot validate fetched parts. Corruption does not self-resolve. You must quarantine the bad part, identify the root cause, and rebuild the data from a healthy source or backup.
ClickHouse stores every data part with checksum files covering column data, index files, and metadata. When the server reads a part during a query, merge, or replication fetch, it recomputes checksums and compares them against the stored values. A mismatch means the bytes on disk no longer match what was written. The server detaches the part, moving it out of the active dataset. If the table is replicated, the replica may attempt to re-fetch the part from a peer. If no healthy copy exists, the data is effectively lost. On standalone nodes, the only recovery path is external restore or re-ingestion.
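The write-time and read-time comparison described above can be illustrated in miniature. The sketch below is not ClickHouse's implementation: it uses SHA-256 from Python's standard library as a stand-in hash, and the function names and the column.bin file name are hypothetical. It only shows the idea of recording a checksum when data is written and recomputing it when data is read.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Tuple


def file_checksum(path: Path) -> str:
    """Hash the file in chunks so large files are never loaded into memory at once."""
    digest = hashlib.sha256()  # stand-in hash, chosen only for illustration
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path: Path, stored: str) -> bool:
    """Return True when the bytes on disk still match the recorded checksum."""
    return file_checksum(path) == stored


def demo() -> Tuple[bool, bool]:
    """Write a file, record its checksum, then flip one bit to simulate corruption."""
    with tempfile.TemporaryDirectory() as tmp:
        column_file = Path(tmp) / "column.bin"  # hypothetical file name
        column_file.write_bytes(b"example column data")
        stored = file_checksum(column_file)
        ok_before = verify(column_file, stored)

        damaged = bytearray(column_file.read_bytes())
        damaged[0] ^= 0x01  # a single flipped bit is enough to change the hash
        column_file.write_bytes(bytes(damaged))
        ok_after = verify(column_file, stored)
    return ok_before, ok_after
```

Calling demo() verifies the file once before and once after the simulated bit flip, which mirrors how a mismatch tells the server that the bytes on disk are no longer the bytes that were written.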
flowchart TD
    A[Checksum mismatch detected] --> B{Replicated table?}
    B -->|Yes| C[Check peer replicas for healthy copy]
    B -->|No| D[Check hardware and filesystem]
    C --> E{Healthy replica has part?}
    E -->|Yes| F[Remove corrupt part, trigger re-fetch]
    E -->|No| G[Data loss confirmed, restore from backup]
    D --> H[Run CHECK TABLE]
    H --> I{Hardware or filesystem errors?}
    I -->|Yes| J[Replace failing component]
    I -->|No| K[Restore or re-ingest the dataset]
Common causes
| Cause | What it looks like | First thing to check |
|---|---|---|
| Failing disk hardware | Checksum errors clustered on one volume; kernel logs show storage errors | dmesg for disk or controller errors |
| Filesystem corruption | Errors after unclean shutdown or on a specific mount point; multiple unrelated parts affected | OS logs for filesystem metadata errors |
| Bad RAM | Random corruption across unrelated tables and partitions; no disk errors | dmesg for memory ECC errors |
| Incomplete fetch during replication | system.replication_queue shows last_exception referencing fetch or checksum failure on the receiving replica | system.replication_queue on the target replica |
| Software bug in merge or mutation | Corruption appears immediately after a merge or mutation completes; affected part name matches the operation | system.part_log for recent MergeParts or MutatePart events on the affected part |
Quick checks
Run these read-only checks before any destructive action.
# Check server logs for corruption indicators
grep -Ei 'checksum|corrupt|Broken part|Cannot read all data|Mismatch' /var/log/clickhouse-server/*.log | tail -100
-- Inspect detached parts and the reasons they were removed
SELECT database, table, name, reason, bytes_on_disk, modification_time
FROM system.detached_parts
ORDER BY modification_time DESC;
-- Check replication queue for stuck fetch or checksum failures
SELECT database, table, type, num_tries, last_exception
FROM system.replication_queue
WHERE num_tries > 0
ORDER BY num_tries DESC;
-- Check for integrity-related event counters
SELECT event, value
FROM system.events
WHERE event IN ('ReplicatedPartChecksFailed', 'ReplicatedDataLoss');
-- Assess blast radius: active parts in the affected table
SELECT partition_id, count() AS active_parts, formatReadableSize(sum(bytes_on_disk)) AS size
FROM system.parts
WHERE database = 'your_db' AND table = 'your_table' AND active = 1
GROUP BY partition_id
ORDER BY active_parts DESC;
-- Proactively verify a table that has not yet shown errors
CHECK TABLE your_db.your_table;
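When the log check needs to feed a script rather than a terminal, the same pattern can be applied in Python. This is a minimal sketch: it reuses the grep pattern from the first check, the function name is hypothetical, and it assumes nothing about the ClickHouse log line format beyond the presence of those keywords.

```python
import re
from collections import Counter
from typing import Iterable

# Same alternation as the grep -Ei pattern in the quick checks.
PATTERN = re.compile(
    r"checksum|corrupt|Broken part|Cannot read all data|Mismatch",
    re.IGNORECASE,
)


def count_indicators(lines: Iterable[str]) -> Counter:
    """Count log lines per corruption keyword (first keyword found on each line)."""
    counts: Counter = Counter()
    for line in lines:
        match = PATTERN.search(line)
        if match:
            counts[match.group(0).lower()] += 1
    return counts
```

A typical use is to pass an open log file, for example count_indicators(open(path, errors="replace")), and compare the totals between runs to see whether new corruption indicators are appearing.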
How to diagnose it
1. Confirm the error pattern. Search the server error log for Checksum, corrupt, Broken part, Cannot read all data, or Mismatch. Note the timestamp, table, and part name. If the error repeats on the same part, suspect a localized disk or fetch issue. Random parts across tables suggest memory or filesystem corruption.
2. Map detached parts to tables. Query system.detached_parts and examine the reason column. Reasons such as broken, broken-on-start, or covered-by-broken confirm automatic detachment due to an integrity failure. Note modification_time to see whether the detachment correlates with a recent restart, merge, or replication event.
3. Determine if the table is replicated. For ReplicatedMergeTree tables, check system.replicas. If the replica is healthy, a peer may still hold a valid copy. If the table is not replicated, treat the incident as potential data loss.
4. Inspect the replication queue. Query system.replication_queue for entries with high num_tries and non-empty last_exception. If the exception references a checksum mismatch or failed fetch, corruption is blocking convergence. Check the exception message for the source replica and verify that peer’s health before forcing a re-fetch.
5. Check hardware and OS logs. Review dmesg for disk controller errors, I/O failures, or memory ECC errors around the time the corruption first appeared. If hardware errors are present, stop using the node for production data until the component is replaced.
6. Correlate with background operations. Query system.part_log for MergeParts or MutatePart events involving the affected part. Note event_time and duration_ms. An extremely long merge that aborted, or a mutation with a large duration and no subsequent successful operation on the same partition, is suspicious. Although rare, software bugs in specific ClickHouse versions can produce invalid output during large merges or mutations.
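The first diagnostic step, telling a localized fault from widespread corruption, can be sketched as a small helper. The function name, the return labels, and the thresholds are assumptions chosen for illustration; the input is a list of (table, part name) pairs that you have already extracted from the error log.

```python
from collections import Counter
from typing import Iterable, Tuple


def classify_spread(events: Iterable[Tuple[str, str]]) -> str:
    """Classify checksum errors by how they are distributed across parts and tables.

    events: (table, part_name) pairs, one per logged checksum error.
    """
    events = list(events)
    if not events:
        return "no-errors"
    per_part = Counter(events)
    tables = {table for table, _ in events}
    # The same part failing repeatedly points at a localized disk or fetch issue.
    if len(per_part) == 1 and len(events) > 1:
        return "localized"
    # Different parts in different tables point at memory or filesystem corruption.
    if len(tables) > 1 and len(per_part) > 1:
        return "widespread"
    return "inconclusive"
```

An "inconclusive" result means there is not yet enough evidence either way, so continue with the hardware and replication checks before drawing a conclusion.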
Metrics and signals to monitor
| Signal | Why it matters | Warning sign |
|---|---|---|
| system.detached_parts count | Direct indicator of parts removed due to integrity failures | Unexpected growth in count or bytes |
| ReplicatedPartChecksFailed | Integrity failures detected during replication verification | Sustained non-zero rate |
| ReplicatedDataLoss | Confirmed unrecoverable data loss on a replicated table | Any non-zero value |
| system.replication_queue.last_exception | Shows whether replication is stuck due to corrupt source parts | Entries with num_tries increasing and checksum-related exceptions |
| Checksum errors in server logs | Earliest detectable sign of hardware or filesystem degradation | New log lines matching Checksum or Mismatch |
| CHECK TABLE result | Proactive integrity scan that surfaces mismatches before they hit queries | Result shows non-empty list of corrupted parts |
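The warning signs in the table can be turned into simple rules over periodically collected samples. The sketch below assumes you already scrape the detached part count and the two counters into a time-ordered list; the Sample class, the field names, and the warning labels are hypothetical, and "sustained" is interpreted here as the counter increasing in every interval of the window.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Sample:
    detached_parts: int      # row count of system.detached_parts
    part_checks_failed: int  # ReplicatedPartChecksFailed (cumulative counter)
    data_loss: int           # ReplicatedDataLoss (cumulative counter)


def evaluate(samples: List[Sample]) -> List[str]:
    """Return warning labels for a time-ordered window of samples."""
    warnings: List[str] = []
    if not samples:
        return warnings
    latest = samples[-1]
    # Any non-zero value is a warning sign for data loss.
    if latest.data_loss > 0:
        warnings.append("data-loss")
    # Cumulative counters only show a rate through their deltas.
    deltas = [
        b.part_checks_failed - a.part_checks_failed
        for a, b in zip(samples, samples[1:])
    ]
    if len(deltas) >= 2 and all(d > 0 for d in deltas):
        warnings.append("sustained-check-failures")
    # Growth in detached parts across the window.
    if latest.detached_parts > samples[0].detached_parts:
        warnings.append("detached-growth")
    return warnings
```

Comparing deltas rather than absolute values matters because event counters accumulate from server start, so a single old failure would otherwise alert forever.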
Fixes
Replicated tables: force a re-fetch
If another replica has a healthy copy, remove the corrupted part from detached/ and let the replication queue fetch a replacement. First, fix the root cause (failing disk, bad RAM, filesystem damage) to prevent re-corruption. Query system.replicas to confirm a healthy peer holds the part.
If the part is already in system.detached_parts, it is inactive. Remove the directory from the table’s detached/ path. The replica should schedule a fetch automatically. If the queue stalls, SYSTEM RESTART REPLICA db.table may unblock replication.
For a replica with widespread corruption or metadata inconsistency, rebuilding the replica is often faster than incremental repair. Drop and re-create the table to force a full re-sync from a healthy peer. SYSTEM DROP REPLICA does not drop the local replica; run it from another replica only to remove stale metadata that a dead replica left in ZooKeeper. Rebuilding destroys all local data for that table and generates significant network load.
Standalone tables: restore or re-ingest
For non-replicated tables, there is no peer to heal from. Restore the detached part from backup or re-ingest from upstream. If the part is outside your retention window, you may drop it and accept the gap.
Warning: Dropping a detached part is destructive and irreversible. Before dropping anything, confirm the scope by comparing row counts or partition-level aggregates against your upstream source of truth.
Hardware and filesystem faults: fix the substrate first
Corruption caused by failing SSDs, bad RAM, or filesystem damage will recur until the underlying layer is repaired. Replace failing hardware. If corruption appeared after an unclean shutdown, run a filesystem consistency check on the ClickHouse data volume before returning the node to production. Do not run CHECK TABLE against, or restore data onto, a known-bad disk.
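The replicated, standalone, and hardware paths in this section can be condensed into one decision function. This is a sketch of the branching only: the function name, parameters, and returned strings are hypothetical, and the inputs are facts you establish yourself with the diagnostic steps.

```python
def recovery_action(
    replicated: bool,
    healthy_peer_has_part: bool = False,
    hardware_errors: bool = False,
) -> str:
    """Map the diagnosis to the next recovery step."""
    if replicated:
        if healthy_peer_has_part:
            # A peer can heal this replica once the corrupt part is out of the way.
            return "remove corrupt part and trigger re-fetch"
        # No replica holds a valid copy.
        return "restore from backup"
    if hardware_errors:
        # Restoring onto a failing substrate would only re-corrupt the data.
        return "replace failing component"
    return "restore or re-ingest the dataset"
```

For example, recovery_action(replicated=True, healthy_peer_has_part=True) selects the re-fetch path, while a standalone table with clean hardware falls through to restore or re-ingestion. On replicated tables the root cause still has to be fixed before the re-fetch, which this sketch does not model.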
Clean up detached parts
Detached parts remain under the table’s data path in detached/ and consume disk. Once the active dataset is verified healthy and you no longer need the files for forensics, remove them to reclaim space.
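Before deleting anything, it helps to list which detached directories are old enough to be candidates and how much space they hold. The sketch below is a read-only dry run: it assumes only that detached parts live in a detached/ directory under the table's data path, as described above. The function names, the age threshold, and the part directory names in the demo are hypothetical.

```python
import os
import tempfile
import time
from pathlib import Path
from typing import List, Optional, Tuple


def stale_detached_parts(
    table_path: str, min_age_days: float, now: Optional[float] = None
) -> List[Tuple[str, int]]:
    """List (directory name, size in bytes) for detached parts older than the threshold.

    Nothing is deleted; the result is meant for review before manual cleanup.
    """
    now = time.time() if now is None else now
    detached = Path(table_path) / "detached"
    if not detached.is_dir():
        return []
    cutoff = now - min_age_days * 86400
    found = []
    for entry in sorted(detached.iterdir()):
        if entry.is_dir() and entry.stat().st_mtime <= cutoff:
            size = sum(f.stat().st_size for f in entry.rglob("*") if f.is_file())
            found.append((entry.name, size))
    return found


def demo() -> List[Tuple[str, int]]:
    """Build a throwaway directory tree and report which parts would be listed."""
    with tempfile.TemporaryDirectory() as tmp:
        detached = Path(tmp) / "detached"
        old_part = detached / "broken_all_1_1_0"  # hypothetical directory names
        new_part = detached / "broken_all_2_2_0"
        old_part.mkdir(parents=True)
        new_part.mkdir()
        (old_part / "data.bin").write_bytes(b"12345")
        now = time.time()
        ten_days_ago = now - 10 * 86400
        os.utime(old_part, (ten_days_ago, ten_days_ago))  # backdate the old part
        return stale_detached_parts(tmp, min_age_days=7, now=now)
```

Keeping the listing separate from the deletion leaves room for the forensic review mentioned above: someone can inspect the candidates before any files are removed.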
Prevention
- Schedule CHECK TABLE on large or critical tables during low-traffic windows. It verifies checksums without blocking reads or writes.
- Alert on growth in system.detached_parts count or size. Detachment is almost always a reaction to an integrity failure.
- Correlate ClickHouse corruption signals with OS-level disk SMART alerts, dmesg errors, and memory ECC logs.
- Replicate critical data. Standalone nodes have no self-healing path for corrupted parts.
- Inspect replication queues. A replica with stuck fetch entries and rising num_tries may be repeatedly downloading a corrupted source part. Catch this before it cascades.
- Patch known merge and mutation bugs promptly. Corruption defects in ClickHouse are typically fixed quickly once identified.
How Netdata helps
- Alert on server log lines matching Checksum, Broken part, Mismatch, and Cannot read all data.
- Correlate ReplicatedPartChecksFailed and ReplicatedDataLoss counter spikes with disk I/O error metrics and memory ECC events to distinguish hardware faults from replication bugs.
- Chart the count and bytes of system.detached_parts over time to spot gradual integrity degradation that logs alone might miss.
- Monitor replication queue depth and exception rates alongside checksum signals to identify whether corruption is blocking convergence.
- Track part-count stability and CHECK TABLE execution so deviations from normal disk-structure health are visible before queries fail.
Related guides
- ClickHouse active part count growing: reading MaxPartCountForPartition before it pages
- ClickHouse ALTER UPDATE/DELETE overuse: why mutations are not row updates
- ClickHouse async inserts: when async_insert fixes too-many-parts and when it hides it
- ClickHouse background pool saturation: when merges and mutations starve
- ClickHouse mark cache and uncompressed cache: reading low hit rates
- ClickHouse client connections climbing: TCP 9000, HTTP 8123, and connection leaks
- ClickHouse DelayedInserts climbing: the warning before too-many-parts
- ClickHouse detached parts piling up: reading system.detached_parts and reclaiming space
- ClickHouse disk space collapse: why merges need free space and how the spiral starts
- ClickHouse disk space monitoring: free_space, unreserved_space, and the 80% target
- ClickHouse distributed DDL stuck: ON CLUSTER queries that never finish
- ClickHouse distributed query amplification: one coordinator, many shard subqueries







