BLAKE3 in DuoBolt

Every duplicate file finder boils down to one question: are these two files byte-for-byte identical? The only rigorous way to answer that at scale is to hash every file and compare the hashes. The choice of hash function determines whether the scan takes seconds or minutes, whether it saturates modern multi-core CPUs or leaves them idle, and whether the result is mathematically trustworthy or a probabilistic guess.

DuoBolt is built on BLAKE3, a hash function published in 2020 that is several times faster than MD5 and SHA-256 on modern CPUs, scales almost linearly with CPU cores on large inputs, and remains cryptographically secure. This page explains what BLAKE3 is, why we chose it over the alternatives, and how it flows through DuoBolt’s scanning pipeline.


BLAKE3 is the latest member of the BLAKE family of cryptographic hash functions, following BLAKE and BLAKE2. It was published in 2020 by Jack O’Connor, Jean-Philippe Aumasson, Samuel Neves, and Zooko Wilcox-O’Hearn. Unlike its predecessors, BLAKE3 was designed specifically for modern hardware: multi-core CPUs, SIMD instruction sets (AVX-512, NEON), and streaming workloads.

Three properties make it the right fit for a duplicate finder:

Tree-structured

BLAKE3 splits its input into 1 KiB chunks processed in parallel across CPU cores. The chunks combine into a Merkle tree, so hashing a large file scales almost linearly with the number of available cores.

SIMD-accelerated

Hand-tuned kernels for AVX2 and AVX-512 (x86) and NEON (ARM) compress several blocks at once, one per SIMD lane. Modern laptops and desktops reach multi-GB/s single-thread throughput.

Cryptographically secure

BLAKE3 targets 128-bit security against both collision and preimage attacks, which matches the collision resistance of SHA-256. Collisions are computationally infeasible for any dataset you will ever scan.


The older hashes most duplicate finders still use were designed for CPUs with one core and no SIMD. On modern hardware, the gap is stark:

| Algorithm | Throughput (single core) | Parallelism | Cryptographic status |
| --- | --- | --- | --- |
| MD5 | ~600 MB/s | None | Broken: collisions producible in seconds |
| SHA-1 | ~700 MB/s | None | Broken: SHAttered attack (2017) |
| SHA-256 | ~400 MB/s | None | Secure, but slow |
| BLAKE2b | ~1 GB/s | Limited | Secure |
| BLAKE3 | Several GB/s | Yes: multi-core and SIMD | Secure |

Numbers are rough steady-state throughput on a modern x86 or Apple Silicon laptop; SIMD-specific kernels widen the gap further. See the official BLAKE3 benchmarks for detailed measurements.

For a duplicate finder, the consequence is practical: scanning 1 TB of data with SHA-256 takes several times longer than with BLAKE3, and tools built on single-threaded hashes cannot put more than one core to work on a file, no matter how many cores your CPU has.
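
As a rough back-of-the-envelope check of that claim, the sketch below converts single-core throughput into hashing time for 1 TB. The SHA-256 figure comes from the table above; the 3 GB/s value for BLAKE3 is an assumed stand-in for "several GB/s", and the helper name `hash_seconds` is hypothetical. Disk and network limits are ignored, so real scans are often bounded by I/O instead.

```python
TB = 10**12  # 1 TB in bytes (decimal)

# Rough single-core figures in bytes per second.
# The BLAKE3 value is an assumption standing in for "several GB/s".
throughput = {
    "SHA-256": 400 * 10**6,
    "BLAKE3 (assumed 3 GB/s)": 3 * 10**9,
}


def hash_seconds(total_bytes: int, bytes_per_second: int) -> float:
    """Time to hash total_bytes at a steady rate, ignoring disk and I/O limits."""
    return total_bytes / bytes_per_second


for name, rate in throughput.items():
    print(f"{name}: {hash_seconds(TB, rate) / 60:.1f} min")
# SHA-256: 41.7 min
# BLAKE3 (assumed 3 GB/s): 5.6 min
```

Under these assumptions the single-core gap is about 7.5×, before any multi-core scaling is taken into account.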


Why Collision-Broken Hashes Matter (Even for Dedup)

“We just want to find duplicates, who cares about cryptographic breaks?”

It matters because a deliberately crafted file can collide with an unrelated file under MD5 or SHA-1, and such files can end up in your data without you knowing it (published collision demos, for example). If your duplicate finder uses a broken hash and deletes one of the “duplicates”, you may lose the wrong file. BLAKE3 and SHA-256 have no known feasible collision attacks, so a match means byte-identical content with overwhelming probability.

DuoBolt does not rely on a secondary byte-by-byte compare after hashing, because BLAKE3’s 128-bit collision resistance makes a false positive astronomically unlikely — orders of magnitude less likely than a cosmic ray flipping a bit in your RAM mid-comparison.
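
To put "astronomically unlikely" into numbers, the sketch below uses the standard birthday-bound approximation for accidental collisions. It assumes the hash behaves like a random function with a 256-bit output (BLAKE3's default digest length); the function name `collision_probability` is hypothetical. This covers accidental collisions only. Resistance to deliberately constructed collisions is what the 128-bit security level describes.

```python
def collision_probability(num_files: int, hash_bits: int = 256) -> float:
    """Birthday-bound approximation: p ~= (number of file pairs) / 2^hash_bits."""
    pairs = num_files * (num_files - 1) // 2
    return pairs / 2**hash_bits


# Chance that any two of one billion distinct files share a digest.
p = collision_probability(10**9)
print(f"{p:.1e}")  # on the order of 1e-60
```

Even at a billion files the estimate stays dozens of orders of magnitude below any hardware error rate you could measure.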


The hash is not used as a single monolithic pass. DuoBolt applies BLAKE3 in two stages, a cheap prehash followed by a full-content hash, to minimize I/O on large datasets:

  1. Candidate grouping by size

    Files are first bucketed by byte count. Two files of different size cannot be duplicates — no hashing needed.

  2. Head+tail prehash

    For each size-matched candidate, DuoBolt BLAKE3-hashes only the first and last N KiB. This prefilter eliminates files that share a size but differ at the edges (common for media files with different containers or encoders).

  3. Full-content BLAKE3 hash

    Only candidates that pass the prehash are fully hashed. This is where BLAKE3’s multi-core and SIMD advantages matter most — on a large video or disk image, DuoBolt saturates every available core reading and hashing the same file in parallel.

  4. Grouping and output

    Files with matching full-content hashes are grouped as duplicates. The Desktop app renders them visually; the CLI outputs JSON, CSV, or TXT.
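
The steps above can be sketched in a few lines. This is a simplified illustration, not DuoBolt's implementation: BLAKE2b from the Python standard library stands in for BLAKE3, the work is single-threaded, and `EDGE_BYTES`, `edge_hash`, `full_hash`, and `find_duplicates` are hypothetical names. The 64 KiB edge size is an arbitrary choice, since the value of N is not given here.

```python
import hashlib
import os
from collections import defaultdict

EDGE_BYTES = 64 * 1024  # hypothetical value for "N KiB"; the real N is not stated


def _new_hasher():
    # Stand-in: BLAKE2b from the standard library, not BLAKE3.
    return hashlib.blake2b(digest_size=32)


def edge_hash(path, edge=EDGE_BYTES):
    """Hash only the first and last `edge` bytes of a file."""
    h = _new_hasher()
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        h.update(f.read(edge))
        if size > edge:
            f.seek(max(size - edge, edge))  # never re-read bytes from the head
            h.update(f.read(edge))
    return h.digest()


def full_hash(path, block=1 << 20):
    """Stream the whole file through the hasher in fixed-size blocks."""
    h = _new_hasher()
    with open(path, "rb") as f:
        while data := f.read(block):
            h.update(data)
    return h.digest()


def _group(paths, key):
    """Bucket paths by key(path) and keep only buckets with two or more members."""
    buckets = defaultdict(list)
    for p in paths:
        buckets[key(p)].append(p)
    return [g for g in buckets.values() if len(g) > 1]


def find_duplicates(paths):
    groups = []
    for same_size in _group(paths, os.path.getsize):       # step 1: size
        for same_edges in _group(same_size, edge_hash):    # step 2: prehash
            groups.extend(_group(same_edges, full_hash))   # steps 3 and 4
    return groups
```

Each stage only sees files that survived the previous one, so a file with a unique size is never opened, and a file with unique edges is never read in full.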

The two-stage design means DuoBolt almost never reads a file’s full contents unnecessarily. Combined with streaming chunked I/O, this is what lets it scan a terabyte of NAS data in under 90 seconds — see the NAS guide and benchmark results.

You can toggle the prefilter off with --no-prehash in the CLI if you want a single-pass full hash. Accuracy is identical either way: the prehash only reduces how much data has to be read, not which files are reported as duplicates.


Duplicate finders split cleanly on hash choice, and the downstream performance follows:

  • Czkawka also uses BLAKE3, which makes it the closest architectural peer. DuoBolt’s 1.3× to 2.3× advantage over Czkawka on our benchmark suite comes from per-root parallelism and streaming I/O, not the hash itself.
  • dupeGuru uses MD5/SHA-256 single-threaded. DuoBolt is up to 21× faster on real datasets and completes scans where dupeGuru times out.
  • Gemini 2 and most macOS duplicate finders rely on SHA-256 or proprietary algorithms with no public benchmarks.

Hash algorithm alone does not make a duplicate finder fast — architecture matters — but a slow hash is a hard ceiling. Any tool built on MD5 or SHA-256 inherits their limits no matter how clever the code around them is.