DuoBolt for NAS

A NAS accumulates duplicates faster than any other storage you own. Years of phone backups, photo libraries synced from three devices, a spouse’s media library, a second copy “just in case”, a torrent that overlapped with a streaming rip — it all stacks up on the same volume and grows invisibly for years.

This page covers how DuoBolt handles NAS scanning specifically: why most duplicate finders fail on network storage, how DuoBolt’s engine was built around the problem, and a step-by-step guide for scanning Synology, QNAP, TrueNAS, Unraid, and any SMB/NFS share.

Why NAS Is the Hardest Place to Find Duplicates

Scanning a NAS is fundamentally different from scanning a local SSD. Three things conspire against most duplicate finders:

Cold, spun-down disks

A power-saving NAS spins down its HDDs after a few minutes of idle time. First-access latency jumps from milliseconds to seconds as each drive spins up. Tools that read files sequentially stall on every new directory.

Network I/O latency

Every file open, stat, and read is a round-trip over Ethernet. A local scan doing 50,000 files/sec might drop to 500/sec over SMB. Legacy tools never saturate the link because they spend most of their time waiting on the wire.
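To put those rates in perspective, here is a small worked calculation. The 50,000 and 500 files/sec figures come from the paragraph above; the library size of 200,000 files is a hypothetical value chosen only for illustration.

```python
def scan_seconds(file_count: int, files_per_second: float) -> float:
    """Time spent on per-file operations at a given sustained rate."""
    return file_count / files_per_second


FILE_COUNT = 200_000  # hypothetical library size, not a benchmark figure

local = scan_seconds(FILE_COUNT, 50_000)  # local-disk rate quoted above
smb = scan_seconds(FILE_COUNT, 500)       # SMB rate quoted above

print(f"local: {local:.0f} s, SMB: {smb:.0f} s, slowdown: {smb / local:.0f}x")
# prints "local: 4 s, SMB: 400 s, slowdown: 100x"
```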

Single-threaded hashing

Most duplicate finders still hash one file at a time with MD5 or SHA-256. On a modern 8-core machine scanning a NAS, seven cores sit idle while the eighth waits on a slow network read.

The result: older tools routinely time out on terabyte-scale NAS volumes. In our benchmark suite, both dupeGuru and Gemini 2 exceeded the 15-minute cutoff on a 1 TB Synology snapshot. Czkawka completed, but took 85 seconds cold where DuoBolt took 64.

How DuoBolt’s Engine Handles NAS

DuoBolt was designed from day one with NAS and terabyte-scale storage in mind. Four architectural decisions compound to deliver the speed advantage:

Per-root parallelism

Multiple scan roots run in parallel instead of sequentially. If you point DuoBolt at /Volumes/NAS/photos and /Volumes/NAS/videos, both walk concurrently, so the bottleneck becomes disk and network throughput — not tool pacing.
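The idea can be sketched in a few lines of Python. This is an illustration of walking several roots concurrently, not DuoBolt's actual implementation; the function names `walk_root` and `walk_roots_in_parallel` are hypothetical.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def walk_root(root: str) -> list[str]:
    """Collect every file path found under one scan root."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            found.append(os.path.join(dirpath, name))
    return found


def walk_roots_in_parallel(roots: list[str]) -> dict[str, list[str]]:
    """Walk all roots concurrently, one worker thread per root."""
    if not roots:
        return {}
    with ThreadPoolExecutor(max_workers=len(roots)) as pool:
        # pool.map preserves input order, so results line up with roots.
        return dict(zip(roots, pool.map(walk_root, roots)))
```

Directory walking over a network mount is dominated by waiting on round-trips, so threads are enough here: while one root waits on the wire, another makes progress.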

Streaming chunked I/O

Reads are chunked and streamed, overlapping disk access with hashing. While BLAKE3 processes one chunk, the next is already on the wire. CPU cores stay busy, and the network never sits idle waiting for a free core.
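A minimal sketch of that overlap, assuming a background reader thread and a small bounded queue. BLAKE3 is not in the Python standard library, so `hashlib.blake2b` stands in for it here; the chunk size and queue depth are illustrative values, not DuoBolt settings.

```python
import hashlib
import queue
import threading

CHUNK_SIZE = 1024 * 1024  # 1 MiB, an illustrative value


def hash_streaming(path: str, chunk_size: int = CHUNK_SIZE) -> str:
    """Hash a file while a background thread reads ahead."""
    chunks: queue.Queue = queue.Queue(maxsize=4)  # bounded read-ahead buffer
    errors: list[BaseException] = []

    def reader() -> None:
        try:
            with open(path, "rb") as handle:
                while chunk := handle.read(chunk_size):
                    chunks.put(chunk)
        except OSError as exc:
            errors.append(exc)
        finally:
            chunks.put(None)  # sentinel: no more chunks

    thread = threading.Thread(target=reader, daemon=True)
    thread.start()

    digest = hashlib.blake2b()  # stand-in for BLAKE3
    while (chunk := chunks.get()) is not None:
        digest.update(chunk)  # the reader fetches the next chunk meanwhile
    thread.join()
    if errors:
        raise errors[0]
    return digest.hexdigest()
```

The bounded queue keeps memory use flat: the reader can run at most four chunks ahead of the hasher.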

Multi-core BLAKE3 hashing

BLAKE3 is tree-structured and multi-core by design. On an M1 Pro scanning a NAS, DuoBolt uses all performance cores in parallel to process chunks, saturating whatever I/O the NAS can deliver.
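The sketch below shows why a tree structure parallelizes well: chunks are hashed independently, then their digests are combined into a root. It is a simplified two-level tree built on `hashlib.blake2b`, so it does not produce BLAKE3 digests and is not a substitute for BLAKE3; the function names and the chunk size are assumptions made for illustration.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

CHUNK_SIZE = 1024 * 1024  # illustrative, not BLAKE3's internal chunk size


def leaf_digest(chunk: bytes) -> bytes:
    """Hash one chunk independently of all others."""
    return hashlib.blake2b(chunk).digest()


def tree_hash(data: bytes, chunk_size: int = CHUNK_SIZE, workers: int = 4) -> str:
    """Hash chunks in parallel, then combine the leaf digests into one root."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)] or [b""]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        leaves = list(pool.map(leaf_digest, chunks))
    # Two-level tree: the root is the hash of the ordered leaf digests.
    return hashlib.blake2b(b"".join(leaves)).hexdigest()
```

Because each leaf depends only on its own chunk, the result is the same regardless of how many workers are used, which is the property that lets a tree hash spread one large file across several cores.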

Head+tail prehash

Before any full-content hash, DuoBolt hashes just the first and last N KiB of each candidate file. Non-matches are eliminated from the candidate pool before touching the rest of the file — a massive I/O saving on large media files that look similar by size but differ at the edges.
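The staged filtering can be sketched as follows: group by size, then by a hash of the file edges, and only then by a full-content hash. This is an illustration under stated assumptions, not DuoBolt's code: `hashlib.blake2b` stands in for BLAKE3, the 64 KiB edge size is an assumed value for the "N KiB" above, and all function names are hypothetical.

```python
import hashlib
import os
from collections import defaultdict

EDGE_BYTES = 64 * 1024  # assumed value for "N KiB"


def edge_hash(path: str, edge: int = EDGE_BYTES) -> str:
    """Hash only the first and last `edge` bytes of a file."""
    size = os.path.getsize(path)
    digest = hashlib.blake2b()
    with open(path, "rb") as handle:
        digest.update(handle.read(edge))
        if size > edge:
            # Never re-read bytes already covered by the head.
            handle.seek(max(size - edge, edge))
            digest.update(handle.read(edge))
    return digest.hexdigest()


def full_hash(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Hash the entire file in chunks."""
    digest = hashlib.blake2b()
    with open(path, "rb") as handle:
        while chunk := handle.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()


def group_by(paths, key):
    """Group paths by key(path) and keep only groups with 2+ members."""
    groups = defaultdict(list)
    for path in paths:
        groups[key(path)].append(path)
    return [group for group in groups.values() if len(group) > 1]


def find_duplicates(paths):
    """Size match, then edge prehash, then full hash on the survivors."""
    duplicates = []
    for same_size in group_by(paths, os.path.getsize):
        for same_edges in group_by(same_size, edge_hash):
            duplicates.extend(group_by(same_edges, full_hash))
    return duplicates
```

Only files that survive both the size and edge checks are ever read in full, which is where the I/O saving on large media files comes from.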

Benchmark: 1 TB Synology DS920+

Cold scan — 64 seconds

1.01 TB volume, HDDs spun down before the test. Mix of 605 GB video, 246 GB photos, 162 GB music. DuoBolt discovers, prehashes, full-hashes, and groups everything in just over a minute.

Warm scan — 20 seconds

Same volume, caches warmed. DuoBolt’s cache layer short-circuits unchanged files, pushing the scan under 20 seconds on subsequent runs.

For side-by-side comparison with Czkawka, dupeGuru, and Gemini 2 on the same hardware, see the full benchmark results.

Cache Layer — Why Warm Scans Finish in Seconds

The difference between a 64-second cold scan and a 20-second warm scan is not magic: it is DuoBolt’s cache layer working as designed. When DuoBolt hashes a file, it records the full-content BLAKE3 hash keyed by path, size, and modification time. On subsequent scans, any file whose path, size, and mtime are unchanged skips hashing entirely, and its hash is read from the cache.

For a NAS this matters more than for any other target. Every hash retrieved from cache is one file DuoBolt does not read over the network — the single biggest win on a medium dominated by SMB or NFS latency.
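A minimal in-memory sketch of such a cache, assuming entries are looked up by path and validated against the current size and mtime. The `HashCache` class is hypothetical, `hashlib.blake2b` stands in for BLAKE3, and DuoBolt's real cache is persistent and tunable in ways this sketch does not model.

```python
import hashlib
import os


class HashCache:
    """Hash cache looked up by path and validated by (size, mtime)."""

    def __init__(self) -> None:
        self._entries = {}  # path -> (size, mtime_ns, digest)
        self.hits = 0
        self.misses = 0

    def get_hash(self, path: str) -> str:
        stat = os.stat(path)  # metadata only, no file content is read
        signature = (stat.st_size, stat.st_mtime_ns)
        entry = self._entries.get(path)
        if entry is not None and entry[:2] == signature:
            self.hits += 1
            return entry[2]  # unchanged file: skip hashing entirely
        self.misses += 1
        digest = self._hash_file(path)
        self._entries[path] = (*signature, digest)  # replaces any stale entry
        return digest

    @staticmethod
    def _hash_file(path: str, chunk_size: int = 1024 * 1024) -> str:
        digest = hashlib.blake2b()  # stand-in for BLAKE3
        with open(path, "rb") as handle:
            while chunk := handle.read(chunk_size):
                digest.update(chunk)
        return digest.hexdigest()

    @property
    def hit_rate(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```

On a warm scan, every hit costs one `stat` call instead of a full read over the network.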

In the Desktop app, configure the cache under Settings → Cache. Toggle the hash cache on or off, set separate minimum sizes for local and network volumes, and inspect live statistics (requests, hits, misses, hit rate) after each scan:

[Screenshot: DuoBolt Desktop Settings, Cache panel with hash cache toggle, local and network cache minimum size sliders, and live cache statistics showing cached entries and hit rate]

The Cache Minimum Size (Network) slider is the Desktop equivalent of --cache-min-size-network — lower it to cache smaller files on slow NAS shares, raise it to keep the cache lean. Cache Statistics lets you validate that warm scans are actually hitting the cache rather than re-hashing unchanged files.

From the CLI, the cache is tunable specifically for network volumes:

--cache-min-size-network=SIZE — only cache files larger than SIZE on network volumes, avoiding bloat from tens of thousands of tiny sidecar files.
--cache-min-size-local=SIZE — separate threshold for local disks, typically set lower.
--cache-max-bytes=SIZE / --cache-max-entries=N — hard ceilings when scanning multi-terabyte datasets.
--cache-stats — inspect hit ratio after a scan to validate warm-up effectiveness.
--cache-clear — full invalidation (rarely needed; DuoBolt auto-evicts stale entries when mtime changes).

See the CLI reference for the complete flag list.

To scan a NAS share with DuoBolt Desktop:

1. Mount the share on your machine
   - macOS: Finder → Cmd+K → smb://your-nas.local/volume (or afp://, nfs://)
   - Windows: File Explorer → right-click “This PC” → Map Network Drive

2. Open DuoBolt and click Add Folder
   Navigate to the mounted share and select the directory you want to deduplicate (e.g., /Volumes/NAS/photos).

3. Configure filters (optional but recommended)
   - Min size: set to 1 MiB or higher to skip thumbnails and metadata files
   - Turn on Ignore hidden files and Ignore system files
   - Exclude directory extensions: .photoslibrary, .aplibrary, .tmbundle (macOS managed bundles)

4. Run the scan
   The first scan is “cold”, so expect the times shown in our benchmark. Subsequent runs benefit from DuoBolt’s cache layer.

5. Review before deleting
   Files deleted from a mounted NAS share bypass the system Trash and are removed permanently. This is an OS limitation on network file systems, not a DuoBolt choice. Always export a CSV/JSON report first and double-check before confirming deletion.

To scan from the CLI, mount the share first (see the Desktop steps above), then run:

duobolt-cli /Volumes/NAS/media \
  --min-size=1M \
  --ignore-system-files \
  --ignore-hidden-files \
  --output=json \
  --quiet > nas-dupes.json

The CLI never deletes files — it produces a report only. Parse the JSON in your script and decide what to remove manually or via another tool. See the CLI reference for all options.
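As a sketch of what "parse the JSON in your script" could look like, the snippet below totals the space that would be reclaimed by keeping one file per duplicate group. The report layout used here (a top-level `groups` list whose items have `size` and `files`) is an assumption made purely for illustration; check the CLI reference for the real schema and adjust the keys accordingly.

```python
import json


def reclaimable_bytes(report: dict) -> int:
    """Bytes freed if all but one file in each group were removed."""
    total = 0
    for group in report["groups"]:              # assumed key
        extra_copies = len(group["files"]) - 1  # assumed key
        total += max(extra_copies, 0) * group["size"]  # assumed key, in bytes
    return total


# Hypothetical report content, not real duobolt-cli output.
sample = json.loads("""
{"groups": [
  {"size": 1048576,
   "files": ["/Volumes/NAS/media/a.mov", "/Volumes/NAS/media/copy/a.mov"]}
]}
""")

print(reclaimable_bytes(sample))  # prints 1048576
```

In a real script, replace `sample` with `json.load(open("nas-dupes.json"))` once the keys match the actual report.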

To automate weekly scans, wire the command into cron, launchd, or systemd:

0 3 * * 0 duobolt-cli /Volumes/NAS --output=json --quiet > ~/reports/nas-dupes-$(date +\%Y\%m\%d).json

Supported NAS Platforms

DuoBolt works with any share the host OS can mount — in the typical setup it runs on a desktop or server that has the NAS share mounted, not on the NAS itself. Tested and known-good:

Synology — DS, RS, FS series (DSM 6 and 7)
QNAP — TS, TVS, TBS series (QTS and QuTS)
TrueNAS — CORE and SCALE (SMB and NFS)
Unraid — via SMB shares
Asustor, Terramaster, Ugreen — any SMB/AFP/NFS share
Custom Linux file servers — Samba, NFS, or SSHFS mounts

For SMB shares, DuoBolt works transparently through the mount — there is no protocol-level optimization needed.

Running the CLI directly on the NAS

Most of the platforms above run a Linux-based OS and can execute the Linux CLI build directly over SSH, skipping the network mount. Windows-based appliances (rare) can use the Windows build instead. See the Download page for the full list of binaries.

To pick the right build, SSH into the NAS and run uname -m:

| `uname -m` output | Build to download |
| --- | --- |
| `x86_64` | Linux (x64) |
| `aarch64` / `arm64` | Linux (ARM64) |
| `armv7l` / `armv6l` | Not supported |
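If you script deployments across several boxes, the lookup above is easy to encode. This is a small sketch that mirrors the table; the `pick_build` helper is hypothetical and returns the build names exactly as listed above.

```python
import platform
from typing import Optional

# Mirrors the table above; anything not listed is treated as unsupported.
BUILD_BY_MACHINE = {
    "x86_64": "Linux (x64)",
    "aarch64": "Linux (ARM64)",
    "arm64": "Linux (ARM64)",
}


def pick_build(machine: str) -> Optional[str]:
    """Return the build name for a `uname -m` value, or None if unsupported."""
    return BUILD_BY_MACHINE.get(machine.strip().lower())


# platform.machine() reports the same value as `uname -m` on Linux.
print(pick_build(platform.machine()))
```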

Rough guidance if you don’t want to check first:

x64 only: TrueNAS CORE/SCALE, Unraid, current Ugreen UGOS.
Mostly x64: Synology “+”, RS, and FS series; QNAP TVS and TBS; higher-end QNAP TS.
Mostly ARM64: Synology entry-level (“j” / “play” / value DS); lower-end QNAP TS; entry-level Asustor and Terramaster.

Protocol Choice — SMB, NFS, AFP, SSHFS

DuoBolt does not speak network protocols directly. It scans through whatever the host operating system mounts, so the protocol choice lives entirely on the mount side — but it materially affects scan throughput, especially on long cold scans.

SMB (SMB3) — the default on macOS and Windows for Synology, QNAP, TrueNAS, Unraid, and any modern share. Supports multichannel, encryption, and signing. The best all-around choice for most users and the configuration our benchmarks run on.
NFS (v4) — common on Linux hosts and ZFS-backed NAS. Slightly lower per-operation latency than SMB in many setups, at the cost of trickier UID/GID permission semantics. If both host and NAS run Linux, NFS frequently edges out SMB on large scans.
AFP: deprecated. Apple has replaced it with SMB as the default for file sharing and Time Machine, and macOS 11 removed the built-in AFP server. macOS can still connect to existing AFP shares, but it is slow; avoid as a primary mount.
SSHFS — convenient (any machine with SSH access works), but the per-file round-trip overhead is punishing on scans of tens of thousands of small files. Usable for ad-hoc scans; a poor default.
iSCSI / block-level mounts — appear as local disks to the OS and to DuoBolt. Fast, but the target is exclusive to one client at a time.

When in doubt: SMB3. NFS only if you have a specific reason.

What to Exclude from a NAS Scan

Not everything on a NAS that looks like a duplicate is one. Filesystems, backup tools, and media apps routinely store intentional hardlinks, snapshots, or managed metadata that inflate scan counts without representing real redundancy. Knowing what to skip shortens the scan and produces cleaner results.

Synology:

@eaDir/ — hidden thumbnail and metadata directory created inside every media folder. Useless to hash. Exclude with --exclude-dir-ext=eaDir or via Desktop ignore rules.
#recycle/ — per-share Recycle Bin. Often holds intentional duplicates of “deleted” files; decide based on your cleanup goals.
@appstore/, @database/, @tmp/ — system-managed, always skip.

Filesystem snapshots (ZFS, BTRFS, Synology Btrfs):

.snapshots/, .zfs/snapshot/: point-in-time copies of the entire volume. They look like massive duplication but represent historical state. Always exclude them; otherwise every snapshot adds another full set of apparent duplicates.
APFS snapshots are invisible at the file level on macOS hosts, so no action is needed.

Backup bundles:

.backupdb/ (Time Machine) — relies on OS-level hardlinks. DuoBolt’s symlink/hardlink collapse handles this correctly by default, but excluding the directory saves a lot of scan time.
.AppleDouble/ and ._* AppleDouble sidecars — generated by macOS on non-HFS filesystems. Covered by --ignore-system-files.

Managed library bundles (macOS):

*.photoslibrary, *.aplibrary, *.itlp, *.imovielibrary — appear as single files in Finder but are massive directory trees internally. Exclude by directory extension: --exclude-dir-ext=photoslibrary,aplibrary,itlp,imovielibrary.

Cross-platform noise:

Thumbs.db, desktop.ini, .DS_Store — always safe to skip. Covered by --ignore-system-files.

Most of the above can be handled in one pass by toggling Ignore system files, Ignore hidden directories, and Exclude directory extensions in the Desktop app, or by combining --ignore-system-files with --exclude-dir-ext=... in the CLI.
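If you post-process scan reports or pre-filter paths in your own scripts, the exclusions in this section can be approximated with a small predicate. This is a sketch only: `should_skip` is a hypothetical helper, the name sets are copied from the lists above as written, and it does not reproduce how DuoBolt's own ignore rules are evaluated. The optional #recycle/ directory is left out on purpose.

```python
from pathlib import PurePosixPath

SKIP_DIR_NAMES = {
    "@eaDir", "@appstore", "@database", "@tmp",  # Synology system directories
    ".snapshots", ".backupdb", ".AppleDouble",
}
SKIP_DIR_SUFFIXES = {".photoslibrary", ".aplibrary", ".itlp", ".imovielibrary"}
SKIP_FILE_NAMES = {"Thumbs.db", "desktop.ini", ".DS_Store"}


def should_skip(path: str) -> bool:
    """Return True if a file path falls under an exclusion listed above."""
    *dirs, name = PurePosixPath(path).parts
    if name in SKIP_FILE_NAMES or name.startswith("._"):
        return True
    for index, part in enumerate(dirs):
        if part in SKIP_DIR_NAMES:
            return True
        if PurePosixPath(part).suffix in SKIP_DIR_SUFFIXES:
            return True
        # ZFS exposes snapshots under .zfs/snapshot/
        if part == ".zfs" and dirs[index + 1:index + 2] == ["snapshot"]:
            return True
    return False
```

For example, `should_skip("/Volumes/NAS/Family.photoslibrary/originals/1.heic")` is true because one of the parent directories carries a managed-bundle extension.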

Performance Tips for NAS Scans

Warm the cache on the first pass. Cold-scan timings are not representative of steady state; the second run is what day-to-day workflow feels like.
Set a sensible --min-size. Thumbnails, sidecar metadata, and app config files are rarely worth hashing and inflate scan counts without reclaiming space.
Exclude managed library bundles (.photoslibrary, .aplibrary, .itlp). These look massive but are not user-facing duplicates.
Scan during off-peak hours if the NAS is also handling Time Machine backups, streaming, or other heavy workloads — network contention kills scan throughput.
For periodic audits, script the CLI and pipe results into a dashboard. DuoBolt CLI outputs clean JSON/CSV built for automation.

Next Steps

Download: DuoBolt Desktop for macOS & Windows, free CLI for macOS, Windows, and Linux

Performance Benchmarks: full comparison against Czkawka, dupeGuru, and Gemini 2

CLI Usage: automate scheduled NAS scans via cron, launchd, or Task Scheduler

FAQ: safe deletion, permissions, and common NAS scanning questions