Linux Checksum Tools: Performance Comparison
When verifying file integrity on Linux, you’ll typically reach for one of several checksum utilities. A natural question: which one is fastest? The answer depends heavily on whether I/O or CPU is your bottleneck.
Disk I/O Dominates on Most Systems
In realistic scenarios with typical storage, disk I/O is almost always the limiting factor. Test this with a large file:
$ ls -lh largefile.bin
-rw-r--r-- 1 user user 15G Dec 15 10:28 largefile.bin
Testing three common checksums:
$ time sha256sum largefile.bin
a1b2c3d4e5f6... largefile.bin
real 1m21.143s
user 0m21.647s
sys 0m4.668s
$ time md5sum largefile.bin
e2e649030c795ffa9f33a99bcb39dde7 largefile.bin
real 1m27.392s
user 0m25.563s
sys 0m3.936s
$ time b2sum largefile.bin
abc123def456... largefile.bin
real 1m18.205s
user 0m19.341s
sys 0m4.782s
The differences are marginal — all three take roughly 80-90 seconds. Measure raw I/O performance:
$ time dd if=largefile.bin of=/dev/null bs=1M
15000+0 records in
15000+0 records out
15728640000 bytes (16 GB, 15 GiB) copied, 80.4203 s, 199 MB/s
real 1m20.447s
user 0m0.202s
sys 0m7.091s
The disk I/O alone accounts for 80 seconds; the checksum computation is nearly invisible on top of it. On a spinning disk or SATA SSD, disk I/O is the bottleneck, not the algorithm.
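A quick way to confirm this from the `time` output alone is to compare CPU time (user + sys) against wall-clock time. A sketch using the sha256sum figures from the run above (substitute your own numbers):

```shell
# Bottleneck check: if user+sys is far below real (wall) time, the
# process spent most of its life waiting on I/O, not hashing.
real=81.143   # 1m21.143s from the sha256sum run above
user=21.647
sys=4.668
awk -v r="$real" -v u="$user" -v s="$sys" \
    'BEGIN { printf "CPU busy for %.0f%% of wall time\n", 100 * (u + s) / r }'
```

Here the CPU was busy for only about a third of the run; the remaining two thirds were spent waiting on the disk.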
CPU-Limited Benchmarking
If you have fast NVMe or high-throughput storage, disk I/O becomes less of a constraint. Algorithm efficiency then matters. Create a test file in a RAM disk to eliminate I/O latency:
$ sudo mkdir -p /mnt/ramdisk
$ sudo mount -t tmpfs -o size=8G,mode=1777 tmpfs /mnt/ramdisk
$ head -c 3G /dev/zero > /mnt/ramdisk/testfile
$ for algo in md5sum sha256sum b2sum; do
echo "=== $algo ==="
time $algo /mnt/ramdisk/testfile
done
Results on a modern 12-core system:
=== md5sum ===
real 0m5.103s
user 0m4.697s
sys 0m0.409s
=== sha256sum ===
real 0m8.451s
user 0m8.082s
sys 0m0.372s
=== b2sum ===
real 0m3.205s
user 0m2.931s
sys 0m0.482s
Throughput in this CPU-limited scenario (3 GB read from the RAM disk):
- b2sum: ~937 MB/s
- md5sum: ~587 MB/s
- sha256sum: ~355 MB/s
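These throughput figures are just file size divided by wall time. A sketch of the arithmetic, treating the 3 GB file as 3000 MB and using the real times from the runs above (differences of a few MB/s from the listed numbers are rounding):

```shell
# Derive MB/s from the 3000 MB test file and each tool's wall time.
size_mb=3000
for pair in "md5sum 5.103" "sha256sum 8.451" "b2sum 3.205"; do
  set -- $pair   # $1 = tool, $2 = elapsed seconds
  printf '%-10s ~%s MB/s\n' "$1" \
    "$(awk -v s="$size_mb" -v t="$2" 'BEGIN { printf "%.0f", s / t }')"
done
```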
Algorithm Security Properties
When choosing a checksum tool, speed shouldn’t be your primary concern. Security properties matter more:
- MD5: Cryptographically broken. Vulnerable to collision attacks. Don’t use for integrity verification of untrusted sources.
- SHA-1: Theoretically broken. Still useful for non-adversarial integrity checks, but avoid for security-sensitive work.
- SHA-256: Industry standard. No known practical attacks. Good choice for most scenarios.
- BLAKE2b/BLAKE3: Modern, fast, and cryptographically secure. Excellent for new projects and high-performance requirements.
Practical Commands
For most file verification tasks:
$ sha256sum myfile
For high-performance environments (NVMe, fast storage):
$ b2sum myfile
For recursive directory verification:
$ find /path -type f -exec sha256sum {} + > checksums.txt
$ sha256sum -c checksums.txt
To parallelize checksum computation across multiple files:
$ parallel sha256sum {} ::: file1 file2 file3 file4
Or with GNU xargs:
$ find /path -type f -print0 | xargs -0 -P 4 -I {} sha256sum {}
The -P 4 flag uses 4 parallel processes. Adjust based on your CPU core count.
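Putting these pieces together, here is a sketch that scans a directory, batches several files per sha256sum invocation to amortize process startup, and scales the worker count to the machine with nproc (the `dir` and `out` variables are illustrative; adjust to taste):

```shell
# Hash every file under $dir in parallel, one worker per CPU core.
# -n 16 passes up to 16 files per sha256sum invocation.
dir=.                                  # directory to scan; adjust as needed
out=$(mktemp)                          # write sums outside the scanned tree
find "$dir" -type f -print0 \
  | xargs -0 -P "$(nproc)" -n 16 sha256sum > "$out"
echo "checksums written to $out"
```

Writing the output file outside the scanned tree avoids the pipeline picking up its own half-written checksum file.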
Verifying Checksums in Batch
When you have a pre-generated checksum file (e.g., from a download):
$ sha256sum -c SHA256SUMS
file1: OK
file2: OK
file3: FAILED
Note that each tool verifies only its own algorithm's sums (sha256sum -c will not check MD5 or BLAKE2 lines), so use the tool matching the algorithm the checksum file was generated with. To skip entries for files you don't have locally:
$ sha256sum -c SHA256SUMS --ignore-missing
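For example, --ignore-missing lets verification succeed when the checksum file lists entries you never downloaded. A self-contained sketch using throwaway files:

```shell
# Create sums for two files, remove one, then verify what remains.
tmp=$(mktemp -d)
printf 'a\n' > "$tmp/file1"
printf 'b\n' > "$tmp/file2"
( cd "$tmp" && sha256sum file1 file2 > SHA256SUMS )
rm "$tmp/file2"                        # simulate a file we never fetched
( cd "$tmp" && sha256sum -c SHA256SUMS --ignore-missing )
# prints "file1: OK" and exits 0; without --ignore-missing, the missing
# file2 would be reported and the exit status would be nonzero.
```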
When Speed Actually Matters
If you’re computing checksums on multi-terabyte datasets frequently:
Upgrade your storage first. Move from spinning disk to NVMe. This provides 5-10x performance gains and typically costs less than the salary time spent optimizing code.
Use parallel processing. Leverage multiple CPU cores for independent files. The parallel command or GNU xargs -P scales naturally.
Choose BLAKE2b or BLAKE3. These provide 2-3x better throughput than SHA-256 on modern CPUs without sacrificing cryptographic security. BLAKE3 is particularly attractive for new projects—it’s even faster than BLAKE2b and supports parallel hashing of large files.
For very large single files on modern NVMe, consider BLAKE3’s native parallel mode:
$ b3sum largefile.bin # Uses all CPU cores automatically
The Bottom Line
On typical systems with mechanical or SATA storage, your checksum tool choice has minimal impact. Disk I/O is the overwhelming bottleneck. Only when storage approaches 1GB/s throughput does algorithm efficiency become meaningful. In those cases, BLAKE2b/BLAKE3 pull ahead, but SHA-256 remains a practical choice for most deployments.
Comments

On your system the bottleneck is the disk, though with the advance of SSDs this is increasingly moving up the stack. In that case the computational overhead becomes significant. You can quantify it on your system by avoiding the disk with something like:
for chk in crc32 md5sum sha1sum; do time head -c 1G /dev/zero | $chk; done
Note that sha1sum and md5sum use system-specific instructions for significant speedups on systems configured --with-openssl (as is the default on Arch, Fedora, CentOS 7, and Gentoo, at least).
Hi Pádraig,
That’s a good point.
I did some tests on the same system (Fedora 22 x86-64) by computing checksums on a file under /dev/shm/. You can find the results at http://www.systutorials.com/136737/which-checksum-tool-on-linux-is-faster/#what-if-i.2Fo-was-not-the-bottleneck . crc32 turns out to be the fastest one.
Hi Eric,
I hope you know that collisions exist for crc32, md5sum, and even SHA-0 checksums, but not yet for SHA-1, which you actually used. Since I learned of these collision problems, I only use sha1sum or better (sha224, sha256, sha384, or sha512) for my verifications when I can.
http://preshing.com/20110504/hash-collision-probabilities/
http://www.mathstat.dal.ca/~selinger/md5collision/
https://en.wikipedia.org/wiki/SHA-0
Nice and informative website by the way.
Hi Simard,
Thanks!
Although this post is mainly talking about the speed, that’s a good point taking the collisions into consideration. I will add a note in the post mentioning your comment.
‘sum -s filename’ is significantly faster than all of these.
uni@box:~$ ls -lh kali-linux-1.0.3-i386.iso
-rwxrwxrwx 1 uni uni 2.3G Jun 22 2013 kali-linux-1.0.3-i386.iso
uni@box:~$ time crc32 kali-linux-1.0.3-i386.iso
bd3a7323
real 0m12.701s
user 0m5.263s
sys 0m1.033s
uni@box:~$ time sum kali-linux-1.0.3-i386.iso
11559 2387392
real 0m4.270s
user 0m3.986s
sys 0m0.280s
uni@box:~$ time sum -s kali-linux-1.0.3-i386.iso
47724 4774784 kali-linux-1.0.3-i386.iso
real 0m1.241s
user 0m0.972s
sys 0m0.268s
uni@box:~$ sum --version | head -1
sum (GNU coreutils) 8.21
uni@box:~$ lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 4
On-line CPU(s) list: 0-3
Thread(s) per core: 1
Core(s) per socket: 4
Socket(s): 1
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 42
Stepping: 7
CPU MHz: 1674.878
BogoMIPS: 6600.22
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 6144K
NUMA node0 CPU(s): 0-3
Nice to know these numbers, desromic!
The `cksum`/`sum -s`, which is a CRC tool like `crc32`, seems much faster than `crc32`.
Note that the CRC algorithms share the same problem of being “useless as secure indicator of intentional manipulation of the data”, as discussed in
Simard’s comment http://www.systutorials.com/136737/which-checksum-tool-on-linux-is-faster/#comment-76996 and in the discussion at http://www.derkeiler.com/Newsgroups/sci.crypt/2003-07/1451.html .
For many years I have found md5sum to consistently be faster than sha1sum, so I was very surprised when I read this article.
I just tried it again on a file of size 295G and got this:
md5sum: real 10m20.952s
sha1sum (same file): real 15m15.332s
This is consistent with what I seem to always see.
Thanks for sharing the numbers. My inference is that it depends on the machine. The CPU, memory, and disks matter together (assuming reasonably good optimization is already applied in the implementations). Suppose the disk I/O (e.g. from an SSD) is fast enough to keep up with the CPU and memory; then the main factors are the CPU and memory. MD5’s digest and internal state are smaller, so it likely puts less pressure on the memory and memory-bus systems. However, modern CPU architectures and software implementations seem to have better optimizations for sha1sum (SHA) computation.
I no longer have the machine I originally ran the tests on. But would you like to run the tests from the post on your machine, to see how it performs, and share the machine details with us? Thanks.