Nibabel is 10x faster than SimpleITK in loading .nii.gz -- let's use that #3001

@edomerli

A few days ago, in my own code, I noticed that np.load() on a compressed .npz 3D volume was ~50x slower than nibabel's nib.load() on the same volume stored as .nii.gz, despite both files using the same compression level.

I then ran the benchmark below to compare the loading times of numpy .npz, nibabel and SimpleITK (the script was generated by Claude, since I didn't have the time to write such a nice benchmark myself).
I'm using Python 3.11.7.

"""
Benchmark: nibabel vs SimpleITK vs numpy .npz on .nii.gz loading.

To test on a real file, set REAL_NII_PATH to its path.
Otherwise a synthetic volume is generated in a temp directory.
"""

import time, tempfile, os
import numpy as np
import nibabel as nib
import SimpleITK as sitk

# ── Configuration ─────────────────────────────────────────────────────────────
REAL_NII_PATH = None              # e.g. "/data/ct_scan.nii.gz", or None for synthetic
SHAPE         = (403, 403, 431)   # synthetic volume shape (ignored if REAL_NII_PATH set)
COMPRESS_LVL  = 6
N_RUNS        = 3
# ─────────────────────────────────────────────────────────────────────────────


def benchmark(nii_path, tmp_dir, ref_data=None):
    npz_path = os.path.join(tmp_dir, "volume.npz")

    # Build .npz counterpart from the nii data
    arr_from_nii = np.asarray(nib.load(nii_path).dataobj)
    np.savez_compressed(npz_path, volume=arr_from_nii)

    print(f"Shape of 3D Volume: {arr_from_nii.shape}")
    print(f"  .nii.gz : {os.path.getsize(nii_path)/1e6:.1f} MB")
    print(f"  .npz    : {os.path.getsize(npz_path)/1e6:.1f} MB\n")

    nii_times, sitk_times, npz_times = [], [], []

    for i in range(N_RUNS):
        # nibabel: open the image and materialize the full array
        t0 = time.perf_counter()
        arr_nii = np.asarray(nib.load(nii_path).dataobj)
        nii_times.append(time.perf_counter() - t0)

        # SimpleITK: read the image and convert it to a numpy array (z, y, x order)
        t0 = time.perf_counter()
        arr_sitk = sitk.GetArrayFromImage(sitk.ReadImage(nii_path))
        sitk_times.append(time.perf_counter() - t0)

        # numpy: open the .npz archive and decompress the "volume" entry
        t0 = time.perf_counter()
        arr_npz = np.load(npz_path)["volume"]
        npz_times.append(time.perf_counter() - t0)

        print(f"  run {i+1}:  nibabel {nii_times[-1]:.2f}s  |  SimpleITK {sitk_times[-1]:.2f}s  |  npz {npz_times[-1]:.2f}s")

    mn, ms, mz = np.mean(nii_times), np.mean(sitk_times), np.mean(npz_times)
    print(f"\n  nibabel   {mn:.2f}s ± {np.std(nii_times):.2f}s")
    print(f"  SimpleITK {ms:.2f}s ± {np.std(sitk_times):.2f}s")
    print(f"  npz       {mz:.2f}s ± {np.std(npz_times):.2f}s")

    print(f"  SimpleITK : {ms / mn:.1f}x slower than nibabel")
    print(f"  npz:        {mz / mn:.1f}x slower than nibabel")

    # Correctness (SimpleITK is z,y,x so transpose before comparing)
    ok_npz  = np.allclose(arr_nii.astype(np.float32), arr_npz.astype(np.float32),              atol=1e-5)
    ok_sitk = np.allclose(arr_nii.astype(np.float32), arr_sitk.transpose().astype(np.float32), atol=1e-5)
    print(f"\n  nibabel == npz       : {'✅' if ok_npz  else '❌'}")
    print(f"  nibabel == SimpleITK : {'✅' if ok_sitk else '❌'}")


if __name__ == "__main__":
    print(f"numpy {np.__version__} | nibabel {nib.__version__} | SimpleITK {sitk.Version.VersionString()}\n")

    with tempfile.TemporaryDirectory() as tmp:
        if REAL_NII_PATH:
            print(f"Real file: {REAL_NII_PATH}\n")
            benchmark(REAL_NII_PATH, tmp)
        else:
            nii_path = os.path.join(tmp, "volume.nii.gz")
            print(f"Synthetic volume {SHAPE} float32 — saving …", flush=True)
            data = np.random.default_rng(0).standard_normal(SHAPE).astype(np.float32)
            nib.openers.Opener.default_compresslevel = COMPRESS_LVL
            nib.save(nib.Nifti1Image(data, np.eye(4)), nii_path)
            print()
            benchmark(nii_path, tmp, ref_data=data)

Resulting in:

numpy 2.4.3 | nibabel 5.4.0 | SimpleITK 2.5.3

Synthetic volume (403, 403, 431) float32 — saving …

Shape of 3D Volume: (403, 403, 431)
  .nii.gz : 259.1 MB
  .npz    : 259.1 MB

  run 1:  nibabel 1.41s  |  SimpleITK 7.53s  |  npz 6.43s
  run 2:  nibabel 1.42s  |  SimpleITK 6.16s  |  npz 7.49s
  run 3:  nibabel 1.46s  |  SimpleITK 8.61s  |  npz 7.71s

  nibabel   1.43s ± 0.02s
  SimpleITK 7.43s ± 1.00s
  npz       7.21s ± 0.56s
  SimpleITK : 5.2x slower than nibabel
  npz:        5.0x slower than nibabel

  nibabel == npz       : ✅
  nibabel == SimpleITK : ✅

If I repeat it with a real volume taken from my dataset (Merlin), the output is:

numpy 2.4.3 | nibabel 5.4.0 | SimpleITK 2.5.3

Real file: ...my path...

Shape of 3D Volume: (512, 512, 420)
  .nii.gz : 129.1 MB
  .npz    : 126.0 MB

  run 1:  nibabel 1.33s  |  SimpleITK 21.45s  |  npz 17.52s
  run 2:  nibabel 1.34s  |  SimpleITK 8.98s  |  npz 10.20s
  run 3:  nibabel 1.38s  |  SimpleITK 17.17s  |  npz 16.48s

  nibabel   1.35s ± 0.02s
  SimpleITK 15.87s ± 5.17s
  npz       14.73s ± 3.24s
  SimpleITK : 11.7x slower than nibabel
  npz:        10.9x slower than nibabel

  nibabel == npz       : ✅
  nibabel == SimpleITK : ✅

The ~50x gap I originally observed was probably due to an abnormally large volume. For the run above, I picked a .nii.gz file at random today.
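The SimpleITK and npz timings above vary a lot between runs (8.98s to 21.45s for SimpleITK on the real file), so a mean over three runs is sensitive to a single slow run. As a rough sketch, a timing helper that adds a warm-up call and reports the minimum and median could look like the following. `time_loader` is a hypothetical helper, not part of the benchmark above, and the warm-up is only assumed to help with things like OS file caching.

```python
import statistics
import time


def time_loader(load_fn, n_runs=5, warmup=1):
    """Time a zero-argument callable; hypothetical helper for illustration."""
    # Untimed warm-up calls (assumption: this reduces first-run effects)
    for _ in range(warmup):
        load_fn()

    samples = []
    for _ in range(n_runs):
        t0 = time.perf_counter()
        load_fn()
        samples.append(time.perf_counter() - t0)

    return {
        "min": min(samples),
        "median": statistics.median(samples),
        "max": max(samples),
        "samples": samples,
    }


if __name__ == "__main__":
    # Stand-in workload; replace the lambda with a real loader call,
    # e.g. lambda: np.asarray(nib.load(path).dataobj)
    stats = time_loader(lambda: sum(range(100_000)), n_runs=5)
    print(f"min {stats['min']:.4f}s | median {stats['median']:.4f}s | max {stats['max']:.4f}s")
```

Comparing medians (or minimums) instead of means would make the "x times slower" ratios less dependent on one outlier run.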

Reading your codebase, I realized that you use SimpleITK as your default .nii.gz reader (SimpleITKIO comes before NibabelIOWithReorient in LIST_OF_IO_CLASSES, in the file reader_writer_registry.py).
Given the benchmark results, I suggest changing LIST_OF_IO_CLASSES from:

LIST_OF_IO_CLASSES = [
    NaturalImage2DIO,
    SimpleITKIO,
    Tiff3DIO,
    NibabelIO,
    NibabelIOWithReorient
]

to:

LIST_OF_IO_CLASSES = [
    NaturalImage2DIO,
    NibabelIOWithReorient,
    SimpleITKIO,
    Tiff3DIO,
    NibabelIO
]
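To illustrate why the order matters, here is a minimal sketch that assumes the registry returns the first class in the list whose supported file endings match the file name. The classes, the `supported_file_endings` attribute and `pick_reader` are hypothetical stand-ins for illustration, not the project's actual code.

```python
# Hypothetical stand-ins; the real IO classes and their attributes may differ.
class FakeSimpleITKIO:
    supported_file_endings = (".nii.gz", ".nrrd", ".mha")


class FakeNibabelIOWithReorient:
    supported_file_endings = (".nii.gz", ".nii")


def pick_reader(filename, io_classes):
    """Return the first class whose endings match the file name (assumed rule)."""
    lowered = filename.lower()
    for cls in io_classes:
        # str.endswith accepts a tuple of candidate suffixes
        if lowered.endswith(cls.supported_file_endings):
            return cls
    raise RuntimeError(f"no reader found for {filename}")


current_order = [FakeSimpleITKIO, FakeNibabelIOWithReorient]
proposed_order = [FakeNibabelIOWithReorient, FakeSimpleITKIO]

print(pick_reader("case_0001.nii.gz", current_order).__name__)   # FakeSimpleITKIO
print(pick_reader("case_0001.nii.gz", proposed_order).__name__)  # FakeNibabelIOWithReorient
print(pick_reader("case_0001.nrrd", proposed_order).__name__)    # FakeSimpleITKIO
```

Under that assumption, moving the nibabel-based class up only changes which reader handles .nii.gz files; formats that only SimpleITK supports would still fall through to it.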

I would leave NibabelIO last because of the axis rearrangement needed to work with the sitk preprocessing you do later.
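As a small illustration of that axis rearrangement: the benchmark compares nibabel's array with SimpleITK's only after a full transpose, because the array returned by SimpleITK is indexed (z, y, x). The sketch below uses plain numpy arrays as stand-ins for the two layouts, with no image library involved.

```python
import numpy as np

# Stand-in for an array indexed (x, y, z)
vol_xyz = np.arange(2 * 3 * 4).reshape(2, 3, 4)

# Reversing the axis order gives the (z, y, x) indexing; this is a view, not a copy
vol_zyx = vol_xyz.transpose()

print(vol_xyz.shape)  # (2, 3, 4)
print(vol_zyx.shape)  # (4, 3, 2)

# The same voxel is addressed with reversed indices in the two layouts
i, j, k = 1, 2, 3
assert vol_xyz[i, j, k] == vol_zyx[k, j, i]

# Transposing again recovers the original layout
assert np.array_equal(vol_zyx.transpose(), vol_xyz)
```

Any reader that returns (x, y, z) arrays would need this kind of reordering before being passed to code that expects the (z, y, x) convention.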

I hope this can reduce the CPU load when preprocessing large datasets made up of large images.
