Simple chunked pedigree kinship #1282

timothymillar · 2025-01-07T01:43:05Z

Fixes Chunked pedigree kinship #1280
Fixes MemoryError: Allocation failed (probably too large) with ped.compute() #1213
Tests added
User visible changes (including notable bug fixes) are documented in changelog.rst

I've tested this with a real pedigree of ~55,000 individuals on a 4-core laptop in WSL2. I can calculate the pedigree matrix using chunks of 5000 samples and save the chunked matrix to a Zarr store in under 25s in total, using ~2.5GB of RAM. The full matrix would be ~22.5GB, which exceeds the memory of this machine.
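As a rough sanity check on the sizes quoted above, the arithmetic can be sketched as follows. This assumes the kinship values are stored as 8-byte floats (float64) and that "GB" is read as GiB; neither is stated in the PR, and the helper name `dense_matrix_bytes` is purely illustrative.

```python
import math

GIB = 2**30  # bytes per GiB


def dense_matrix_bytes(n_rows, n_cols, itemsize=8):
    """Bytes needed for a dense matrix; itemsize=8 assumes float64 entries."""
    return n_rows * n_cols * itemsize


n_samples = 55_000  # approximate pedigree size quoted in the PR
chunk_size = 5_000  # chunk length quoted in the PR

full_bytes = dense_matrix_bytes(n_samples, n_samples)
chunk_bytes = dense_matrix_bytes(chunk_size, chunk_size)

# Number of square chunks needed to tile the full matrix.
chunks_per_axis = math.ceil(n_samples / chunk_size)
n_chunks = chunks_per_axis**2

print(f"full matrix: {full_bytes / GIB:.1f} GiB")
print(f"one chunk:   {chunk_bytes / GIB:.2f} GiB")
print(f"chunks:      {n_chunks}")
```

Under those assumptions the full matrix comes to about 22.5 GiB, while a single 5000 x 5000 chunk is under 0.2 GiB, which is consistent with the computation fitting in a few GB of RAM.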

I've used the @jitclass experimental feature from Numba for a simple triangular matrix class. Using a triangular matrix halves the RAM needed for the intermediate matrices. It's not strictly necessary to use @jitclass for this but it allows for greater code reuse via custom __setitem__/__getitem__. If this is an issue it could be reworked to avoid @jitclass.
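For readers unfamiliar with the idea, here is a minimal plain-Python sketch of a triangular matrix with custom `__getitem__`/`__setitem__`. It is not the class added in this PR (which is compiled with Numba's `@jitclass`); the class name, the row-major packing, and the lack of bounds checking are all assumptions made for illustration.

```python
from array import array


class TriangularMatrix:
    """Symmetric n x n matrix storing only the lower triangle and diagonal.

    Illustrative only: entries are packed row by row, so element (i, j)
    with i >= j lives at offset i * (i + 1) // 2 + j.
    """

    def __init__(self, n):
        self.n = n
        # n * (n + 1) / 2 doubles instead of n * n: roughly half the memory.
        self.data = array("d", [0.0]) * (n * (n + 1) // 2)

    def _offset(self, i, j):
        # Swap indices so that lookups above the diagonal hit the same cell.
        if i < j:
            i, j = j, i
        return i * (i + 1) // 2 + j

    def __getitem__(self, key):
        i, j = key
        return self.data[self._offset(i, j)]

    def __setitem__(self, key, value):
        i, j = key
        self.data[self._offset(i, j)] = value


m = TriangularMatrix(4)
m[1, 3] = 0.25        # stored once, in the lower triangle
assert m[3, 1] == 0.25  # visible through either index order
```

Because indexing goes through `__getitem__`/`__setitem__`, code written against a square matrix can use `m[i, j]` unchanged, which is the kind of reuse the description above refers to.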

I've also removed the test runs with NUMBA_DISABLE_JIT: 1 because this PR introduces a dependency on @guvectorize and @jitclass in pedigree.py.

jeromekelleher

LGTM

timothymillar · 2025-01-12T19:48:02Z

Thanks @jeromekelleher, I assume we're not worried about the Cubed and Zarr 3 test runs failing for now?

jeromekelleher · 2025-01-13T09:39:34Z

I didn't see those @tomwhite, thoughts here?

tomwhite · 2025-01-13T09:46:25Z

> I didn't see those @tomwhite, thoughts here?

They are not related to this PR, so OK to merge this if it's ready. I'll be looking at the Zarr 3 changes today.

jeromekelleher · 2025-01-13T09:48:03Z

Happy to merge when you are @timothymillar

timothymillar added 3 commits January 7, 2025 14:06

Allow chunking in pedigree_kinship sgkit-dev#1280

3320e12

Remove CI tests of pedigree functions without JIT

4e10560

Update changelog

dfacca8

jeromekelleher approved these changes Jan 9, 2025

timothymillar added the auto-merge label Jan 13, 2025

mergify bot merged commit 5b96476 into sgkit-dev:main Jan 13, 2025
10 of 12 checks passed
