Skip to content

Conversation

@jhlegarreta
Copy link
Contributor

@jhlegarreta jhlegarreta commented Nov 17, 2025

Refactor gradient data checks in DWI data class: PR #325 introduced the row-major convention for gradient data in NiFreeze. However, gradient loading was not tested thoroughly. This resulted in some execution flows that would not guarantee a row-major internal convention or would crash under some circumstances.

This commit refactors the gradient data checks and adds thorough testing:

  • Define all error or warning messages as global variables so that they can be checked exactly in tests.
  • Ensure that gradients conform to the row-major convention when instantiating the DWI class directly. This allows to separate the gradient reformatting from the dimensionality check with the DWI volume sequence. This simplifies the flow, as the gradient reformatting to the row-major convention does not depend on the number of volumes in the DWI sequence. Also, this makes the flow more consistent with the refactored checks of the NIfTI file-based loading utility function (from_nii).
  • Ensure that the gradients conform to the row-major convention immediately after loading the gradients file in the NIfTI file-based loading utility function (from_nii). As opposed to the previous implementation, this allows to load the gradients from a file where data follows either column-major or row-major convention. e.g. In the previous implementation the if grad.shape[1] < 2: was making an assumption about the layout and/or one that was wrong because we require 4 columns (direction (x,y,z) + b-value) or rows (if in column-major). The new implementation simplifies the execution flow.

@jhlegarreta jhlegarreta force-pushed the ref/refactor-dwi-gradient-checks branch 3 times, most recently from e406487 to a378f89 Compare November 17, 2025 00:15
@codecov
Copy link

codecov bot commented Nov 17, 2025

Codecov Report

❌ Patch coverage is 94.00000% with 3 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (main@3bca8be). Learn more about missing BASE report.

Files with missing lines Patch % Lines
src/nifreeze/data/dmri.py 94.00% 2 Missing and 1 partial ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##             main     #327   +/-   ##
=======================================
  Coverage        ?   78.85%           
=======================================
  Files           ?       28           
  Lines           ?     1887           
  Branches        ?      199           
=======================================
  Hits            ?     1488           
  Misses          ?      339           
  Partials        ?       60           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@jhlegarreta jhlegarreta force-pushed the ref/refactor-dwi-gradient-checks branch from a378f89 to 6cd6e78 Compare November 17, 2025 00:32
@jhlegarreta jhlegarreta requested a review from oesteban November 17, 2025 00:52
@jhlegarreta
Copy link
Contributor Author

jhlegarreta commented Nov 17, 2025

A few notes:

  • if self.dataobj is not None:
    should not be necessary/dataobj would not be allowed to be None after PR ENH: Validate data objects' attributes at instantiation #305 gets merged.
  • This line is never hit
    bvecs = bvecs[np.newaxis, :]

    I'd assume that this could be [x1 y1 z1 x2 y2 z2 ... xN yN zN] (and its .T counterpart) gradient file. I'd say that reshaping that to an Nx3 that array was not implemented in the previous implementation, and I have not implemented it. Same goes if we additionally have the b-value: [x1 y1 z1 b1 x2 y2 z2 b2 ... xN yN zN bN] . Would leave that for a separate PR.
  • _normalize_gradients: it looks a misleading name now, as it does not normalize gradients. By the way, we should normalize gradients as well. That function looks the appropriate place, but as it performs other reformatting operations, we should come up with a more descriptive/comprehensive name. I've written the test for when we will tackle the gradient normalization: https://gist.github.com/jhlegarreta/9ffab029e96459846eaea27936e47c33
  • The use of the b0s surfaces here again: when instantiating a DWI object directly (e.g. without relying on the utility function from_nii), some user may very well do
   dwi = DWI(dataobj=dwi_dataobj, bzero=b0_dataobj, gradients=gradients)
   assert np.allclose(gradients, dwi.gradients)    

where gradients contain the b0s, these do not get filtered/masked when setting the gradients attribute, and the above statement does not result in a failure: the null gradients simply do not get masked (for example, the constructor does not use/accept b0_thres). The assertion would work. However, when using the from_nii utility function to build the DWI instance, we would do, e.g.:

    dwi = from_nii(dwi_fname, gradients_file=grads_fname)
    assert np.allclose(gradients[gradmask], dwi.gradients)

In the direct instantiation case, dwi.gradients would be e.g. 12x4, which would not agree with what we would get when using the from_nii function, i.e. 10x4 (bzeros are removed).

The above is an inconsistency that needs to be addressed: if we input the same the same data (whether as files or as unprocessed arrays), we should get two instances that hold exactly the same information. I know that for the instantiation case, we should not be providing the b0s in the gradients object, but we cannot guarantee that it will not happen (as it is also the most natural/common thing outside our particular context), even if we document the expected behavior very well. Here is a testing function for that: https://gist.github.com/jhlegarreta/1a3fbbfab362a2f2dbd28a8c53c54c08

Although _normalize_gradients looks an appropriate candidate to remove the b0s, in the from_nii function it is called
https://github.com/nipreps/nifreeze/pull/327/files#diff-774e30bff24e702cda5516cbc6bbbcae1af21b710e2b0ad22cc15741fbed9afeR430
once the filtering has taken place
https://github.com/nipreps/nifreeze/pull/327/files#diff-774e30bff24e702cda5516cbc6bbbcae1af21b710e2b0ad22cc15741fbed9afeR439

which would mean that we may attempt to do the same job twice. Whatever approach we take to solve this, that should be addressed in a separate PR.

@jhlegarreta jhlegarreta force-pushed the ref/refactor-dwi-gradient-checks branch 2 times, most recently from eecba2a to c250888 Compare November 17, 2025 23:59
@jhlegarreta
Copy link
Contributor Author

If the changes in this PR are uncontroversial, and not introducing hidden bugs, can we please merge it? The further refactoring to complete #327 (comment) become necessary in PR #305 (it can already be seen what these would look like).

@oesteban
Copy link
Member

If the changes in this PR are uncontroversial, and not introducing hidden bugs, can we please merge it? The further refactoring to complete #327 (comment) become necessary in PR #305 (it can already be seen what these would look like).

I'm working on this one 👍, please hold on.

oesteban added a commit to oesteban/nifreeze that referenced this pull request Nov 19, 2025
This patch
- moves the responsibility of formatting and validating the
  gradients into the attrs' infrastructure;
- removes complexity from ``from_nifti()`` that was completely
  unnecessary (e.g., motion file can be set after creating the DWI
  object).
- updates tests
- skips one test that is unreasonably slow, we need to look into what's
  going on with ``to_nifti()``.
jhlegarreta added a commit to jhlegarreta/nifreeze that referenced this pull request Nov 20, 2025
jhlegarreta pushed a commit to jhlegarreta/nifreeze that referenced this pull request Nov 20, 2025
This patch
- moves the responsibility of formatting and validating the
  gradients into the attrs' infrastructure;
- removes complexity from ``from_nifti()`` that was completely
  unnecessary (e.g., motion file can be set after creating the DWI
  object).
- updates tests
- skips one test that is unreasonably slow, we need to look into what's
  going on with ``to_nifti()``.
@jhlegarreta jhlegarreta force-pushed the ref/refactor-dwi-gradient-checks branch from ee02d74 to 940ed08 Compare November 20, 2025 14:00
jhlegarreta pushed a commit to jhlegarreta/nifreeze that referenced this pull request Nov 21, 2025
This patch
- moves the responsibility of formatting and validating the
  gradients into the attrs' infrastructure;
- removes complexity from ``from_nifti()`` that was completely
  unnecessary (e.g., motion file can be set after creating the DWI
  object).
- updates tests
- skips one test that is unreasonably slow, we need to look into what's
  going on with ``to_nifti()``.
@jhlegarreta jhlegarreta force-pushed the ref/refactor-dwi-gradient-checks branch from 940ed08 to 2a57a73 Compare November 21, 2025 00:25
jhlegarreta pushed a commit to jhlegarreta/nifreeze that referenced this pull request Nov 21, 2025
This patch
- moves the responsibility of formatting and validating the
  gradients into the attrs' infrastructure;
- removes complexity from ``from_nifti()`` that was completely
  unnecessary (e.g., motion file can be set after creating the DWI
  object).
- updates tests
- skips one test that is unreasonably slow, we need to look into what's
  going on with ``to_nifti()``.
@jhlegarreta jhlegarreta force-pushed the ref/refactor-dwi-gradient-checks branch from 2a57a73 to 3686a5e Compare November 21, 2025 00:30
@jhlegarreta
Copy link
Contributor Author

Leaving this error to you @oesteban 😔. Thanks.

jhlegarreta pushed a commit to jhlegarreta/nifreeze that referenced this pull request Nov 21, 2025
This patch
- moves the responsibility of formatting and validating the
  gradients into the attrs' infrastructure;
- removes complexity from ``from_nifti()`` that was completely
  unnecessary (e.g., motion file can be set after creating the DWI
  object).
- updates tests
- skips one test that is unreasonably slow, we need to look into what's
  going on with ``to_nifti()``.
@jhlegarreta jhlegarreta force-pushed the ref/refactor-dwi-gradient-checks branch from 14d50ac to cfff5bb Compare November 21, 2025 13:12
jhlegarreta pushed a commit to jhlegarreta/nifreeze that referenced this pull request Nov 22, 2025
This patch
- moves the responsibility of formatting and validating the
  gradients into the attrs' infrastructure;
- removes complexity from ``from_nifti()`` that was completely
  unnecessary (e.g., motion file can be set after creating the DWI
  object).
- updates tests
- skips one test that is unreasonably slow, we need to look into what's
  going on with ``to_nifti()``.
@jhlegarreta jhlegarreta force-pushed the ref/refactor-dwi-gradient-checks branch from cfff5bb to c782ac6 Compare November 22, 2025 01:08
@jhlegarreta jhlegarreta force-pushed the ref/refactor-dwi-gradient-checks branch 2 times, most recently from 6ab0516 to f1ebf08 Compare November 22, 2025 01:26
@jhlegarreta jhlegarreta force-pushed the ref/refactor-dwi-gradient-checks branch from f1ebf08 to da1e127 Compare November 22, 2025 01:27
@jhlegarreta
Copy link
Contributor Author

Re: #332 (comment) So, the relevant changes from PR #305 are brought here. PR #305 can be safely closed.

@jhlegarreta jhlegarreta force-pushed the ref/refactor-dwi-gradient-checks branch from 03dd2c6 to de9c64d Compare November 22, 2025 15:36
jhlegarreta and others added 6 commits November 22, 2025 10:45
Refactor gradient data checks in DWI data class: PR
nipreps#325 introduced the row-major
convention for gradient data in NiFreeze. However, gradient loading was
not tested thoroughly. This resulted in some execution flows that would
not guarantee a row-major internal convention or would crash under some
circumstances.

This commit refactors the gradient data checks and adds thorough
testing:
- Define all error or warning messages as global variables so that they
  can be checked exactly in tests.
- Ensure that gradients conform to the row-major convention when
  instantiating the DWI class directly. This allows to separate the
  gradient reformatting from the dimensionality check with the DWI
  volume sequence. This simplifies the flow, as the gradient
  reformatting to the row-major convention does not depend on the number
  of volumes in the DWI sequence. Also, this makes the flow more
  consistent with the refactored checks of the NIfTI file-based loading
  utility function (`from_nii`).
- Ensure that the gradients conform to the row-major convention
  immediately after loading the gradients file in the NIfTI file-based
  loading utility function (`from_nii`). As opposed to the previous
  implementation, this allows to load the gradients from a file where
  data follows either column-major or row-major convention. e.g. In the
  previous implementation the `if grad.shape[1] < 2:` was making an
  assumption about the layout and/or one that was wrong because we
  require 4 columns (direction (x,y,z) + b-value) or rows (if in
  column-major). The new implementation simplifies the execution flow.
This patch
- moves the responsibility of formatting and validating the
  gradients into the attrs' infrastructure;
- removes complexity from ``from_nifti()`` that was completely
  unnecessary (e.g., motion file can be set after creating the DWI
  object).
- updates tests
- skips one test that is unreasonably slow, we need to look into what's
  going on with ``to_nifti()``.
Improve gradient formatting function robustness by adopting a more
defensive approach:
- Raise if gradients are `None`.
- Raise if gradients are not a numeric homogeneous array-like object.

Add the corresponding tests.

Take advantage of the commit to test the `validate_gradients` function.
Add additional object instantiation equality checks: check that objects
intantiated through reading NIfTI files equal objects instantiated
directly.
@jhlegarreta jhlegarreta force-pushed the ref/refactor-dwi-gradient-checks branch from de9c64d to 7933b75 Compare November 22, 2025 15:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Ensure the dataset mandatory attributes are present in data instantiation

2 participants