@bthirion
Collaborator

One step toward #306: populating the high-dimension documentation module.

@bthirion bthirion marked this pull request as draft September 15, 2025 15:56
@codecov

codecov bot commented Sep 15, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 99.28%. Comparing base (d84ea41) to head (433943c).
⚠️ Report is 2 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #425   +/-   ##
=======================================
  Coverage   99.28%   99.28%           
=======================================
  Files          23       23           
  Lines        1253     1253           
=======================================
  Hits         1244     1244           
  Misses          9        9           

@lionelkusch lionelkusch changed the title Very high dim [DOC] Very high dim Oct 1, 2025
@bthirion bthirion marked this pull request as ready for review October 17, 2025 09:52
Collaborator

@jpaillard jpaillard left a comment

Would it make sense to add a figure from an example, like the fMRI or the 2D simulation?


# compute estimated support
selected_cdl = np.logical_or(pval_corr < alpha, one_minus_pval_corr < alpha)
print(f'Clustered Desparsified Lasso selected {np.sum(selected_cdl)} features among {np.sum(beta > 0)} ')
Collaborator

This code is not executed in the rendered doc. Is there a way to run the code snippets?

Collaborator

For the user guide, I don't think that it's a good idea to have code.
In my opinion, it's better to have an example associated with it and make reference to it.
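
For readers of this thread, the support computation in the snippet under review can be reproduced in isolation. This is only a minimal sketch: the array values are made up, `one_minus_pval_corr` is set to `1 - pval_corr` purely for illustration, and the true support is counted with `beta != 0`, which matches `beta > 0` from the snippet when all weights are non-negative.

```python
import numpy as np

# Illustrative values only; in the documentation these arrays come from the
# inference step, not from hand-written numbers.
alpha = 0.05
beta = np.array([1.0, 0.0, 0.0, 2.0, 0.0])  # hypothetical true weights
pval_corr = np.array([0.01, 0.60, 0.99, 0.20, 0.03])
one_minus_pval_corr = 1.0 - pval_corr  # assumption made for this sketch

# A feature is selected when either one-sided corrected p-value is below alpha.
selected = np.logical_or(pval_corr < alpha, one_minus_pval_corr < alpha)

n_selected = int(np.sum(selected))
n_true = int(np.sum(beta != 0))
print(f"Selected {n_selected} features; true support size is {n_true}")
```

Testing both `pval_corr` and `one_minus_pval_corr` is what makes the selection two-sided: a feature can enter the support through a significantly positive or a significantly negative effect.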


As hinted in :footcite:t:`meinshausen2009pvalues` an efficient way to deal with such configuration is to take the per-group average of the features: this leads to a *reduced design*. After inference, all the feature in a given group obtain the p-value of the group representative. When the inference engine is Desparsified Lasso, the resulting method is called Clustered Desparsified lasso, or **CluDL**.

Using the same example as previously, we start by defining a clustering method that will perform the grouping. For image data, Ward clustering is a good default model::
Collaborator

Should we add a bit more detail explaining that the clustering is constrained by each pixel's neighborhood?
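
To make the question concrete, here is a minimal sketch of what a neighborhood constraint looks like with scikit-learn. It assumes a toy 10 x 10 image and random data; the sizes, variable names, and number of clusters are illustrative and not taken from the documentation under review.

```python
import numpy as np
from sklearn.cluster import FeatureAgglomeration
from sklearn.feature_extraction.image import grid_to_graph

# Hypothetical toy data: 20 samples of a 10 x 10 image, flattened to 100 features.
rng = np.random.default_rng(0)
n_samples, shape = 20, (10, 10)
X = rng.standard_normal((n_samples, shape[0] * shape[1]))

# Sparse adjacency matrix linking each pixel to its grid neighbors.
connectivity = grid_to_graph(n_x=shape[0], n_y=shape[1])

# Ward clustering in which only adjacent pixels (or clusters) can be merged.
ward = FeatureAgglomeration(
    n_clusters=10, connectivity=connectivity, linkage="ward"
)
ward.fit(X)

labels = ward.labels_.reshape(shape)  # cluster index of each pixel
X_reduced = ward.transform(X)         # per-cluster feature averages
print(labels.shape, X_reduced.shape)
```

Without the `connectivity` argument, Ward clustering is free to group pixels that are far apart in the image; with it, every cluster is a spatially contiguous patch.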

* The :ref:`Grouping documentation <grouping>`


As hinted in :footcite:t:`meinshausen2009pvalues` an efficient way to deal with such configuration is to take the per-group average of the features: this leads to a *reduced design*. After inference, all the feature in a given group obtain the p-value of the group representative. When the inference engine is Desparsified Lasso, the resulting method is called Clustered Desparsified lasso, or **CluDL**.
Collaborator

Suggested change
- As hinted in :footcite:t:`meinshausen2009pvalues` an efficient way to deal with such configuration is to take the per-group average of the features: this leads to a *reduced design*. After inference, all the feature in a given group obtain the p-value of the group representative. When the inference engine is Desparsified Lasso, the resulting method is called Clustered Desparsified lasso, or **CluDL**.
+ As hinted in :footcite:t:`meinshausen2009pvalues` an efficient way to deal with such configuration is to take the per-group average of the features: this leads to a *reduced design*. After inference, all the feature in a given group obtain the p-value of the group representative. When the inference engine is Desparsified Lasso, the resulting method is called Clustered Desparsified lasso, or :py:func:`CluDL <hidimstat.clustered_inference>`.

I think this links to the API

Collaborator

Other models are mentioned in this section as well; it would be worth adding links for them too, for example Desparsified Lasso and EnCluDL.
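
As context for the paragraph quoted above, the *reduced design* idea can be sketched with plain NumPy. This is an illustration under stated assumptions, not the hidimstat implementation: the data, the cluster labels, and the per-cluster p-values are all made up, and the inference step itself is replaced by placeholder numbers.

```python
import numpy as np

# Hypothetical toy design: 6 samples, 5 features, grouped into 2 clusters.
rng = np.random.default_rng(0)
X = rng.standard_normal((6, 5))
labels = np.array([0, 0, 1, 1, 1])  # cluster index of each feature
n_clusters = labels.max() + 1

# Reduced design: one column per cluster, the mean of its member features.
X_reduced = np.column_stack(
    [X[:, labels == k].mean(axis=1) for k in range(n_clusters)]
)

# Placeholder p-values, one per cluster; a real pipeline would obtain them
# by running the inference engine on X_reduced.
pval_clusters = np.array([0.01, 0.40])

# Each feature inherits the p-value of its cluster.
pval_features = pval_clusters[labels]
print(X_reduced.shape, pval_features)
```

Indexing `pval_clusters` by `labels` is what sends the cluster-level result back to the original feature space, so the output has one p-value per original feature.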

Collaborator

@lionelkusch lionelkusch left a comment

Can you try to stick to the maximum length of 80 characters?

bthirion and others added 6 commits October 21, 2025 10:26
3 participants