Multivariable covariant parameter space priors #135

Merged
kayhangultekin merged 8 commits into dev from kg_dev
Mar 20, 2026

@kayhangultekin
Collaborator

@kayhangultekin kayhangultekin commented Mar 6, 2026

Pull Request: Multivariate Priors & Covariant GSMF Implementation

Description

Note: The code in this PR and the text here were created with the assistance of AI: Gemini 3 Flash and Gemini 3.1 Pro (Low).

This PR introduces support for multivariate normal prior distributions and implements the Covariant Galaxy Stellar Mass Function (GSMF) using data from Leja+2020.

The core of this update involves extending our parameter space sampling logic to seamlessly handle a mix of univariate and multivariate distributions without breaking the underlying Latin Hypercube continuous sampling.
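
As a rough illustration of the mixed-dimension sampling described above, the sketch below hands each distribution the block of uniform columns it needs. The names `Uniform1D`, `Affine2D`, and `transform_samples` are hypothetical stand-ins, not the classes or methods in holodeck, and the uniform input is a constant array rather than a real Latin Hypercube draw.

```python
import numpy as np


class Uniform1D:
    """Hypothetical 1D distribution: maps u in [0, 1] onto [lo, hi]."""

    ndim = 1

    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi

    def __call__(self, u):
        # u has shape (nsamples, 1)
        return self.lo + (self.hi - self.lo) * u


class Affine2D:
    """Hypothetical 2D stand-in for a multivariate distribution."""

    ndim = 2

    def __init__(self, offset, scale):
        self.offset = np.asarray(offset, dtype=float)
        self.scale = np.asarray(scale, dtype=float)

    def __call__(self, u):
        # u has shape (nsamples, 2); a real class would apply its own mapping here
        return self.offset + self.scale * u


def transform_samples(uniform, dists):
    """Give each distribution its own slice of columns, in order."""
    uniform = np.atleast_2d(np.asarray(uniform, dtype=float))
    needed = sum(d.ndim for d in dists)
    if uniform.shape[1] != needed:
        raise ValueError(f"expected {needed} columns, got {uniform.shape[1]}")
    out, start = [], 0
    for dist in dists:
        stop = start + dist.ndim
        out.append(dist(uniform[:, start:stop]))
        start = stop
    return np.hstack(out)


# Three distributions, but four uniform columns: 1 + 2 + 1
dists = [Uniform1D(0.0, 10.0), Affine2D([1.0, 2.0], [2.0, 4.0]), Uniform1D(-1.0, 1.0)]
samples = transform_samples(np.full((3, 4), 0.5), dists)
```

The point of the sketch is only that the number of uniform columns is the sum of each distribution's dimensionality, so the hypercube itself does not need to know which columns are coupled.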

Todos

Notable points that this PR has either accomplished or will accomplish.

  • Add multivariate normal prior support
  • Add covariance matrix support to parameter spaces

Status

  • Ready to go except that the covariance matrix values are based on approximations to Leja+2020 data. These need to be updated with the real data when they become available!! This can be done with the covariant-double-schechter.ipynb notebook.

Key Features & New Classes

  • PD_MVNormal (New Class): Added to lib_tools.py. This class implements a multivariate normal (Gaussian) parameter distribution. It takes a list of parameter names, a means vector, and a covariance matrix. It automatically checks the covariance matrix for positive definiteness and uses its Cholesky decomposition to map uniform samples to the multivariate normal space.
  • Covariant GSMF Parameter Spaces: Added hardcoded values for GSMF_COV_NAMES, GSMF_COV_MEANS, and GSMF_COV_MATRIX to param_spaces.py. (The values in GSMF_COV_MATRIX are based on approximations to Leja+2020 data. These need to be updated with the real data when they become available!! This can be done with the covariant-double-schechter.ipynb notebook.)
  • PS_Astro_Strong_Covariant_GSMF & PS_Astro_Strong_Covariant_All: New parameter spaces that utilize the PD_MVNormal class to sample the GSMF parameters covariantly.
  • PS_Test_Astro_Strong_Covariant_MMBulge: A new test parameter space demonstrating how to couple specific parameters (like mmb_mamp_log10 and mmb_plaw) using a covariance matrix while keeping others (like mmb_scatter_dex) independent.
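
The Cholesky-based mapping mentioned for PD_MVNormal can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in lib_tools.py: the function name `mvnormal_from_uniform` is hypothetical, the means and covariance are made-up numbers, and the inverse normal CDF comes from Python's `statistics.NormalDist` to keep the example dependency-light.

```python
import numpy as np
from statistics import NormalDist

# Element-wise inverse CDF of the standard normal; inputs must lie strictly in (0, 1)
_inv_cdf = np.vectorize(NormalDist().inv_cdf)


def mvnormal_from_uniform(uniform, means, cov):
    """Map uniform samples of shape (n, d) to correlated normal samples (hypothetical helper)."""
    uniform = np.atleast_2d(np.asarray(uniform, dtype=float))
    means = np.asarray(means, dtype=float)
    cov = np.asarray(cov, dtype=float)
    # Raises numpy.linalg.LinAlgError if cov is not positive definite,
    # so the factorization doubles as the validity check
    chol = np.linalg.cholesky(cov)  # lower-triangular L with L @ L.T == cov
    z = _inv_cdf(uniform)  # independent standard normals, shape (n, d)
    return means + z @ chol.T  # each row is mean + L @ z_row


# Illustrative values only (not the Leja+2020 numbers)
means = [1.0, -2.0]
cov = [[1.0, 0.6], [0.6, 2.0]]
rng = np.random.default_rng(0)
draws = mvnormal_from_uniform(rng.uniform(size=(20000, 2)), means, cov)
```

Because each uniform column is transformed independently before the linear mixing step, the stratification of a Latin Hypercube in the uniform space carries through to the standard-normal coordinates, which is presumably why the approach fits the existing sampler.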

Architectural Changes & Helper Functions

  • Sample Transformation (_transform_samples): Completely rewrote the sample transformation logic in the _Param_Space base class. Previously, it assumed a 1:1 mapping between uniform samples and parameter distributions. It now dynamically tracks and slices the correct number of input dimensions required by each _Param_Dist object (whether 1D or ND).
  • Extrema Property Update: Updated the bounds evaluation (extrema property) to correctly handle 2D stacking of boundaries for multivariate distributions.
  • repair_covariance(m): Added a new helper function to holodeck/utils.py. This function finds the nearest positive semi-definite matrix by applying eigenvalue decomposition and clipping negative eigenvalues. This is critical for preventing Cholesky decomposition failures when dealing with manually estimated (or slightly numerically unstable) covariance matrices.
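
For readers unfamiliar with the eigenvalue-clipping approach described for repair_covariance(m), here is a minimal sketch. It is not the implementation in holodeck/utils.py: the name `repair_covariance_sketch` and the `eps` floor are assumptions added for illustration. Clipping at exactly zero gives a positive semi-definite matrix, while Cholesky needs strict positive definiteness, so the sketch allows a small positive floor.

```python
import numpy as np


def repair_covariance_sketch(m, eps=0.0):
    """Clip negative eigenvalues of the symmetrized matrix (hypothetical helper)."""
    m = np.asarray(m, dtype=float)
    sym = 0.5 * (m + m.T)  # enforce symmetry before decomposing
    vals, vecs = np.linalg.eigh(sym)  # real eigenvalues, orthonormal eigenvectors
    vals = np.clip(vals, eps, None)  # raise anything below eps up to eps
    return (vecs * vals) @ vecs.T  # rebuild V @ diag(vals) @ V.T


# A symmetric matrix with eigenvalues 3 and -1, so not a valid covariance
bad = np.array([[1.0, 2.0], [2.0, 1.0]])
fixed = repair_covariance_sketch(bad, eps=1e-6)
chol = np.linalg.cholesky(fixed)  # succeeds after the repair
```

A matrix that is already positive definite passes through essentially unchanged, so applying the repair unconditionally is a reasonable design choice for hand-estimated matrices.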

Testing & Notebooks

  • New Notebook (covariant-double-schechter.ipynb): Added a dedicated notebook in notebooks/devs/sams/ to demonstrate the synthetic dataset generation from the multivariate normal distribution and test the covariance repairing logic.
  • Notebook Updates: Refreshed execution states and plots in double-schechter.ipynb, librarian.ipynb, and semi-analytic-models.ipynb to ensure compatibility with the updated parameter space logic.
  • Local Repo Organization: Added scratch/ directory to .gitignore to keep temporary testing/development scripts untracked.

Kayhan Gultekin added 6 commits December 15, 2025 15:32
…ses _Param_Space and _Param_Dist so that they can tell if they are using a multi-parameter distribution or if they are using a legacy-style parameter distribution. I created a single new multiparameter distribution, a multivariable Gaussian PD_MVNormal. You give it the means and a covariance matrix, and when doing Latin Hypercube Sampling, it will sample appropriately. I added a new test _Param_Space subclass PS_Test_Astro_Strong_Covariant_MMBulge. I modified librarian.ipynb to showcase the new parameter space and the joint distribution.
double Schechter function parameters based on
Leja+2020. Added a notebook to generate synthetic
data with the covariances estimated from the corner
plot in the paper. Also added a function to repair the
covariance matrix if it's not positive definite. This is
a work in progress and will need to be updated with actual
covariances from the paper or samples.
…t I will remove before committing upstream.
- Removed several test scripts from git index and moved into a new root 'scratch/' directory.
- Ignored 'scratch/' in .gitignore.
- Renamed 'new-covariant_double-schecter.ipynb' to 'covariant_double-schecter.ipynb' and tracked the latter.
@kayhangultekin kayhangultekin added the enhancement New feature or request label Mar 6, 2026
@kayhangultekin kayhangultekin added this to the NG20 Ready milestone Mar 6, 2026
@kayhangultekin
Collaborator Author

Totally forgot to say that this addresses issue #132

@CayenneMatt
Collaborator

The additions seem to be functioning properly except for two minor compatibility issues that cause some unit tests to fail.

The test _check_ps_basic from holodeck/librarian/tests/test_lib_tools__param_space.py fails the check assert len(params) == nparams. This is because the test was not designed for the distribution classes to have multiple parameters. (line 282 of Run tests and generate coverage report)

The default method of _Param_Dist in holodeck/librarian/lib_tools.py returns self(0.5), which causes the check if xx.ndim != 2: on line 565 to fail. This occurs when running test_all_param_spaces in holodeck/librarian/tests/test_param_spaces.py. Details are on and around line 349 of Run tests and generate coverage report.

Finally, the backslashes on lines 552, 553, 566, and 574 of holodeck/librarian/lib_tools.py cause the associated error messages to print the strings literally without filling in the assigned variables.

@kayhangultekin
Collaborator Author

@CayenneMatt Good notes. I will update the tests so that they properly check to see if the distribution is multidimensional and check for the appropriate number of parameters. I'll also deal with the backslashes properly.

Details:
- Replaced independent parameters with empirical covariance modeling using PD_MVNormal in the Double-Schechter GSMF implementation.
- Fixed a bug where unclipped distributions returned NaN limits; PD_MVNormal now natively handles bounds correctly.
- Updated normalized_params and loop tests in test_lib_tools__param_space.py to correctly index distributions with multiple parameters.
- Added a demonstration/tutorial block for PD_MVNormal in librarian.ipynb.

Co-authored-with: Google DeepMind Antigravity (Gemini AI)
@kayhangultekin
Collaborator Author

PR Addendum: Support for Multivariate Distributions & Test Fixes

This addendum summarizes the additional work completed to resolve issues raised regarding unit test failures for multivariate parameter distributions, as well as several robustness improvements and documentation updates discovered during the implementation.

🚨 I think this is now RTG for merge into dev.

1. Resolution of Unit Test Failures (Addressing @CayenneMatt's comment)

  • Normalized Parameter Mapping: Fixed a core assumption in lib_tools.py where normalized_params assumed a 1-to-1 mapping between scalar indices and distribution objects. It now uses a param_to_dist mapping to correctly handle multivariate distributions (like PD_MVNormal) that correspond to multiple scalar parameters.
  • Test Suite Updates:
    • Updated test_lib_tools__param_space.py: Refactored _check_ps_basics to correctly iterate using the distribution mapping rather than raw scalar counts.
    • Fixed handling of multivariate defaults in tests to ensure sliced vector inputs are passed correctly to the underlying distributions.
  • Validation: Verified all tests pass in holodeck/librarian/tests/ (including test_lib_tools__param_space.py and test_param_spaces.py).
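
The idea behind a parameter-to-distribution mapping can be sketched as below. This is only an illustration of the concept: `DistStub` and `build_param_to_dist` are hypothetical names, and the actual structure of param_to_dist in lib_tools.py may differ.

```python
from dataclasses import dataclass


@dataclass
class DistStub:
    """Hypothetical distribution stub that only knows its parameter names."""

    names: list


def build_param_to_dist(dists):
    """For every scalar parameter, record (name, distribution index, index within it)."""
    mapping = []
    for dist_idx, dist in enumerate(dists):
        for local_idx, name in enumerate(dist.names):
            mapping.append((name, dist_idx, local_idx))
    return mapping


# One coupled 2D distribution plus one independent 1D distribution
dists = [DistStub(["mmb_mamp_log10", "mmb_plaw"]), DistStub(["mmb_scatter_dex"])]
param_to_dist = build_param_to_dist(dists)

nparams = len(param_to_dist)  # counts scalar parameters, not distribution objects
```

With a lookup like this, a test that compares parameter counts can use the number of mapping entries instead of the number of distribution objects, which is the mismatch behind the failing len(params) == nparams assertion.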

2. Implementation of Covariant GSMF (Leja 2020)

  • Implemented PS_Astro_Strong_Covariant_GSMF in param_spaces.py.
  • This uses the new PD_MVNormal class to incorporate the empirical covariance matrix and means from Leja 2020 for the Double-Schechter GSMF anchor points, replacing previously independent parameter estimates with a physically correlated model.

3. Bug Fix: NaN Extrema in PD_MVNormal

  • Identified and fixed a bug where unclipped multivariate normal distributions returned [nan, nan] limits. This was caused by the base class performing mathematical inf - inf operations during limit propagation.
  • Solution: Added a specific extrema override to PD_MVNormal that natively returns [-np.inf, np.inf] when no clipping is provided, ensuring correct string representation and boundary logic in the _Param_Space class.
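
The failure mode is easy to reproduce in isolation. The sketch below assumes a hypothetical propagation step that reconstructs limits from a width; the real base-class logic is not shown in this PR text, so both function names and the arithmetic are illustrative only.

```python
import numpy as np


def extrema_naive(clip=None):
    """Hypothetical limit propagation that breaks for unbounded distributions."""
    lo, hi = (-np.inf, np.inf) if clip is None else clip
    width = hi - lo  # inf when nothing is clipped
    # inf - inf and -inf + inf are both nan under IEEE 754 arithmetic
    return np.array([hi - width, lo + width])


def extrema_guarded(clip=None):
    """Return the unbounded limits directly instead of computing them."""
    if clip is None:
        return np.array([-np.inf, np.inf])
    return np.asarray(clip, dtype=float)


unclipped_naive = extrema_naive()
unclipped_guarded = extrema_guarded()
```

Short-circuiting the unclipped case avoids the arithmetic entirely, which is the same strategy as the override described above.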

4. Documentation & Tutorial Updates

  • Librarian Notebook: Added a new tutorial section to notebooks/librarian.ipynb specifically for PD_MVNormal. It demonstrates how to initialize the distribution with a covariance matrix and visualize the resulting coupled parameter samples.

Note: This work was completed with the assistance of AI (Google DeepMind Antigravity/Gemini).

Collaborator

@CayenneMatt CayenneMatt left a comment

This passes the relevant unit tests and a couple tests I ran locally, seems to be working!

@kayhangultekin kayhangultekin linked an issue Mar 20, 2026 that may be closed by this pull request
@kayhangultekin kayhangultekin merged commit acb07f1 into dev Mar 20, 2026
8 of 17 checks passed

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feat(JointPriorDistributions): Enable joint prior distributions

2 participants