
Conversation

@arrjon (Member) commented Sep 30, 2025

This pull request introduces a new gradient-based sampler, the Metropolis-Adjusted Langevin Algorithm (MALA), to the pypesto.sample module, and refactors the covariance regularization in the adaptive Metropolis sampler. The changes also enhance the test suite and documentation to demonstrate and validate the new functionality.
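For context on the method: MALA perturbs the current point with a drift along the gradient of the log posterior plus Gaussian noise, and corrects the resulting asymmetric proposal with a Metropolis-Hastings step. A minimal self-contained sketch of one such step (not the pypesto implementation; `log_post` and `grad_log_post` are placeholder callables):

import numpy as np

def mala_step(x, log_post, grad_log_post, eps, rng):
    """One MALA step with step size eps, targeting exp(log_post)."""
    # Langevin proposal: gradient drift plus Gaussian noise
    mean_fwd = x + 0.5 * eps**2 * grad_log_post(x)
    x_new = mean_fwd + eps * rng.standard_normal(x.shape)

    # log densities (up to a shared constant) of the forward and
    # reverse moves -- the proposal is asymmetric, so both are needed
    mean_rev = x_new + 0.5 * eps**2 * grad_log_post(x_new)
    log_q_fwd = -np.sum((x_new - mean_fwd) ** 2) / (2 * eps**2)
    log_q_rev = -np.sum((x - mean_rev) ** 2) / (2 * eps**2)

    # Metropolis-Hastings acceptance with the Hastings correction
    log_alpha = log_post(x_new) - log_post(x) + log_q_rev - log_q_fwd
    if np.log(rng.uniform()) < min(log_alpha, 0.0):
        return x_new
    return x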

I refactored the covariance regularization to return both the Cholesky decomposition and the regularized covariance matrix, which improves numerical stability. The regularization function now supports multiple attempts and a safe fallback strategy. Previously, adaptive MCMC could fail mid-run when the proposal covariance became ill-conditioned.
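The refactored helper has roughly this shape: guard against non-finite entries, symmetrize, factorize with escalating diagonal jitter, and fall back to a scaled identity if all attempts fail. A sketch under those assumptions (names and constants are illustrative, not the exact pypesto code):

import numpy as np

def regularize_covariance(cov, n_attempts=5, jitter0=1e-10):
    """Return (L, cov_reg): a Cholesky factor and the regularized covariance."""
    eye = np.eye(cov.shape[0])
    # non-finite entries: fall back to a scaled identity immediately
    if not np.all(np.isfinite(cov)):
        s = np.nanmean(np.diag(cov))
        s = 1.0 if not np.isfinite(s) or s <= 0 else s
        return np.sqrt(s) * eye, s * eye
    cov = 0.5 * (cov + cov.T)  # symmetrize against floating-point drift
    s = np.mean(np.diag(cov))  # scale for the initial jitter
    s = 1.0 if not np.isfinite(s) or s <= 0 else s
    jitter = jitter0
    for _ in range(n_attempts):
        try:
            cov_reg = cov + jitter * s * eye
            return np.linalg.cholesky(cov_reg), cov_reg
        except np.linalg.LinAlgError:
            jitter *= 10  # escalate the jitter and retry
    return np.sqrt(s) * eye, s * eye  # safe fallback: always factorizable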

@arrjon arrjon self-assigned this Sep 30, 2025
@arrjon arrjon requested a review from dilpath as a code owner September 30, 2025 20:09
@arrjon arrjon added the enhancement (New feature or request) and sampling (Related to sampling) labels Sep 30, 2025
@codecov-commenter commented Sep 30, 2025

⚠️ Please install the Codecov GitHub app to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

❌ Patch coverage is 92.13483% with 7 lines in your changes missing coverage. Please review.
✅ Project coverage is 84.36%. Comparing base (bc7d89a) to head (7cad7fd).

Files with missing lines | Patch % | Lines
pypesto/sample/adaptive_metropolis.py | 87.87% | 4 Missing ⚠️
pypesto/sample/mala.py | 93.18% | 3 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #1617      +/-   ##
===========================================
- Coverage    84.37%   84.36%   -0.01%     
===========================================
  Files          164      165       +1     
  Lines        14320    14395      +75     
===========================================
+ Hits         12082    12145      +63     
- Misses        2238     2250      +12     

☔ View full report in Codecov by Sentry.

@PaulJonasJost (Collaborator) left a comment

Awesome, looks good to me, thanks :)

@arrjon arrjon enabled auto-merge October 25, 2025 16:52
@arrjon (Member, Author) commented Dec 16, 2025

@dilpath, this is ready to merge I think

@dilpath (Member) left a comment

Looks good! I mostly want a bit more documentation for future devs.

Comment on lines +111 to +113
def _propose_parameter(self, x: np.ndarray, beta: float):
    x_new: np.ndarray = np.random.multivariate_normal(x, self._cov)
    return x_new, np.nan  # no gradient needed
Document the return value and return type? Where does a developer see that `_propose_parameter` returns a 2-tuple, where the first entry is the parameter and the second entry is something else?
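A possible shape for that documentation (a suggestion only; that the second entry carries gradient information for samplers such as MALA is an assumption from this PR's context):

def _propose_parameter(
    self, x: np.ndarray, beta: float
) -> tuple[np.ndarray, "np.ndarray | float"]:
    """Propose a new parameter vector.

    Returns
    -------
    x_new:
        The proposed parameter vector.
    grad:
        Gradient information at the proposal for gradient-based
        samplers (e.g. MALA); ``np.nan`` for gradient-free proposals.
    """
    x_new: np.ndarray = np.random.multivariate_normal(x, self._cov)
    return x_new, np.nan  # no gradient needed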

Comment on lines +229 to +232
if not np.all(np.isfinite(cov)):
    s = np.nanmean(np.diag(cov))
    s = 1.0 if not np.isfinite(s) or s <= 0 else s
    return np.sqrt(s) * eye, s * eye
Could you explain this a bit more in comments? What does this part achieve?

What does diag do here -- just added to reduce the mean with extra zeroes?

Why is there 1.0 if not np.isfinite(s) or s <= 0 else s -- isn't it already established that there is an infinity inside cov due to not np.all(np.isfinite(cov)), and therefore s is also necessarily infinite or nan?
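One detail worth keeping in mind here: `np.nanmean` skips NaNs, so `s` can still be finite when only some entries of `cov` are NaN; the second line then guards the remaining cases (infinities or an all-NaN diagonal, or a non-positive mean). A quick illustration:

import numpy as np

cov = np.array([[1.0, np.nan], [np.nan, 3.0]])
np.all(np.isfinite(cov))  # False -> this fallback branch is taken
np.nanmean(np.diag(cov))  # 2.0 -- finite despite the NaNs in cov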

Comment on lines +219 to 222
L:
    Cholesky decomposition of the regularized covariance matrix.
cov:
    Regularized estimate of the covariance matrix of the sample.
I think returning the covariance first makes more sense, unless you think users/devs might be more interested in the Cholesky decomposition? This method used to only return cov, so having that as the first return value is probably safest?

Comment on lines +224 to +225
# we symmetrize cov first
cov = 0.5 * (cov + cov.T)
Why is this necessary if the input is already a covariance matrix?
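Presumably this guards against floating-point drift: a product that is symmetric in exact arithmetic is generally not bit-for-bit symmetric, and repeated adaptive updates can accumulate the discrepancy. For example:

import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((200, 200))
d = rng.uniform(0.5, 2.0, 200)
cov = A @ np.diag(d) @ A.T          # symmetric in exact arithmetic
print(np.max(np.abs(cov - cov.T)))  # typically tiny but nonzero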

Comment on lines +241 to +243
# scale for the initial jitter
s = np.mean(np.diag(cov))
s = 1.0 if not np.isfinite(s) or s <= 0 else s
Why would s be non-finite here, if the non-finite cov case was handled earlier?

diffusion = np.sqrt(self.step_size) * precond_noise

x_new: np.ndarray = (
    x + self._dimension_of_target_weight * drift + diffusion
)
Where does this formula with dimension_of_target_weight come from?
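For comparison, the standard preconditioned MALA proposal (Roberts & Tweedie 1996; preconditioned form as in Girolami & Calderhead 2011) is

$$ x' = x + \frac{\epsilon^2}{2}\, M\, \nabla \log \pi(x) + \epsilon\, M^{1/2} \xi, \qquad \xi \sim \mathcal{N}(0, I), $$

with step size $\epsilon$ and preconditioner $M$; how `_dimension_of_target_weight` relates to the $\epsilon^2/2$ factor is presumably what should be documented.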

def _compute_transition_log_prob(
    self, x_from: np.ndarray, x_to: np.ndarray, x_from_grad: np.ndarray
):
    """Compute the log probability of transitioning from x_from to x_to with preconditioning."""
Which equation in which reference provides the math in this method?
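For reference, under the preconditioned proposal above the transition density is Gaussian,

$$ q(x' \mid x) = \mathcal{N}\!\bigl(x';\ \mu(x),\ \epsilon^2 M\bigr), \qquad \mu(x) = x + \tfrac{\epsilon^2}{2} M \nabla \log \pi(x), $$

so, up to an additive constant, $\log q(x' \mid x) = -\tfrac{1}{2\epsilon^2} (x' - \mu(x))^\top M^{-1} (x' - \mu(x))$, assuming the method follows this standard form.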

# log acceptance probability (temper log posterior)
log_p_acc = min(beta * (lpost_new - lpost), 0)

# This accounts for an asymmetric proposal distribution
Reference?
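The underlying identity is the Metropolis-Hastings acceptance ratio for asymmetric proposals (Hastings 1970):

$$ \log \alpha = \min\Bigl(0,\ \beta \bigl[\log \pi(x') - \log \pi(x)\bigr] + \log q(x \mid x') - \log q(x' \mid x)\Bigr), $$

where only the posterior is tempered by $\beta$, matching the `beta * (lpost_new - lpost)` term in the snippet; whether the proposal correction should also be tempered is exactly the kind of choice a reference would pin down.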

Comment on lines +122 to +125
if problem.objective.has_grad:
    pytest.skip(
        "EMCEE fails to converge when started in optimal point with gradient."
    )
This is strange -- does it also fail to converge if it ever visits the optimal point?

@dilpath (Member) left a comment

Is there a test for MalaSampler to assert that the correct posterior/density is computed?
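A minimal version of such a test, reusing the `mala_step` sketch from the PR description above and targeting a standard normal (tolerances and chain length are illustrative):

import numpy as np

def test_mala_recovers_standard_normal():
    """Samples from MALA on N(0, I) should match its first two moments."""
    rng = np.random.default_rng(42)

    def log_post(x):
        return -0.5 * float(x @ x)

    def grad_log_post(x):
        return -x

    x = np.zeros(2)
    chain = []
    for _ in range(20_000):
        x = mala_step(x, log_post, grad_log_post, eps=0.8, rng=rng)
        chain.append(x)
    chain = np.asarray(chain[5_000:])  # discard burn-in

    assert np.allclose(chain.mean(axis=0), 0.0, atol=0.1)
    assert np.allclose(chain.var(axis=0), 1.0, atol=0.1)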


Labels: enhancement (New feature or request), sampling (Related to sampling)

6 participants