Skip to content

Conversation

@maresb
Copy link
Contributor

@maresb maresb commented Dec 15, 2025

Description

Disclosure: With assistance from claude-4.5-opus-high via Cursor.

This PR adds a _logccdf (log complementary CDF / log survival function) dispatcher to fix numerical instability when computing log-probabilities for censored distributions at extreme tail values.

The Problem:
For right-censored distributions, the log-probability at the upper bound requires computing log(1 - CDF). The existing implementation uses log1mexp(logcdf), which breaks down when CDF ≈ 1 (far right tail). For example, a censored Normal(0, 1) at 40 standard deviations returns -inf instead of the correct ≈ -804.6.

The Solution:
Add a _logccdf dispatcher that allows distributions to provide a numerically stable log survival function. For Normal, this uses the existing normal_lccdf function (based on erfcx) which is stable across the entire domain.

Changes:

  • pymc/logprob/abstract.py: Add _logccdf singledispatch and _logccdf_helper
  • pymc/distributions/distribution.py: Register logccdf methods via metaclass
  • pymc/distributions/continuous.py: Add logccdf to Normal using stable normal_lccdf
  • pymc/logprob/censoring.py: Use _logccdf for right-censored distributions when available
  • pymc/logprob/binary.py: Use _logccdf for comparison operations
  • pymc/logprob/transforms.py: Use _logccdf_helper for monotonic transforms
  • pymc/logprob/basic.py: Add public logccdf() function
  • pymc/logprob/__init__.py: Export logccdf

Related Issue

Checklist

Type of change

  • New feature / enhancement
  • Bug fix
  • Documentation
  • Maintenance
  • Other (please specify):

Test that pm.Censored computes log-probabilities stably at the bounds:
- Right censoring (upper bound): log(1 - CDF) when CDF ≈ 1
- Left censoring (lower bound): log(CDF) when CDF ≈ 0

Uses pm.Censored with Normal(0, 1) at ±40 standard deviations.
Add _logccdf (log complementary CDF / log survival function) support:

- pymc/logprob/abstract.py: Add _logccdf singledispatch and _logccdf_helper
- pymc/distributions/distribution.py: Register logccdf methods via metaclass
- pymc/distributions/continuous.py: Add logccdf to Normal using stable normal_lccdf
- pymc/logprob/censoring.py: Use _logccdf for right-censored distributions
- pymc/logprob/binary.py: Use _logccdf for comparison operations
- pymc/logprob/transforms.py: Use _logccdf_helper for monotonic transforms
- pymc/logprob/basic.py: Add public logccdf() function
- pymc/logprob/__init__.py: Export logccdf

This fixes numerical instability when computing log-probabilities for
censored Normal distributions at extreme tail values (e.g., 10+ sigma).
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds a _logccdf dispatcher to provide numerically stable log survival function (log complementary CDF) computations for censored distributions and other operations that require computing log(1 - CDF). The implementation fixes numerical instability issues when evaluating at extreme tail values, such as computing the log-probability of a censored Normal(0,1) distribution at 40 standard deviations, which previously returned -inf instead of the correct value.

Key Changes

  • Added _logccdf singledispatch function and _logccdf_helper for numerically stable log survival function computation
  • Implemented logccdf method for the Normal distribution using the existing normal_lccdf function
  • Updated censoring, binary comparison, and transform modules to use _logccdf when available, with graceful fallback to pt.log1mexp(logcdf)

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
pymc/logprob/abstract.py Added _logccdf dispatcher and _logccdf_helper following the same pattern as _logcdf
pymc/distributions/distribution.py Added metaclass registration for logccdf methods to automatically dispatch to distribution-specific implementations
pymc/distributions/continuous.py Implemented logccdf for Normal distribution using stable normal_lccdf function
pymc/logprob/censoring.py Updated right-censored distribution logic to use _logccdf with fallback for numerical stability
pymc/logprob/binary.py Updated comparison operations to use _logccdf for improved numerical stability
pymc/logprob/transforms.py Updated monotonic transforms to use _logccdf_helper for continuous distributions
pymc/logprob/basic.py Added public logccdf() function with comprehensive documentation and examples
pymc/logprob/__init__.py Exported logccdf as part of the public API
tests/logprob/test_censoring.py Added parametrized tests for numerical stability of censored distributions at extreme tail values
tests/logprob/test_abstract.py Added comprehensive tests for _logccdf_helper, public logccdf function, and numerical stability

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

This test uses pm.Censored which is the high-level API for censored distributions.
"""
import pymc as pm
Copy link

Copilot AI Dec 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Module 'pymc' is imported with both 'import' and 'import from'.

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. Moved import pymc as pm to top-level imports.

19b9979

Copy link
Member

@ricardoV94 ricardoV94 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah this would always be needed eventually. I left a request. Also can we use this in the Truncated, or it doesn't show up there?

@codecov
Copy link

codecov bot commented Dec 15, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 91.47%. Comparing base (cadb97a) to head (9444212).
⚠️ Report is 21 commits behind head on main.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #7996      +/-   ##
==========================================
+ Coverage   90.22%   91.47%   +1.25%     
==========================================
  Files         116      116              
  Lines       18972    19065      +93     
==========================================
+ Hits        17117    17440     +323     
+ Misses       1855     1625     -230     
Files with missing lines Coverage Δ
pymc/distributions/continuous.py 98.26% <100.00%> (+0.37%) ⬆️
pymc/distributions/distribution.py 94.50% <100.00%> (+0.13%) ⬆️
pymc/distributions/truncated.py 99.46% <100.00%> (+0.01%) ⬆️
pymc/logprob/__init__.py 100.00% <ø> (ø)
pymc/logprob/abstract.py 97.46% <100.00%> (+0.40%) ⬆️
pymc/logprob/basic.py 100.00% <100.00%> (ø)
pymc/logprob/binary.py 96.34% <100.00%> (ø)
pymc/logprob/censoring.py 97.67% <100.00%> (ø)
pymc/logprob/transforms.py 95.64% <100.00%> (+0.20%) ⬆️

... and 17 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Centralizes the fallback logic so callers don't need to handle it.
The helper now tries the stable _logccdf first and automatically
falls back to log1mexp(logcdf) if not implemented.
Uses stable logccdf for computing log(1 - CDF(lower)) in truncated_logprob
and truncated_logcdf instead of the potentially unstable log1mexp(logcdf).
The construct_ir_fgraph returns a single FunctionGraph, not a tuple.
Extract ir_valued_rv from outputs and unpack ir_rv and ir_value from inputs.
Move 'import pymc as pm' to top of file with other imports
instead of inside the test function.
@maresb
Copy link
Contributor Author

maresb commented Dec 15, 2025

Yeah this would always be needed eventually. I left a request. Also can we use this in the Truncated, or it doesn't show up there?

Yes! Truncated computes log(1 - CDF(lower)) via pt.log1mexp(lower_logcdf). Added _create_lower_logccdf_expr helper to TruncatedRV and updated both truncated_logprob and truncated_logcdf to use stable logccdf.

15806c0

@maresb
Copy link
Contributor Author

maresb commented Dec 15, 2025

Working on test coverage...

@ricardoV94
Copy link
Member

I'll only be able to review later, but the general picture seems great. How urgent is this to get merged?

@maresb
Copy link
Contributor Author

maresb commented Dec 15, 2025

Not super urgent since we have a workaround: brendanjmeade/celeri#343

maresb added a commit to maresb/celeri that referenced this pull request Dec 15, 2025
…tributions"

This reverts commit 3edaae2.

This will be safe once celeri requires a release of PyMC where
pymc-devs/pymc#7996 has been merged.
The test_logccdf_numerical_stability test already covers this functionality
at the public API level. The helper test was redundant since it tested the
same numerical stability property through _logccdf_helper.
Verifies that distributions without a registered _logccdf method
(e.g., Uniform) use the log1mexp(logcdf) fallback, while distributions
with _logccdf (e.g., Normal) use their specialized implementation.

The test inspects the computation graph structure rather than just
numerical results to ensure the correct code path is exercised.
Increase test bounds from 10/40 to 100 sigma to future-proof against
any potential improvements in naive computation methods. At 100 sigma,
CDF(100) is truly indistinguishable from 1.0 in float64.

Also enhances test docstrings with What/Why/How documentation.
Add detailed docstrings to logccdf tests explaining:
- What: what the test verifies
- Why: motivation and edge cases being tested
- How: the testing methodology
Tests that pm.logccdf works when the random variable depends on
transformed parameters, triggering the construct_ir_fgraph fallback
path in the public logccdf function.
Tests that logccdf registration works correctly for custom Distribution
subclasses using SymbolicRandomVariable with extended_signature, which
exercises the params_idxs code path in DistributionMeta.
@maresb maresb force-pushed the log-ccdf-stability branch from c53ff15 to 5a2f7cf Compare January 6, 2026 08:57
Copy link
Contributor Author

@maresb maresb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ya, sorry about that. In this case I explicitly requested it to be super verbose in the test docstrings because I found it really hard to understand precisely what the tests were intended for and how they work.

Is it better with my latest changes? Anything else you'd like? Would you like me to rebase and squash the redundant commits?

@ricardoV94
Copy link
Member

Yeah much better. I left some more comments, about placement of tests and some that seem redundant. Mind doing one more pass?

@maresb
Copy link
Contributor Author

maresb commented Jan 6, 2026

Thanks! I've tried to address all comments.

Copy link
Member

@ricardoV94 ricardoV94 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Last nitpick on my side.

@ricardoV94
Copy link
Member

Failing test seems unrelated

@ricardoV94 ricardoV94 changed the title Add _logccdf dispatcher for numerically stable log survival function in censored distributions Implement logccdf helper for numerically stable log survival function Jan 6, 2026
Copy link
Member

@ricardoV94 ricardoV94 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @maresb this is a great improvement

@maresb
Copy link
Contributor Author

maresb commented Jan 7, 2026

Thanks @ricardoV94 for the very careful review! How would you recommend I merge? Squash-merge?

@ricardoV94 ricardoV94 merged commit 689f736 into pymc-devs:main Jan 7, 2026
74 of 76 checks passed
@ricardoV94
Copy link
Member

ricardoV94 commented Jan 7, 2026

Yeah I squash merged

maresb added a commit to maresb/celeri that referenced this pull request Jan 7, 2026
…tributions"

This reverts commit 3edaae2.

This will be safe once celeri requires a release of PyMC where
pymc-devs/pymc#7996 has been merged.
maresb added a commit to maresb/celeri that referenced this pull request Jan 7, 2026
This contains an essential numerical stability fix:
pymc-devs/pymc#7996
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants