BUG: [Regression] Shape / dimension issue when using MVNormal.dist inside pm.logp introduced in v5.16.1 #7602

@jonsedar

Describe the issue:

Regression: something changed in MvNormal between v5.16.0 and v5.16.1. I hope someone can help!

I have a copula model architecture that involves evidencing a transformed data input against a latent RV using a Potential. The crux of this issue is in this snippet:

...

# 3. Transformation path pt2: Uniform -> Normal via Normal InvCDF
n_d = pm.Normal.dist(mu=0., sigma=1., shape=(N, 2))
c = pm.Deterministic("c", pm.icdf(n_d, u), dims=("oid", "c_nm"))

# 4. Create Latent Copula dist using a 2D MvNormal
sd = pm.InverseGamma.dist(alpha=5.0, beta=4.0)
chol, corr_, stds_ = pm.LKJCholeskyCov("lkjcc", n=2, eta=2, sd_dist=sd)
c_d = pm.MvNormal.dist(mu=pt.zeros(2), chol=chol, shape=(N, 2))

# 5. Evidence transformed C against Latent Copula using Potential
_ = pm.Potential("pot_chat", pm.logp(c_d, c, warn_rvs=True), dims=("oid", "chat_nm"))

...
> ValueError: pot_chat has 1 dims but 2 dim labels were provided.

This used to work fine in v5.16.0, but in v5.16.1 the shape / dims raise an error (something in the v5.16.0...v5.16.1 diff must be causing it).

For more details please see Model A in the MRE in this gist (https://gist.github.com/jonsedar/a2c355df5a8c888768dbec7e1fe1f7a6), also in this working Notebook in Google Colab (https://colab.research.google.com/drive/1PHL5Xxsw-0sD08qyJomAo5iHvU-XHoKc?usp=sharing)

In Model B in the same notebook I've replaced the latent copula dist with a pm.Normal, and that works fine, no shape issues!

c_d = pm.Normal.dist(mu=pt.zeros(2), sigma=pt.diag(corr_[::-1]), shape=(N, 2))

So I suspect something's up with pm.MvNormal and logp, but while I dig into that code that I barely understand, possibly someone more knowledgeable could chime in with some thoughts?
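
To make the shape question concrete, here is a minimal NumPy-only sketch of the two log-densities. It is not PyMC's implementation: the helper names `normal_logp` and `mvnormal_logp` are hypothetical, and the Cholesky factor is an arbitrary example value. It only illustrates that a univariate Normal log-density is elementwise (shape `(N, 2)`), while a multivariate Normal log-density reduces over the last (event) axis (shape `(N,)`), which would be consistent with the "1 dims" in the error message.

```python
import numpy as np


def normal_logp(x, mu=0.0, sigma=1.0):
    """Elementwise univariate Normal log-density; output has the same shape as x."""
    z = (x - mu) / sigma
    return -0.5 * z**2 - np.log(sigma) - 0.5 * np.log(2.0 * np.pi)


def mvnormal_logp(x, mu, chol):
    """Multivariate Normal log-density from a lower-triangular Cholesky factor.

    The last axis of x is treated as the event axis and is reduced away.
    """
    k = x.shape[-1]
    # Solve L z = (x - mu)^T; z holds the whitened residuals with shape (k, N)
    z = np.linalg.solve(chol, (x - mu).T)
    quad = np.sum(z**2, axis=0)
    # log|Sigma| = 2 * sum(log(diag(L))), so half of it is sum(log(diag(L)))
    half_log_det = np.sum(np.log(np.diag(chol)))
    return -0.5 * quad - half_log_det - 0.5 * k * np.log(2.0 * np.pi)


N = 5
rng = np.random.default_rng(0)
x = rng.normal(size=(N, 2))
chol = np.array([[1.0, 0.0], [0.5, 0.8]])  # arbitrary example factor

elementwise = normal_logp(x)
joint = mvnormal_logp(x, np.zeros(2), chol)
print(elementwise.shape)  # (5, 2)
print(joint.shape)  # (5,)
```

With an identity Cholesky factor the joint log-density equals the row-wise sum of the elementwise one, which is a quick sanity check that the event axis is being summed out rather than dropped.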

Cheers!

[UPDATE] I remembered that you can specify the package version in Colab, which is an awesome way of bug-hunting! If you force install v5.16.0, the shape issue for Model A doesn't appear. If you force install v5.16.1, the shape issue for Model A does appear.

# uncomment to install in a Google Colab environment
!pip install watermark pymc==5.16.0  # most recent working version
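
When switching between pinned versions it can help to flag at the top of the notebook whether the installed release falls on the failing side. The sketch below is an illustration only: `parse_release`, `is_affected` and `LAST_WORKING` are hypothetical names, the cut-off comes from this report alone, and the parser handles plain release strings such as `5.16.1` only (no pre-release or dev suffixes).

```python
def parse_release(version):
    """Turn a plain release string such as '5.16.1' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


# Last release where Model A ran without the dims error, per this report
LAST_WORKING = (5, 16, 0)


def is_affected(version):
    """Return True if the given release is newer than the last working one."""
    return parse_release(version) > LAST_WORKING


print(is_affected("5.16.0"))  # False
print(is_affected("5.16.1"))  # True
```

In the notebook this could be called as `is_affected(pm.__version__)` after `import pymc as pm`.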

Reproducible code example:

For more details please see Model A in the MRE in this gist (https://gist.github.com/jonsedar/a2c355df5a8c888768dbec7e1fe1f7a6)

Also in this working Notebook in Google Colab (https://colab.research.google.com/drive/1PHL5Xxsw-0sD08qyJomAo5iHvU-XHoKc?usp=sharing)

Error message:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[8], line 38
     32 c_d = pm.MvNormal.dist(mu=pt.zeros(2), chol=chol, shape=(N, 2))
     35 # 6. Evidence transformed C against Latent Copula using Potential
     36 #  because pymc TypeError: Variables that depend on other nodes
     37 #  cannot be used for observed data (c)
---> 38 _ = pm.Potential("pot_chat", pm.logp(c_d, c, warn_rvs=True), dims=("oid", "chat_nm"))

File ~/miniforge/envs/oreum_copula/lib/python3.11/site-packages/pymc/model/core.py:2403, in Potential(name, var, model, dims)
   2401 var.name = model.name_for(name)
   2402 model.potentials.append(var)
-> 2403 model.add_named_variable(var, dims)
   2405 from pymc.printing import str_for_potential_or_deterministic
   2407 var.str_repr = types.MethodType(
   2408     functools.partial(str_for_potential_or_deterministic, dist_name="Potential"), var
   2409 )

File ~/miniforge/envs/oreum_copula/lib/python3.11/site-packages/pymc/model/core.py:1537, in Model.add_named_variable(self, var, dims)
   1535     # This check implicitly states that only vars with .ndim attribute can have dims
   1536     if var.ndim != len(dims):
-> 1537         raise ValueError(
   1538             f"{var} has {var.ndim} dims but {len(dims)} dim labels were provided."
   1539         )
   1540     self.named_vars_to_dims[var.name] = dims
   1542 self.named_vars[var.name] = var

ValueError: pot_chat has 1 dims but 2 dim labels were provided.
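
The check that raises is visible in the traceback: the variable's `ndim` is compared with the number of dim labels. The stand-in below reproduces just that comparison in plain Python so the message can be triggered without building a model. `check_dims` is a hypothetical helper, not a PyMC function, and whether a single label such as `("oid",)` is the right choice for the actual model is not established here.

```python
def check_dims(name, ndim, dims):
    """Hypothetical stand-in for the ndim / dims comparison shown in the traceback."""
    if ndim != len(dims):
        raise ValueError(
            f"{name} has {ndim} dims but {len(dims)} dim labels were provided."
        )
    return dims


# A 1-dimensional variable with one label passes the check
check_dims("pot_chat", 1, ("oid",))

# The same variable with two labels reproduces the reported message
try:
    check_dims("pot_chat", 1, ("oid", "chat_nm"))
except ValueError as err:
    message = str(err)
    print(message)
```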

PyMC version information:

%load_ext watermark
%watermark -n -u -v -iv -w
Last updated: Tue Dec 03 2024

Python implementation: CPython
Python version       : 3.10.12
IPython version      : 7.34.0

numpy   : 1.26.4
pymc    : 5.16.1
pytensor: 2.23.0

Watermark: 2.5.0

Context for the issue:

This is a regression: previously working code no longer works in pymc > v5.16.0.
