BUG: infs or NaNs when performing logistic regression on the MNIST dataset. #7515

@ylefay

Describe the issue:

Thanks a lot for your work on this library.

I am trying to perform a logistic regression on the MNIST digit dataset using the ADVI implementation in PyMC.
However, fitting fails with an error about NaN or inf values.
The CSV file containing the dataset is available at: https://drive.google.com/file/d/1eEKzfmEu6WKdRlohBQiqi3PhW_uIVJVP/view
(source: https://git-disl.github.io/GTDLBench/datasets/mnist_datasets/).

Reproducible code example:

import numpy as np
import pandas as pd
import pymc as pm

# Preprocess the MNIST dataset: select the digits 0 and 8, normalize the features,
# and delete the columns (pixels) that are zero in every row.
def mnist_dataset():
    def preprocess(data):
        idx = (data[:, 0] == 0) + (data[:, 0] == 8)
        data = data[idx]
        idx = data[:, 0] == 8
        data[idx, 0] = 1
        data[~idx, 0] = 0
        labels = data[:, 0]
        data = data[:, 1:]
        data = data / 255
        data = np.delete(data, np.where(np.prod(data == 0, axis=0) == 1), axis=-1)
        return data, labels.astype(dtype=float)

    mnist_train = np.array(pd.read_csv("./mnist/mnist_train.csv", header=None))
    mnist_train = preprocess(mnist_train)
    return mnist_train

flipped_predictors, response = mnist_dataset()
dim = flipped_predictors.shape[1]
with pm.Model() as logistic_model:
    cov = np.identity(dim) * 25
    beta = pm.MvNormal('beta', mu=np.zeros(dim), cov=cov)
    logit_theta = pm.Deterministic('logit_theta', flipped_predictors @ beta)
    y = pm.Bernoulli("y", logit_p=logit_theta, observed=response)
with logistic_model:
    callback = pm.variational.callbacks.CheckParametersConvergence(diff='absolute')
    start_means = {'beta': np.zeros(dim)}
    start_sigma = {'beta': np.ones(dim)}
    approx = pm.fit(n=100, callbacks=[callback], start=start_means,
                    start_sigma=start_sigma)
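
The column-removal step of the preprocessing can be checked in isolation, without the CSV file. The following is a minimal sketch on a synthetic array; the helper name `drop_all_zero_columns` and the toy data are assumptions for illustration and are not part of the script above. It uses `np.all` as an equivalent way to find the all-zero columns, then confirms that the result is finite.

```python
import numpy as np


def drop_all_zero_columns(data):
    """Remove the columns (features) that are zero in every row.

    Hypothetical helper, equivalent in intent to the np.delete call above.
    """
    all_zero = np.all(data == 0, axis=0)  # one boolean per column
    return data[:, ~all_zero]


# Synthetic stand-in for the normalized pixel matrix: column 1 is all zeros.
toy = np.array([[0.0, 0.0, 0.5],
                [1.0, 0.0, 0.0]])
reduced = drop_all_zero_columns(toy)

print(reduced.shape)               # (2, 2)
print(np.isfinite(reduced).all())  # True
```

Running the same finiteness check on `flipped_predictors` and `response` before calling `pm.fit` would show whether the non-finite values are already present in the inputs or only appear during optimization.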

Error message:

<details>
/home/user/LSVI/venv/bin/python /home/user/LSVI/LSVI/experiments/logisticRegression/mnist/script.py 
Fitting: ━━━━━━━━━                             25% 0:00:03 Average Loss = 43,696
Traceback (most recent call last):
  File "/home/user/LSVI/venv/lib/python3.12/site-packages/pytensor/compile/function/types.py", line 959, in __call__
    self.vm()
  File "/home/user/LSVI/venv/lib/python3.12/site-packages/pytensor/graph/op.py", line 524, in rval
    r = p(n, [x[0] for x in i], o)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/LSVI/venv/lib/python3.12/site-packages/pytensor/tensor/slinalg.py", line 298, in perform
    outputs[0][0] = scipy.linalg.solve_triangular(
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/LSVI/venv/lib/python3.12/site-packages/scipy/linalg/_basic.py", line 334, in solve_triangular
    b1 = _asarray_validated(b, check_finite=check_finite)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/LSVI/venv/lib/python3.12/site-packages/scipy/_lib/_util.py", line 321, in _asarray_validated
    a = toarray(a)
        ^^^^^^^^^^
  File "/home/user/LSVI/venv/lib/python3.12/site-packages/numpy/lib/function_base.py", line 630, in asarray_chkfinite
    raise ValueError(
ValueError: array must not contain infs or NaNs

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/user/LSVI/LSVI/experiments/logisticRegression/mnist/script.py", line 37, in <module>
    approx = pm.fit(n=100, callbacks=[callback], start=start_means,
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/LSVI/venv/lib/python3.12/site-packages/pymc/variational/inference.py", line 775, in fit
    return inference.fit(n, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/LSVI/venv/lib/python3.12/site-packages/pymc/variational/inference.py", line 155, in fit
    state, means, covs = self._iterate_with_loss(
                         ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/LSVI/venv/lib/python3.12/site-packages/pymc/variational/inference.py", line 233, in _iterate_with_loss
    e = step_func()
        ^^^^^^^^^^^
  File "/home/user/LSVI/venv/lib/python3.12/site-packages/pytensor/compile/function/types.py", line 972, in __call__
    raise_with_op(
  File "/home/user/LSVI/venv/lib/python3.12/site-packages/pytensor/link/utils.py", line 524, in raise_with_op
    raise exc_value.with_traceback(exc_trace)
  File "/home/user/LSVI/venv/lib/python3.12/site-packages/pytensor/compile/function/types.py", line 959, in __call__
    self.vm()
  File "/home/user/LSVI/venv/lib/python3.12/site-packages/pytensor/graph/op.py", line 524, in rval
    r = p(n, [x[0] for x in i], o)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/LSVI/venv/lib/python3.12/site-packages/pytensor/tensor/slinalg.py", line 298, in perform
    outputs[0][0] = scipy.linalg.solve_triangular(
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/LSVI/venv/lib/python3.12/site-packages/scipy/linalg/_basic.py", line 334, in solve_triangular
    b1 = _asarray_validated(b, check_finite=check_finite)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/LSVI/venv/lib/python3.12/site-packages/scipy/_lib/_util.py", line 321, in _asarray_validated
    a = toarray(a)
        ^^^^^^^^^^
  File "/home/user/LSVI/venv/lib/python3.12/site-packages/numpy/lib/function_base.py", line 630, in asarray_chkfinite
    raise ValueError(
ValueError: array must not contain infs or NaNs
Apply node that caused the error: SolveTriangular{trans=0, unit_diagonal=False, lower=True, check_finite=True, b_ndim=2}([[5. 0. 0. ... 0. 0. 5.]], ExpandDims{axis=1}.0)
Toposort index: 31
Inputs types: [TensorType(float64, shape=(784, 784)), TensorType(float64, shape=(784, 1))]
Inputs shapes: [(784, 784), (784, 1)]
Inputs strides: [(8, 6272), (8, 8)]
Inputs values: ['not shown', 'not shown']
Outputs clients: [[Transpose{axes=[1, 0]}(SolveTriangular{trans=0, unit_diagonal=False, lower=True, check_finite=True, b_ndim=2}.0), SolveTriangular{trans=0, unit_diagonal=False, lower=False, check_finite=True, b_ndim=2}([[5. 0. 0. ... 0. 0. 5.]], SolveTriangular{trans=0, unit_diagonal=False, lower=True, check_finite=True, b_ndim=2}.0)]]

HINT: Re-running with most PyTensor optimizations disabled could provide a back-trace showing when this node was created. This can be done by setting the PyTensor flag 'optimizer=fast_compile'. If that does not work, PyTensor optimizations can be disabled with 'optimizer=None'.
HINT: Use the PyTensor flag `exception_verbosity=high` for a debug print-out and storage map footprint of this Apply node.
</details>
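
One possible reading of the failing node, offered as an interpretation rather than a confirmed diagnosis: the first input printed as `[[5. 0. 0. ... 0. 0. 5.]]` is consistent with the Cholesky factor of `cov = np.identity(dim) * 25`, and the finiteness check is then tripped by the other input. The sketch below uses a small stand-in dimension and a hypothetical `beta` with one NaN entry (both are assumptions) to show the factor and to reproduce the same message from `np.asarray_chkfinite`, the NumPy function named at the bottom of the traceback.

```python
import numpy as np

dim = 4  # small stand-in for the 784 features in the report
cov = np.identity(dim) * 25

# The Cholesky factor of 25 * I is 5 * I, matching the printed constant.
chol = np.linalg.cholesky(cov)
print(np.allclose(chol, 5 * np.identity(dim)))  # True

# Hypothetical: one coordinate of beta has become NaN during fitting.
beta = np.zeros(dim)
beta[2] = np.nan

try:
    np.asarray_chkfinite(beta)
    message = None
except ValueError as err:
    message = str(err)

print(message)
```

Under this reading the constant covariance matrix itself is finite, so the non-finite values would have to come from the variable being evaluated.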

PyMC version information:

pymc==5.16.2, installed via pip, using Python 3.12 on Arch Linux

Context for the issue:

This is part of academic work on variational inference.
