Numpy 2 updates #4

brendan-m-murphy · 2025-02-04T14:21:14Z

Description

This PR makes PyTensor compatible with numpy versions >= 1.26 and < 2.2. (Numba requires numpy < 2.2.)

These changes include:

Unpinning numpy in environment files and pyproject
Adding a numpy 1.26 test job to CI
- runs on python 3.10, float32 0, fast-compile 0
- skips doctests, since it doesn't seem easy to make these conditional on numpy version, and the numpy repr for numerical types has changed (e.g. in numpy < 2.0, 3 will print, but in numpy >= 2.0, np.int64(3) will print)
General numpy deprecations and namespace changes:
- updated imports due to changed name spaces (e.g. numpy.core is now numpy._core, and many functions have been moved from core to new public locations like numpy.lib; also AxisError needs to be imported from numpy.exceptions now). These changes are conditional on numpy version number, except the AxisError import, which is compatible with numpy 1.26.
- deprecations: the main change is that np.cast is deprecated; its replacement is np.asarray(..., dtype=...)
The return value of the inverse indices from np.unique has changed when axis is None; this required changes in the Unique Op. (This change is conditional on numpy version number.)
Changes in how overflows and type conversions are handled: to explicitly change the type of a numpy array, you must use .astype. Out-of-bounds Python integers are no longer converted automatically; for instance np.asarray(-1, dtype="uint8") will raise an OverflowError.
- TensorType.filter uses this new conversion method if allow_downcast is true, which preserves the existing behavior
- Several tests have been changed to either expect an OverflowError for numpy >= 2.0 (or a TypeError for numpy < 2.0), or use equivalent but valid values (e.g. using 255 for a uint8, instead of -1).
Changes to python type promotion (NEP 50)
- NEP 50 outlines changes in how python types are compared with and converted to numpy types. These changes were optional before numpy 2.0, but are now default. Essentially, the new rule is: if a python float is used in an operation with a numpy float, the type of the numpy float will always be used.
- The NumpyAutocaster has been changed to explicitly convert values to numpy types using np.asarray, which preserves the existing behavior. (The reason this preserves the behavior is that this is how the comparison is done in TensorType.filter, where np.asarray(data) is compared to converted_data = np.asarray(data, self.dtype).)
Changes to random number generators:
- The numpy PR numpy/numpy@44ba7ca changed methods of numpy.random.Generator that are used by copy and pickle.
- To get a copy of a numpy Generator with independent state, you must use deepcopy now, instead of copy
- That numpy change also made Generator.__getstate__() return None. To get the state now, you must use Generator.bit_generator.state.
Changes due to changes in the numpy C-API
- Some minor changes with straightforward updates, e.g. replace ->elsize by PyArray_ITEMSIZE
- Changes to complex scalars in ScalarType. The numpy implementation of complex numbers has been changed from a struct with real and imaginary values to the native C99 complex types. On disk, these are equivalent, but the real and imaginary parts of C99 complex types cannot be accessed using pointers. Numpy provides some macros to make accessing real and imaginary parts uniform across pre- and post-2.0 versions. Since these are implemented in terms of the types npy_cfloat, npy_cdouble, npy_clongdouble, some generic functions were added to the C code so that we do not need to explicitly translate bit-size aliases like npy_complex64 to these types.
- The constant np.MAXDIMS was removed from the public API. This value was a common flag used to indicate that axis=None has been passed. Now there is an explicit flag, NPY_RAVEL_AXIS. How this change was implemented varied somewhat across the affected code. A compatibility header was added to pytensor/npy_2_compat.py to make NPY_RAVEL_AXIS available for numpy < 2.0.
- MapIter was removed from the public numpy C-API, so it was not possible to adapt the C-code for AdvancedIncSubtensor1; instead a NotImplementedError is raised, so this Op defaults to the python implementation, which uses np.add.at.
Dropped support for Python 2 in C code.
- Numpy is starting to remove Python 2 compatibility code (from npy_3kcompat.h); see numpy/numpy#26842 (BUILD: clean out py2 stuff from npy_3kcompat.h).
- I have updated the lazylinker C code to use Python 3 exclusively.

Related Issue

Closes Test on numpy 2.0 pymc-devs/pytensor#688
Related to Numpy 2.0 pymc-devs/pytensor#689 (Draft PR with discussion)

Checklist

Checked that the pre-commit linting/style checks pass
Included tests that prove the fix is effective or that the new feature works
Added necessary documentation (docstrings and/or example notebooks)
If you are a pro: each commit corresponds to a relevant logical change

Type of change

📚 Documentation preview 📚: https://pytensor--4.org.readthedocs.build/en/4/

Bumps [pypa/gh-action-pypi-publish](https://github.com/pypa/gh-action-pypi-publish) from 1.12.2 to 1.12.4. - [Release notes](https://github.com/pypa/gh-action-pypi-publish/releases) - [Commits](pypa/gh-action-pypi-publish@v1.12.2...v1.12.4) --- updated-dependencies: - dependency-name: pypa/gh-action-pypi-publish dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <[email protected]>

- replaced np.AxisError with np.exceptions.AxisError
- the `numpy.core` submodule has been renamed to `numpy._core`
- some parts of `numpy.core` have been moved to `numpy.lib.array_utils`

Except for `AxisError`, the updated imports are conditional on the version of numpy, so the imports should work for numpy >= 1.26. The conditional imports have been added to `npy_2_compat.py`, so the imports elsewhere are unconditional.

- Replace np.cast with np.asarray: in numpy 2.0, `np.cast[new_dtype](arr)` is deprecated. The literal replacement is `np.asarray(arr, dtype=new_dtype)`.
- Replace np.sctype2char and np.obj2sctype. Added try/except to handle change in behavior of `np.dtype`.
- Replace np.find_common_type with np.result_type.

Further changes to `TensorType`: TensorType.dtype must be a string, so the code has been changed from `self.dtype = np.dtype(dtype).type`, where the right-hand side is of type `np.generic`, to `self.dtype = str(np.dtype(dtype))`, where the right-hand side is a string that satisfies `self.dtype == str(np.dtype(self.dtype))`. This doesn't change the behavior of `np.array(..., dtype=self.dtype)` etc.
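The replacements listed above can be illustrated with a short, self-contained sketch. The variable names are made up for the example, and it only uses calls that exist in both numpy 1.26 and 2.x.

```python
import numpy as np

arr = np.array([1.5, 2.5, 3.5])

# Old spelling: np.cast["int32"](arr). The replacement named above:
as_int = np.asarray(arr, dtype="int32")

# Old spelling: np.find_common_type(...). np.result_type applies the
# usual dtype promotion rules.
common = np.result_type(np.int32, np.float32)

# Storing the dtype as a string; the string round-trips through np.dtype.
dtype_name = str(np.dtype(np.float32))

print(as_int)      # [1 2 3]
print(common)      # float64
print(dtype_name)  # float32
print(dtype_name == str(np.dtype(dtype_name)))  # True
```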

This is done using C++ generic functions to get/set the real/imag parts of complex numbers. This gives us an easy way to support Numpy v < 2.0, and allows the type underlying the bit width types, like pytensor_complex128, to be correctly inferred from the numpy complex types they inherit from.

Updated pytensor_complex struct to use get/set real/imag aliases defined above. Also updated operators such as `Abs` to use get_real, get_imag.

Macros have been added to ensure compatibility with numpy < 2.0.

Note: redefining the complex arithmetic here means that we aren't treating NaNs and infinities as carefully as the C99 standard suggests (see Appendix G of the standard). The code has been like this since it was added to Theano, so we're keeping the existing behavior.

MapIter was removed from the public numpy C-API in version 2.0, so the C code for `AdvancedIncSubtensor1` cannot be adapted. Instead, we raise a NotImplementedError from this Op's C code if numpy >= 2.0, which makes it fall back to the Python implementation. The Python version, defined in `AdvancedIncSubtensor1.perform`, calls `np.add.at`, which uses `MapIter` behind the scenes. There is active development in numpy to improve the efficiency of `np.add.at`.
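For readers unfamiliar with `np.add.at`, here is a small sketch of why an increment operation with repeated indices needs it rather than plain fancy-index assignment. The arrays are made up for the example; this is not the Op's actual `perform` code.

```python
import numpy as np

x = np.zeros(3)
idx = np.array([0, 0, 2])  # index 0 appears twice

# Plain fancy-index assignment applies only one update per repeated index.
y = x.copy()
y[idx] += 1.0

# np.add.at is unbuffered, so every occurrence of an index is accumulated.
np.add.at(x, idx, 1.0)

print(y)  # [1. 0. 1.]
print(x)  # [2. 0. 1.]
```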

This was done for the python linker and numba linker. deepcopy seems to be the recommended method for copying a numpy Generator.

After this numpy PR, numpy/numpy@44ba7ca, `copy` didn't seem to actually make an independent copy of the `np.random.Generator` objects spawned by `RandomStream`. This was causing the "test values" computed by e.g. `RandomStream.uniform` to increment the RNG state, which was causing tests that rely on `RandomStream` to fail.

Here is some related discussion: numpy/numpy#24086

I didn't see any official documentation about a change in numpy that would make copy stop working.
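A minimal sketch of the property that matters here: drawing from a deep copy leaves the original generator's state untouched. The seed and variable names are arbitrary.

```python
from copy import deepcopy

import numpy as np

rng = np.random.default_rng(123)
state_before = deepcopy(rng.bit_generator.state)

# deepcopy gives a generator with its own, independent bit generator.
rng_copy = deepcopy(rng)
draw_from_copy = rng_copy.uniform()

# The original generator has not advanced ...
unchanged = rng.bit_generator.state == state_before

# ... so its next draw matches what the copy produced.
draw_from_original = rng.uniform()

print(unchanged)                             # True
print(draw_from_copy == draw_from_original)  # True
```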

numpy.random.Generator.__getstate__() now returns None; to see the state of the bit generator, you need to use Generator.bit_generator.state. This change affects `RandomGeneratorType` and several of the random tests (including some for JAX).
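As an illustration, the state can be read and restored through `bit_generator.state`, which works on both numpy 1.26 and 2.x. The seed and variable names below are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(42)

# Read the state from the bit generator rather than from the Generator itself.
saved = rng.bit_generator.state  # a plain dict

first = rng.normal(size=3)

# Restoring the saved state replays the same draws.
rng.bit_generator.state = saved
second = rng.normal(size=3)

print(type(saved).__name__)           # dict
print(np.array_equal(first, second))  # True
```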

`np.MAXDIMS` was removed from the public API and no replacement is given in the migration docs. In numpy <= 1.26, the value of `np.MAXDIMS` was 32. This was often used as a flag to mean `axis=None`. In numpy >= 2.0, the maximum number of dims of an array has been increased to 64; simultaneously, a constant `NPY_RAVEL_AXIS` was added to the C-API to indicate that `axis=None`. In most cases, the use of `np.MAXDIMS` to check for `axis=None` can be replaced by the new constant `NPY_RAVEL_AXIS`. To make this constant accessible when using numpy <= 1.26, I added a function to insert `npy_2_compat.h` into the support code for the affected ops.

In numpy 2.0, -1 as uint8 is out of bounds, whereas previously it would be converted to 255. This affected the test helper function `reduced_bitwise_and`. The helper function was changed to use 255 instead of -1 if the dtype was uint8, since this is what is needed to match the behavior of the "bitwise and" op. `reduced_bitwise_and` was only used by `TestCAReduce` in `tests/tensor/test_elemwise.py`, so it was moved there from `tests/tensor/test_math.py`
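A sketch of the idea behind that helper change: the identity element for "bitwise and" has all bits set, which is -1 for signed integers but must be written as 255 for uint8. The function name `and_identity` is hypothetical and is not the test helper itself.

```python
import numpy as np


def and_identity(dtype):
    """Hypothetical helper: the all-bits-set value for an integer dtype."""
    dtype = np.dtype(dtype)
    if dtype.kind == "u":
        # Unsigned: the largest representable value, e.g. 255 for uint8.
        return np.iinfo(dtype).max
    # Signed: -1 is in bounds and has all bits set.
    return -1


x = np.array([3, 200, 17], dtype="uint8")
ident = np.asarray(and_identity("uint8"), dtype="uint8")

print(ident)                      # 255
print(np.bitwise_and(ident, x))   # [  3 200  17]
```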

1. Changed autocaster due to new promotion rules

With "weak promotion" of python types in Numpy 2.0, the statement `1.1 == np.asarray(1.1).astype('float32')` is True, whereas in Numpy 1.26, it was false. However, in numpy 1.26, `1.1 == np.asarray([1.1]).astype('float32')` was true, so the scalar behavior and array behavior are the same in Numpy 2.0, while they were different in numpy 1.26.

Essentially, in Numpy 2.0, if python floats are used in operations with numpy floats or arrays, then the type of the numpy object will be used (i.e. the python value will be treated as the type of the numpy objects).

To preserve the behavior of `NumpyAutocaster` from numpy <= 1.26, I've added an explicit conversion of the value to be converted to a numpy type using `np.asarray` during the check that decides what dtype to cast to.

2. Updates due to new numpy conversion rules for out-of-bounds python ints

In numpy 2.0, out of bounds python ints will not be automatically converted, and will raise an `OverflowError` instead. For instance, converting 255 to int8 will raise an error, instead of returning -1.

To explicitly force conversion, we must use `np.asarray(value).astype(dtype)`, rather than `np.asarray(value, dtype=dtype)`.

The code in `TensorType.filter` has been changed to the new recommended way to downcast, and the error type caught by some tests has been changed to OverflowError from TypeError.
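Both points can be sketched in a few lines. The function `smallest_exact_float_dtype` is a hypothetical stand-in for the kind of check described above, not the actual `NumpyAutocaster` code; the key detail is that the python value is converted with `np.asarray` before comparing, so the comparison does not depend on weak promotion.

```python
import numpy as np


def smallest_exact_float_dtype(value, candidates=("float32", "float64")):
    """Hypothetical sketch: first dtype that represents `value` exactly."""
    # Convert first, so a numpy float64 (not a python float) is compared.
    as_array = np.asarray(value)
    for dtype in candidates:
        if np.all(as_array == as_array.astype(dtype)):
            return dtype
    return candidates[-1]


print(smallest_exact_float_dtype(0.5))  # float32
print(smallest_exact_float_dtype(1.1))  # float64

# Forcing a downcast of an out-of-bounds python int: convert, then astype.
wrapped = np.asarray(255).astype("int8")
print(wrapped)  # -1
```

With the explicit conversion, 1.1 is compared as a float64 against its float32 rounding, so the check gives the same answer regardless of the promotion rules in use.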

I split this test up to test uint64 separately, since this is the case discussed in Issue pymc-devs#770. I also added a test for the exact example used in that issue. The uint dtypes with lower precision should pass. The uint64 case started passing for me locally on Mac OSX, but still fails on CI. I'm not sure why this is, but at least the test will be more specific now if it fails in the future.

brendan-m-murphy force-pushed the numpy-2-updates branch 6 times, most recently from 0166c25 to acac516 Compare February 5, 2025 13:50

brendan-m-murphy marked this pull request as ready for review February 5, 2025 14:31

brendan-m-murphy marked this pull request as draft February 5, 2025 14:38

Remove accidental print statements

17748b7

brendan-m-murphy marked this pull request as ready for review February 6, 2025 09:45

brendan-m-murphy force-pushed the numpy-2-updates branch from 7ad30ee to 631f397 Compare February 6, 2025 11:05

ricardoV94 and others added 10 commits February 9, 2025 17:05

PyTorch inline constants in dispatch to avoid graph breaks (pymc-devs…

4fa9bb8

…#1118) * Split and inverse * PyTorch inline constants in dispatch to avoid graph breaks

Remove unnecessary type ignore in new version of mypy

da4960b

Implement gradient for vector repetitions

ffdde1c

Also cleans up implementation and documentation

Deprecate Chi2SF ScalarOp

60c2d92

Remove unused ScalarOp.st_impl

0b07727

Reduce overhead of Scalar python implementation

0b94be0

More direct access to special functions

7411a08

Faster python implementation of MvNormal

2823dfc

Also remove bad default values

Allow decomposition methods in MvNormal

2aecb95

brendan-m-murphy force-pushed the numpy-2-updates branch 5 times, most recently from e7b728f to ecfd045 Compare February 14, 2025 16:53

ricardoV94 added 4 commits February 17, 2025 07:03

Remove global RTOl and ATOL in test file

298bb13

Cleanup Rop tests and fix Max Rop implementation

49cf9d2

Fix bug when taking the L_op of a Scan with mit-mot and disconnected …

4aea87c

…output gradients

Handle Scan gradients of non shaped disconnected inputs

84c7802

ricardoV94 added 6 commits February 17, 2025 11:56

Cache sub-type of DimShuffle

fe8804f

Make reshape ndim keyword only

947b940

Fix bug in local_useless_reshape

141307f

Specify reshape shape length if unknown

02545ed

Refactor reshape + dimshuffle rewrites

dbf5f38

Canonicalize squeeze out of reshape and specialize back

65b96c1

brendan-m-murphy force-pushed the numpy-2-updates branch from ecfd045 to b081c3a Compare February 17, 2025 11:37

Only do reshapes in tensordot when needed

8e5e8a4

brendan-m-murphy force-pushed the numpy-2-updates branch from b081c3a to 4dcaab3 Compare February 17, 2025 14:07

jessegrabowski and others added 20 commits February 17, 2025 15:26

Implement numba dispatch for all linalg.solve modes

bbe663d

Updated lazylinker C code

910b27c

Some macros were removed from npy_3kcompat.h. Following numpy, I updated the affected functions to the Python 3 names, and removed support for Python 2. Also updated lazylinker_c version to indicate substantial changes to the C code.

Changes for deprecations in numpy 2.0 C-API

92d96ff

- replace `->elsize` by `PyArray_ITEMSIZE`
- don't use deprecated PyArray_MoveInto

Update type hint for c_code_cache_version

b20f401

Anything `Hashable` should work, but I've made the return type `tuple[Hashable]` to keep with the current style. This means, e.g., we can use strings in the cache version.

Fix for NameError in test

bce3613

I was getting a NameError from the list comprehensions saying that e.g. `pytensor_scalar` was not defined. I'm not sure why, but this is another (more verbose) way to do the same thing.

Updated doctests

45c3a01

From numpy PR numpy/numpy#22449, the repr of scalar values has changed, e.g. from "1" to "np.int64(1)", which caused two doctests to fail.
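For illustration, one way to write output that looks the same on both numpy versions is to print the value (or convert it to a python scalar) rather than rely on its repr. This is only a sketch of the difference, not necessarily how the doctests were updated.

```python
import numpy as np

value = np.int64(3)

# repr differs between versions: "3" on numpy < 2.0, "np.int64(3)" on >= 2.0.
print(repr(value))

# str() and conversion to a python int are the same on both.
print(value)               # 3
print(repr(value.item()))  # 3
```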

Preserve numpy < 2.0 Unique inverse output shape

2bfe6dd

In numpy 2.0, if axis=None, then np.unique does not flatten the inverse indices returned if return_inverse=True. A helper function has been added to npy_2_compat.py to mimic the output of `np.unique` from versions of numpy before 2.0.
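A minimal sketch of such a helper for the `axis=None` case. The name `unique_flat_inverse` is hypothetical, and the real helper in `npy_2_compat.py` may be structured differently; flattening the inverse is a no-op on numpy < 2.0 and restores the old shape on numpy >= 2.0.

```python
import numpy as np


def unique_flat_inverse(x):
    """Hypothetical helper: np.unique with a 1-d inverse, as in numpy < 2.0."""
    uniq, inverse = np.unique(x, return_inverse=True)
    # On numpy >= 2.0 the inverse has the shape of x; flatten it.
    return uniq, inverse.reshape(-1)


x = np.array([[1, 2], [2, 3]])
uniq, inverse = unique_flat_inverse(x)

print(uniq)           # [1 2 3]
print(inverse.shape)  # (4,)
print(np.array_equal(uniq[inverse], x.ravel()))  # True
```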

Fix test for neg on unsigned

cd75f95

Due to changes in numpy conversion rules (NEP 50), overflows are not ignored; in particular, negating an unsigned int causes an overflow error. The test for `neg` has been changed to check that this error is raised.

Unpinned numpy

720568c

Also added ruff numpy2 transition rule.

Added numpy 1.26.* to CI

b633bca

Remaining tests now run on latest numpy, except for Numba jobs, which need numpy 2.1.0

brendan-m-murphy force-pushed the numpy-2-updates branch from 4dcaab3 to b633bca Compare February 17, 2025 15:36
