Closed
Changes from 36 commits
Commits
90 commits
ea9c544
ENH: first draft of asarray for array_api
tupui Apr 21, 2023
28a6339
ENH: add global config env variable
tupui Apr 25, 2023
65aca1e
TST: add test infra for USE_ARRAY_API
tupui Apr 25, 2023
912e55a
TST: add some basic test cases
tupui Apr 25, 2023
de56fe7
FIX: namespace simplification for NumPy like
tupui Apr 25, 2023
ca5ff59
MAINT: refactor namespace_from_arrays to array_namespace to ease tran…
tupui Apr 26, 2023
cec8bd2
FIX: consider numpy.array_api as something else than numpy
tupui Apr 26, 2023
94d3044
ENH: add to_numpy helper
tupui Apr 26, 2023
2ef5892
ENH: use SCIPY_ARRAY_API in array_namespace and fallback to np
tupui Apr 26, 2023
ad770b8
ENH: add a compliancy layer
tupui Apr 26, 2023
039e931
ENH: dynamic env variable.
tupui Apr 26, 2023
bb8e89c
ENH: swap order compliance/flag and use directly xp.asarray
tupui Apr 27, 2023
c4a2c7f
ENH: only allow Array API arrays
tupui Apr 27, 2023
31a0300
DOC: add comprehensive docstrings
tupui Apr 27, 2023
3732e1c
ENH: add support for Array API in hierarchy
tupui May 4, 2023
1ebc347
Merge remote-tracking branch 'upstream/main' into array_api
tupui May 4, 2023
4333c99
MAINT: add missing type conversion and asarray
tupui May 9, 2023
068b4fa
ENH: add check_finite to asarray
tupui May 9, 2023
d72de86
ENH: add array api support to vq
tupui May 9, 2023
d748ed1
BUG: fix missing xp in private functions
tupui May 10, 2023
0a1aa67
BUG: fix asarray mixup
tupui May 10, 2023
2a4f2f7
TST: revert array_api_compatible in test_vq
tupui May 10, 2023
633648b
MAINT: unused import
tupui May 10, 2023
735101a
MAINT: refactor asarray to as_xparray
tupui May 11, 2023
eb79de0
MAINT: refactor asarray_namespace to as_xparray_namespace
tupui May 11, 2023
b4b9f32
BUG: fix isfinite check
tupui May 11, 2023
1623d0e
CI: add pytorch cpu workflow
tupui May 11, 2023
abdb9b2
ENH: handle None case
tupui May 11, 2023
c432d88
CI: change name workflow
tupui May 12, 2023
603fbf3
Merge remote-tracking branch 'upstream/main' into array_api
tupui May 12, 2023
a0b06cf
ENH: first draft to have --array-api-backend as option in dev.py
tupui May 12, 2023
37c4c0f
MAINT: add 'all' and use mechanism in CI
tupui May 12, 2023
afc5fdc
BUG: fix tuple empty check
tupui May 12, 2023
f900a4d
BUG: fix boolean case
tupui May 12, 2023
d6b568a
TST: add skip_if_array_api marker
tupui May 12, 2023
560b591
BUG: fix kmeans and kmeans2 scalar handling
tupui May 12, 2023
9ea70e6
TST: adjust test_vq
tupui May 12, 2023
e54c36d
TST: adjust test_hierarchy
tupui May 12, 2023
9da0cd9
MAINT: dtype('float64') to float64
tupui May 12, 2023
85c7045
BUG: fix isfinite for torch
tupui May 12, 2023
8bc56ab
TST: start to adjust test_vq for PyTorch
tupui May 12, 2023
2eb7791
ENH: add size helper
tupui May 15, 2023
4596247
MAINT: refactor np.size usages
tupui May 15, 2023
67ac6e0
MAINT: size from array_api_compat
tupui May 15, 2023
52e982d
MAINT: add isdtype from scikit-learn
tupui May 15, 2023
d8f66dd
MAINT: some array conversion
tupui May 15, 2023
488e3c3
TST: add more coverage for vq
tupui May 15, 2023
77feb2f
BUG: some fix for _convert_to_double
tupui May 15, 2023
d2d2e8d
FIX: std error
tupui May 23, 2023
e395445
MAINT: return non vanilla np but array API
tupui May 23, 2023
1146129
BUG: fix benchmark using float128
tupui May 23, 2023
bc6c61b
MAINT: mitigate dtype conversion
tupui May 23, 2023
61d5860
TST: fix array API name
tupui May 23, 2023
dd2cab5
BUG: fix xp.all
tupui May 24, 2023
0469887
ENH: add cupy support
tupui Jun 1, 2023
276d3f5
ENH: add support for PyTorch mps mode
tupui Jun 1, 2023
06e59f3
MAINT: simplify backend selection logic
tupui Jun 1, 2023
82a8aa3
MAINT/TST: add device and skip logic for non CPU
tupui Jun 1, 2023
de2257f
MAINT: change device for cupy
tupui Jun 1, 2023
85fccda
TST: skip some mps incompatible tests
tupui Jun 1, 2023
0c90e8d
TST/MAINT: address some failures with Cupy
tupui Jun 5, 2023
1d090e6
Merge remote-tracking branch 'upstream/main' into array_api
tupui Jun 12, 2023
1f3856f
TST/MAINT: fix copy and a few conversions
tupui Jun 12, 2023
ea6f58b
TST: fix conftest backend selection
tupui Jun 12, 2023
a33efc6
BUG: fix dtype check for int
tupui Jun 12, 2023
4dbde35
TST/MAINT: fix dot and coverage
tupui Jun 12, 2023
bef978c
MAINT: fix matmul specialization
tupui Jun 13, 2023
60e3103
TST: add more mps skipping
tupui Jun 13, 2023
b201851
CI: fix pytorch version
tupui Jun 14, 2023
0f022a1
TST/MAINT: some rtol and type adjustments
tupui Jun 14, 2023
226ac88
MAINT/TST: fix some type conversions and methods to functions
tupui Jun 16, 2023
f7f2568
ENH: add convenient function atleast_nd
tupui Jun 16, 2023
050a96e
TST: fix astype
tupui Jun 16, 2023
f2dfb3c
MAINT: fix mypy
tupui Jun 16, 2023
4787f50
MAINT: add array-api-compat in the deps for now.
tupui Jun 16, 2023
a38cd2e
MAINT: remove custom isdtype in favour of xp.isdtype
tupui Jun 20, 2023
2d12ad4
TST: move to_numpy helper to tests
tupui Jun 20, 2023
7aea9d3
Revert "MAINT: add array-api-compat in the deps for now."
tupui Jun 20, 2023
0e2c5b1
MAINT: add array-api-compat as a submodule
tupui Jun 20, 2023
15286a6
MAINT: add to meson in a "verbose" way
tupui Jun 20, 2023
92814ac
MAINT: adjust array_api_compat imports.
tupui Jun 20, 2023
b7455f2
MAINT: fix meson for array_api_compat tests.
tupui Jun 20, 2023
5744618
TST: fix import and collection
tupui Jun 21, 2023
5033b54
MAINT: ignore MyPy errors in vendored array_api_compat.
tupui Jun 21, 2023
8bd49ae
TST: add skip_if_array_api_backend
tupui Jun 21, 2023
50d7ea4
TST: remove some np.matrix tests
tupui Jun 21, 2023
84cfe65
CI: separate step for installing torch [skip cirrus] [skip circle]
tupui Jun 22, 2023
b9fe722
CI: fix deps definition [skip cirrus] [skip circle]
tupui Jun 22, 2023
98f8309
TST: fix string comparison of 'array_api_compat.numpy' due to submodule.
tupui Jun 22, 2023
f0c2ca4
CI: ensure pytest config is used and fix some mypy.
tupui Jun 22, 2023
85 changes: 85 additions & 0 deletions .github/workflows/array_api.yml
@@ -0,0 +1,85 @@
name: Array API

on:
  push:
    branches:
      - maintenance/**
  pull_request:
    branches:
      - main
      - maintenance/**

permissions:
  contents: read  # to fetch code (actions/checkout)

env:
  CCACHE_DIR: "${{ github.workspace }}/.ccache"
  INSTALLDIR: "build-install"

concurrency:
  group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
  cancel-in-progress: true

jobs:
  pytorch_cpu:
    name: Linux PyTorch CPU
    # if: "github.repository == 'scipy/scipy' || github.repository == ''"
    runs-on: ubuntu-22.04
    strategy:
      matrix:
        python-version: ['3.11']
        maintenance-branch:
          - ${{ contains(github.ref, 'maintenance/') || contains(github.base_ref, 'maintenance/') }}
        exclude:
          - maintenance-branch: true

    steps:
      - uses: actions/checkout@v3
        with:
          submodules: recursive

      - name: Setup Python
        uses: actions/setup-python@v4
        with:
          python-version: ${{ matrix.python-version }}
          cache: 'pip'  # not using a path to also cache pytorch

      - name: Install Ubuntu dependencies
        run: |
          sudo apt-get update
          sudo apt-get install -y libopenblas-dev libatlas-base-dev liblapack-dev gfortran libgmp-dev libmpfr-dev libsuitesparse-dev ccache libmpc-dev

      - name: Install Python packages
        run: |
          python -m pip install numpy cython pytest pytest-xdist pytest-timeout pybind11 mpmath gmpy2 pythran ninja meson click rich-click doit pydevtool pooch
          # Packages for Array API testing
          python -m pip install array-api-compat
          python -m pip install torch --index-url https://download.pytorch.org/whl/cpu

      - name: Prepare compiler cache
        id: prep-ccache
        shell: bash
        run: |
          mkdir -p "${CCACHE_DIR}"
          echo "dir=$CCACHE_DIR" >> $GITHUB_OUTPUT
          NOW=$(date -u +"%F-%T")
          echo "timestamp=${NOW}" >> $GITHUB_OUTPUT

      - name: Setup compiler cache
        uses: actions/cache@v3
        id: cache-ccache
        with:
          path: ${{ steps.prep-ccache.outputs.dir }}
          key: ${{ github.workflow }}-${{ matrix.python-version }}-ccache-linux-${{ steps.prep-ccache.outputs.timestamp }}
          restore-keys: |
            ${{ github.workflow }}-${{ matrix.python-version }}-ccache-linux-

      - name: Setup build and install scipy
        run: |
          python dev.py build

      - name: Test SciPy
        run: |
          export OMP_NUM_THREADS=2
          export SCIPY_USE_PROPACK=1
          python dev.py --no-build test --array-api-backend pytorch -s cluster -- --durations 10 --timeout=60
11 changes: 11 additions & 0 deletions dev.py
@@ -685,6 +685,7 @@ class Test(Task):
    $ python dev.py test -t scipy.optimize.tests.test_minimize_constrained
    $ python dev.py test -s cluster -m full --durations 20
    $ python dev.py test -s stats -- --tb=line  # `--` passes next args to pytest
    $ python dev.py test -b numpy -b pytorch -s cluster
    ```
    """ # noqa: E501
    ctx = CONTEXT
@@ -716,6 +717,13 @@ class Test(Task):
        ['--parallel', '-j'], default=1, metavar='N_JOBS',
        help="Number of parallel jobs for testing"
    )
    array_api_backend = Option(
        ['--array-api-backend', '-b'], default=None, metavar='ARRAY_BACKEND',
        multiple=True,
        help=(
            "Array API backend ('all', 'numpy', 'pytorch', 'numpy.array_api')."
        )
    )
    # Argument can't have `help=`; used to consume all of `-- arg1 arg2 arg3`
    pytest_args = Argument(
        ['pytest_args'], nargs=-1, metavar='PYTEST-ARGS', required=False
@@ -756,6 +764,9 @@ def scipy_tests(cls, args, pytest_args):
        else:
            tests = None

        if len(args.array_api_backend) != 0:
            os.environ['SCIPY_ARRAY_API'] = json.dumps(list(args.array_api_backend))

        runner, version, mod_path = get_test_runner(PROJECT_MODULE)
        # FIXME: changing CWD is not a good practice
        with working_dir(dirs.site):
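For readers following how the `-b` option reaches the test session: with `multiple=True` the option arrives as a tuple of names, which the hunk above serializes to JSON before storing it in `SCIPY_ARRAY_API`. A minimal sketch of that round trip follows. The decoding side, `decode_backends`, is a hypothetical helper for illustration and not code shown in this diff.

```python
import json


def encode_backends(backends):
    """Serialize backend names the same way the dev.py hunk does."""
    return json.dumps(list(backends))


def decode_backends(raw):
    """Hypothetical consumer: recover the backend names from the env value."""
    return json.loads(raw)


raw = encode_backends(("numpy", "pytorch"))
backends = decode_backends(raw)
print(raw)
print(backends)
```

Note that the stored value is a JSON string, so whatever reads the variable has to decode it before comparing against backend names.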
1 change: 1 addition & 0 deletions pytest.ini
@@ -14,3 +14,4 @@ filterwarnings =
    ignore:.*The distutils.* is deprecated.*:DeprecationWarning
    ignore:\s*.*numpy.distutils.*:DeprecationWarning
    ignore:.*The --rsyncdir command line argument.*:DeprecationWarning
    ignore:.*The numpy.array_api submodule is still experimental.*:UserWarning
186 changes: 186 additions & 0 deletions scipy/_lib/_array_api.py
@@ -0,0 +1,186 @@
"""Utility functions to use Python Array API compatible libraries.

For the context about the Array API see:
https://data-apis.org/array-api/latest/purpose_and_scope.html

The SciPy use case of the Array API is described on the following page:
https://data-apis.org/array-api/latest/use_cases.html#use-case-scipy
"""
import os

import numpy as np
from numpy.core.numerictypes import typecodes
# probably want to vendor it (submodule)
import array_api_compat
import array_api_compat.numpy

__all__ = ['array_namespace', 'as_xparray', 'as_xparray_namespace']


# SCIPY_ARRAY_API, array_api_dispatch is used by sklearn
array_api_dispatch = os.environ.get("array_api_dispatch", False)
SCIPY_ARRAY_API = os.environ.get("SCIPY_ARRAY_API", array_api_dispatch)

_GLOBAL_CONFIG = {"SCIPY_ARRAY_API": SCIPY_ARRAY_API}
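A note on the two `os.environ.get` calls above: environment values are always strings, so any non-empty value, including `"0"` or `"false"`, is truthy when tested with `if not ...`. If that is not the intent, the value could be parsed explicitly. The sketch below is only an illustration; `parse_flag` and the set of spellings it treats as "off" are assumptions, not part of this PR.

```python
def parse_flag(raw):
    """Hypothetical helper: interpret an environment value as a boolean."""
    if raw is None:
        return False
    # The accepted "off" spellings are an assumption made for this sketch.
    return raw.strip().lower() not in {"", "0", "false", "no", "off"}


# A plain truthiness test treats every non-empty string as enabled.
print(bool("0"), parse_flag("0"))
print(bool('["pytorch"]'), parse_flag('["pytorch"]'))
```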


def compliance_scipy(*arrays):
    """Raise exceptions on known-bad subclasses.

    The following subclasses are not supported and raise an error:
    - `np.ma.MaskedArray`
    - `numpy.matrix`
    - Any array-like which is not Array API compatible
    """
    for array in arrays:
        if isinstance(array, np.ma.MaskedArray):
            raise TypeError("'numpy.ma.MaskedArray' are not supported")
        elif isinstance(array, np.matrix):
            raise TypeError("'numpy.matrix' are not supported")
        elif not array_api_compat.is_array_api_obj(array):
            raise TypeError("Only support Array API compatible arrays")
        elif array.dtype is np.dtype('O'):
            raise ValueError('object arrays are not supported')
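To make the effect of these checks concrete, here is a small NumPy-only sketch of the subclass and dtype checks. It leaves out the `array_api_compat.is_array_api_obj` test so that it runs without the vendored package, and it compares dtypes with `==` rather than `is`; both are simplifications made for this illustration. `reject_known_bad` and `is_rejected` are hypothetical names.

```python
import numpy as np


def reject_known_bad(*arrays):
    """Sketch of the subclass and dtype checks, NumPy inputs only."""
    for array in arrays:
        if isinstance(array, np.ma.MaskedArray):
            raise TypeError("'numpy.ma.MaskedArray' are not supported")
        if isinstance(array, np.matrix):
            raise TypeError("'numpy.matrix' are not supported")
        # `==` instead of `is`: an assumption of this sketch, so the check
        # does not rely on dtype object identity.
        if array.dtype == np.dtype("O"):
            raise ValueError("object arrays are not supported")


def is_rejected(array):
    """Return True if the sketch above refuses `array`."""
    try:
        reject_known_bad(array)
    except (TypeError, ValueError):
        return True
    return False


masked = np.ma.masked_array([1.0, 2.0], mask=[False, True])
objects = np.array([1, "a"], dtype=object)
plain = np.array([1.0, 2.0])
print(is_rejected(masked), is_rejected(objects), is_rejected(plain))
```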


def _check_finite(array, xp):
    """Check for NaNs or Infs."""
    # same as np.asarray_chkfinite
    if array.dtype.char in typecodes['AllFloat'] and not xp.isfinite(array).all():

I'm a bit confused here--my experimentation suggested that the single-character typecodes dtype.char business wasn't supported by torch nor required by the array API?

Owner Author

Yes I just saw that on my side now haha 😉

        raise ValueError(
            "array must not contain infs or NaNs"
        )
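Following up on the `dtype.char` concern raised in the thread above, one possible shape for the check that avoids typecodes and array methods altogether is sketched below. It assumes `xp.isfinite` accepts every numeric dtype the caller passes and that `xp.all` is available as a function; only NumPy is exercised here, so behaviour on other backends is not verified. `check_finite` is a hypothetical name, not the final helper of this PR.

```python
import numpy as np


def check_finite(array, xp):
    """Raise if `array` contains NaN or Inf, using only namespace functions."""
    # xp.all(...) is called as a function rather than the `.all()` method,
    # and no dtype typecode lookup is needed.
    if not xp.all(xp.isfinite(array)):
        raise ValueError("array must not contain infs or NaNs")


check_finite(np.asarray([0.0, 1.0]), np)  # finite input: no exception

try:
    check_finite(np.asarray([0.0, np.nan]), np)
except ValueError as exc:
    print(exc)
```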


def array_namespace(*arrays):
    """Get the array API compatible namespace for the given arrays.

    Parameters
    ----------
    *arrays : sequence of array_like
        Arrays used to infer the common namespace.

    Returns
    -------
    namespace : module
        Common namespace.

    Notes
    -----
    Thin wrapper around `array_api_compat.array_namespace`.

    1. Check for the global switch: SCIPY_ARRAY_API. This can also be accessed
       dynamically through ``_GLOBAL_CONFIG['SCIPY_ARRAY_API']``.
    2. `compliance_scipy` raises exceptions on known-bad subclasses. See
       its definition for more details.

    When the global switch is False, it defaults to the `numpy` namespace.
    In that case, there is no compliance check. This is a convenience to
    ease the adoption. Otherwise, arrays must comply with the new rules.
    """
    if not _GLOBAL_CONFIG["SCIPY_ARRAY_API"]:
        # here we could wrap the namespace if needed
        return np

    arrays = [array for array in arrays if array is not None]

    compliance_scipy(*arrays)

    return array_api_compat.array_namespace(*arrays)


def as_xparray(
    array, dtype=None, order=None, copy=None, *, xp=None, check_finite=True
):
    """Drop-in replacement for `np.asarray`.

    Memory layout parameter `order` is not exposed in the Array API standard.
    `order` is only enforced if the input array implementation
    is NumPy based, otherwise `order` is just silently ignored.

    The purpose of this helper is to make it possible to share code for data
    container validation without memory copies for both downstream use cases.
    """
    if xp is None:
        xp = array_namespace(array)
    if xp.__name__ in {"numpy", "array_api_compat.numpy", "numpy.array_api"}:

No one's actually doing it yet as far as I know, but this wouldn't work if someone vendors array_api_compat and tries to call a scipy function.

Also I don't know if it makes sense to list numpy.array_api here. That namespace is designed to only support a strict implementation of the standard, which doesn't include order.

For scikit-learn, we wanted the performance with numpy.array_api to be the same as numpy. When one explicitly sets the order, there is usually a performance reason for doing so. For example:

def scipy_func(X):
    xp = array_namespace(X)
    
    # switch order for performance reasons
    X_f = asarray(X, xp, order="F")
    
    # Do some operations that prefer F-ordered input.
    return xp.sum(X_f, axis=0)

By the way something like

_X = numpy.asarray(..., order="F")
X = numpy.array_api.asarray(_X)

will also work. That's maybe a little more "spec compliant" in the sense that converting arrays from one library to another with asarray is supported. In this case it's a trivial zero-copy wrapping but in general it will use DLPack (although I don't know how DLPack handles order, so maybe someone could confirm whether this would actually work in a more general setting).

Actually that's wrong. I thought asarray in the spec used dlpack, but it's only numpy.asarray that does. In the spec you have to use from_dlpack (I'm not sure why they are separate).

        # Use NumPy API to support order
        if copy is True:
            array = np.array(array, order=order, dtype=dtype)
        else:
            array = np.asarray(array, order=order, dtype=dtype)

        # At this point array is a NumPy ndarray. We convert it to an array
        # container that is consistent with the input's namespace.
        array = xp.asarray(array)
    else:
        array = xp.asarray(array, dtype=dtype, copy=copy)

    if check_finite:
        _check_finite(array, xp)

    return array
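The NumPy branch of `as_xparray` relies on the difference between `np.array` (copies by default) and `np.asarray` (copies only when needed), and on `order` forcing a layout. A short NumPy-only illustration of those three behaviours, independent of the helper itself:

```python
import numpy as np

a = np.arange(6.0).reshape(2, 3)      # C-contiguous input

same = np.asarray(a)                  # no dtype or order change: no copy
fortran = np.asarray(a, order="F")    # layout change forces a copy
copied = np.array(a)                  # np.array copies by default

print(same is a)
print(fortran.flags["F_CONTIGUOUS"], np.shares_memory(fortran, a))
print(np.shares_memory(copied, a))
```

This is why `order` can only be honoured for NumPy-based inputs in the code above: the layout request is expressed through NumPy's own `asarray`, which the standard's `asarray` has no parameter for.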


def as_xparray_namespace(*arrays):
    """Validate and convert arrays to a common namespace.

    Parameters
    ----------
    *arrays : sequence of array_like
        Arrays to validate and convert.

    Returns
    -------
    *arrays : sequence of array_like
        Validated and converted arrays to the common namespace.
    namespace : module
        Common namespace.

    Notes
    -----
    This function is meant to be called from each public function in a SciPy
    submodule. It does the following:

    1. Check for the global switch: SCIPY_ARRAY_API. This can also be accessed
       dynamically through ``_GLOBAL_CONFIG['SCIPY_ARRAY_API']``.
    2. `compliance_scipy` raises exceptions on known-bad subclasses. See
       its definition for more details.
    3. Determine the namespace, without doing any coercion of array(-like)
       inputs.
    4. Call `xp.asarray` on all arrays.

    Examples
    --------
    >>> import numpy as np
    >>> x, y, xp = as_xparray_namespace(np.array([0, 1, 2]), np.array([0, 1, 2]))
    >>> xp.__name__
    'array_api_compat.numpy'
    >>> x, y
    (array([0, 1, 2]), array([0, 1, 2]))

    """
    arrays = list(arrays)
    xp = array_namespace(*arrays)

    for i, array in enumerate(arrays):
        arrays[i] = xp.asarray(array)

    return *arrays, xp


def to_numpy(array, xp):
    """Convert `array` into a NumPy ndarray on the CPU.

    ONLY FOR TESTING
    """
    xp_name = xp.__name__

    if xp_name in {"array_api_compat.torch", "torch"}:
        return array.cpu().numpy()

You don't want to do this. Just using np.asarray(array) is better. As a rule, never do silent device transfers from GPU to CPU or vice versa. This is true for CuPy too - that should simply error.

This is specially useful to pass arrays to Cython.

This conversation was specifically about PyTorch CPU tensors. There may be other cpu array libraries where this works. But basically, this function doesn't seem needed, in any code where you planned to use this I think you want np.asarray instead.

Owner Author

Ok I will remove 👍 This was in sklearn so I thought this was more "optimal" to do the conversion using that.

Hmm interesting. @thomasjpfan I thought you preferred exceptions?

@thomasjpfan thomasjpfan Apr 27, 2023

Scikit-learn has a function to do device transfer strictly for testing purposes and is not part of public API. Scikit-learn checks that the GPU implementation gives similar results as the CPU implementation.

This type of testing does not have full coverage, but it's nice to have some checks to make sure the GPU code paths do something reasonable.

Ah yes, for testing it makes sense indeed. As long as we figure out how to make sure to not accidentally introduce it into the code base outside of testing.

I added some utilities for testing back on the host from the device in my branch.. my approach was quite different though.. perhaps I can try to make a PR to this branch but...

    elif xp_name == "cupy.array_api":
        return array._array.get()
    elif xp_name in {"array_api_compat.cupy", "cupy"}:  # pragma: nocover
        return array.get()

    return np.asarray(array)
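As a sketch of the "testing only" use discussed in the thread above, a comparison helper could convert with plain `np.asarray`, as suggested in the review, and check the result against a NumPy reference. `assert_matches_reference` is a hypothetical name and is only exercised with CPU data here; how a given GPU backend reacts to `np.asarray` is not verified by this example.

```python
import numpy as np


def assert_matches_reference(result, reference, rtol=1e-7):
    """Hypothetical test helper: compare a backend result to a NumPy reference."""
    # Plain np.asarray is used for the conversion, per the review suggestion,
    # so no explicit device transfer is performed by this helper.
    np.testing.assert_allclose(np.asarray(result), reference, rtol=rtol)


assert_matches_reference([1.0, 2.0], np.array([1.0, 2.0]))
```

Keeping such a helper under the test tree, rather than in `scipy/_lib/_array_api.py`, is one way to reduce the risk of it being called from library code.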
1 change: 1 addition & 0 deletions scipy/_lib/meson.build
@@ -98,6 +98,7 @@ py3.extension_module('messagestream',

python_sources = [
  '__init__.py',
  '_array_api.py',
  '_bunch.py',
  '_ccallback.py',
  '_disjoint_set.py',
1 change: 1 addition & 0 deletions scipy/_lib/tests/meson.build
@@ -12,6 +12,7 @@ python_sources = [
  'test_public_api.py',
  'test_tmpdirs.py',
  'test_warnings.py',
  'test_array_api.py',
]

py3.install_sources(