@@ -18,25 +18,19 @@ may include (among other things) interoperability protocols, better duck typing
  support and ndarray subclass handling.

  The key goal is: *make it easy for code written for NumPy to also work with
- other NumPy-like projects.* This will enable GPU support via, e.g, CuPy or JAX,
+ other NumPy-like projects.* This will enable GPU support via, e.g., CuPy, JAX or PyTorch,
  distributed array support via Dask, and writing special-purpose arrays (either
  from scratch, or as a ``numpy.ndarray`` subclass) that work well with SciPy,
- scikit-learn and other such packages.
+ scikit-learn and other such packages. A large step forward in this area was
+ made in NumPy 2.0, with adoption of and compliance with the array API standard
+ (v2022.12, see :ref:`NEP47`). Future work in this direction will include
+ support for newer versions of the array API standard, and adding features as
+ needed based on real-world experience and needs.

- The ``__array_ufunc__`` and ``__array_function__`` protocols are stable, but
- do not cover the whole API. New protocols for overriding other functionality
- in NumPy are needed. Work in this area aims to bring to completion one or more
- of the following proposals:
-
- - :ref:`NEP30`
- - :ref:`NEP31`
- - :ref:`NEP35`
- - :ref:`NEP37`
-
- In addition we aim to provide ways to make it easier for other libraries to
- implement a NumPy-compatible API. This may include defining consistent subsets
- of the API, as discussed in `this section of NEP 37
- <https://numpy.org/neps/nep-0037-array-module.html#requesting-restricted-subsets-of-numpy-s-api>`__.
+ In addition, the ``__array_ufunc__`` and ``__array_function__`` protocols
+ fulfill a role here - they are stable and used by several downstream projects.
+ They do not cover the whole API, so use of the array API standard is preferred
+ for new code.


  Performance
@@ -46,17 +40,26 @@ Improvements to NumPy's performance are important to many users. We have
  focused this effort on Universal SIMD (see :ref:`NEP38`) intrinsics which
  provide nice improvements across various hardware platforms via an abstraction
  layer. The infrastructure is in place, and we welcome follow-on PRs to add
- SIMD support across all relevant NumPy functions.
+ SIMD support across relevant NumPy functionality.
+
+ Transitioning from C to C++, both in the SIMD infrastructure and in NumPy
+ internals more widely, is in progress. We have also started to make use of
+ Google Highway (see :ref:`NEP54`), and that usage is likely to expand. Work
+ towards support for newer SIMD instruction sets, like SVE on arm64, is ongoing.

  Other performance improvement ideas include:

- - A better story around parallel execution.
+ - A better story around parallel execution (related is support for free-threaded
+   CPython, see further down).
  - Optimizations in individual functions.
- - Reducing ufunc and ``__array_function__`` overhead.

  Furthermore we would like to improve the benchmarking system, in terms of coverage,
- easy of use, and publication of the results (now
- `here <https://pv.github.io/numpy-bench>`__) as part of the docs or website.
+ ease of use, and publication of the results. Benchmarking PRs/branches compared
+ to the `main` branch is a primary purpose, and required for PRs that are
+ performance-focused (e.g., adding SIMD acceleration to a function). In
+ addition, we'd like a performance overview like the one we had `here
+ <https://pv.github.io/numpy-bench>`__, set up in a way that is more
+ maintainable long-term.


  Documentation and website
@@ -68,69 +71,144 @@ documentation on many topics are missing or outdated. See :ref:`NEP44` for
  planned improvements. Adding more tutorials is underway in the
  `numpy-tutorials repo <https://github.com/numpy/numpy-tutorials>`__.

- Our website (https://numpy.org) was completely redesigned recently. We aim to
- further improve it by adding translations, more case studies and other
- high-level content, and more (see `this tracking issue <https://github.com/numpy/numpy.org/issues/266>`__).
+ We also intend to make all the example code in our documentation interactive -
+ work is underway to do so via ``jupyterlite-sphinx`` and Pyodide.
+
+ Our website (https://numpy.org) is in good shape. Further work on expanding the
+ number of languages that the website is translated into is desirable, as are
+ improvements to the interactive notebook widget, through JupyterLite.


  Extensibility
  -------------

- We aim to make it much easier to extend NumPy. The primary topic here is to
- improve the dtype system - see :ref:`NEP41` and related NEPs linked from it.
- Concrete goals for the dtype system rewrite are:
-
- - Easier custom dtypes:
+ We aim to continue making it easier to extend NumPy. The primary topic here is to
+ improve the dtype system - see for example :ref:`NEP41` and related NEPs linked
+ from it. In NumPy 2.0, a `new C API for user-defined dtypes <https://numpy.org/devdocs/reference/c-api/array.html#custom-data-types>`__
+ was made public. We aim to encourage its usage and improve this API further,
+ including support for writing a dtype in Python.

-   - Simplify and/or wrap the current C-API
-   - More consistent support for dtype metadata
-   - Support for writing a dtype in Python
+ Ideas for new dtypes that may be developed outside of the main NumPy repository
+ first, and that could potentially be upstreamed into NumPy later, include:

- - Allow adding (a) new string dtype(s). This could be encoded strings with
-   fixed-width storage (e.g., ``utf8`` or ``latin1``), and/or a variable length
-   string dtype. The latter could share an implementation with ``dtype=object``,
-   but be explicitly type-checked.
-   One of these should probably be the default for text data. The current
-   string dtype support is neither efficient nor user friendly.
+ - A quad-precision (128-bit) dtype
+ - A ``bfloat16`` dtype
+ - A fixed-width string dtype which supports encodings (e.g., ``utf8`` or
+   ``latin1``)
+ - A unit dtype


  User experience
  ---------------

  Type annotations
  ````````````````
- NumPy 1.20 adds type annotations for most NumPy functionality, so users can use
- tools like `mypy`_ to type check their code and IDEs can improve their support
+ Type annotations for most NumPy functionality are complete (although some
+ submodules like ``numpy.ma`` are missing return types), so users can use tools
+ like `mypy`_ to type check their code and IDEs can improve their support
  for NumPy. Improving those type annotations, for example to support annotating
- array shapes and dtypes, is ongoing.
+ array shapes (see `gh-16544 <https://github.com/numpy/numpy/issues/16544>`__),
+ is ongoing.

  Platform support
  ````````````````
  We aim to increase our support for different hardware architectures. This
  includes adding CI coverage when CI services are available, providing wheels on
- PyPI for POWER8/9 (``ppc64le``), providing better build and install
- documentation, and resolving build issues on other platforms like AIX.
+ PyPI for platforms that are in high enough demand (e.g., we added ``musllinux``
+ ones for NumPy 2.0), and resolving build issues on platforms that we don't test
+ in CI (e.g., AIX).
+
+ We intend to write a NEP covering the support levels we provide and what is
+ required for a platform to move to a higher tier of support, similar to
+ `PEP 11 <https://peps.python.org/pep-0011/>`__.
+
+ Support for free-threaded CPython
+ `````````````````````````````````
+ CPython 3.13 will be the first release to offer a free-threaded build (i.e.,
+ a CPython build with the GIL disabled). Work is in progress to support this
+ well in NumPy. After that is stable and complete, there may be opportunities to
+ actually make use of the potential for performance improvements from
+ free-threaded CPython, or make it easier to do so for NumPy's users.
+
+ Binary size reduction
+ `````````````````````
+ The number of downloads of NumPy from PyPI and other platforms continues to
+ increase - as of May 2024 we're at >200 million downloads/month from PyPI
+ alone. Reducing the size of an installed NumPy package has many benefits:
+ faster installs, lower disk space usage, smaller load on PyPI, less
+ environmental impact, easier to fit more packages on top of NumPy in
+ resource-constrained environments and platforms like AWS Lambda, lower latency
+ for Pyodide users, and so on. We aim for significant reductions, as well as
+ making it easier for end users and packagers to produce smaller custom builds
+ (e.g., we added support for stripping tests before 2.1.0). See
+ `gh-25737 <https://github.com/numpy/numpy/issues/25737>`__ for details.
+
+ Support use of CPython's limited C API
+ ``````````````````````````````````````
+ Use of the CPython limited C API, which allows producing ``abi3`` wheels that
+ use the stable ABI and are hence independent of CPython feature releases, has
+ benefits both for downstream packages that use NumPy's C API and for NumPy
+ itself. In NumPy 2.0, work was done to enable using the limited C API with
+ the Cython support in NumPy (see `gh-25531 <https://github.com/numpy/numpy/pull/25531>`__).
+ More work and testing is needed to ensure full support for downstream packages.
+
+ We also want to explore what is needed for NumPy itself to use the limited
+ C API - this would make testing new CPython dev and pre-release versions across
+ the ecosystem easier, and significantly reduce the maintenance effort for CI
+ jobs in NumPy itself.
+
+ Create a header-only package for NumPy
+ ``````````````````````````````````````
+ We have reduced the platform-dependent content in the public NumPy headers to
+ almost nothing. It is now feasible to create a separate package with only
+ NumPy headers and a discovery mechanism for them, in order to enable downstream
+ packages to build against the NumPy C API without having NumPy installed.
+ This will make it easier/cheaper to use NumPy's C API, especially on more
+ niche platforms for which we don't provide wheels.
+
+
+ NumPy 2.0 stabilization & downstream usage
+ ------------------------------------------
+
+ We made a very large number of changes (and improvements!) in NumPy 2.0. The
+ release process has taken a very long time, and part of the ecosystem is still
+ catching up. We may need to slow down for a while, and possibly help the rest
+ of the ecosystem with adapting to the ABI and API changes.
+
+ We will need to assess the costs and benefits to NumPy itself,
+ downstream package authors, and end users. Based on that assessment, we need to
+ come to a conclusion on whether it's realistic to do another ABI-breaking
+ release in the future or not. This will also inform the future evolution
+ of our C API.
+
+
+ Security
+ --------
+
+ NumPy is quite secure - we get only a limited number of reports about potential
+ vulnerabilities, and most of those are incorrect. We have made strides with a
+ documented security policy, a private disclosure method, and maintaining an
+ OpenSSF scorecard (with a high score). However, we have not changed much in how
+ we approach supply chain security in quite a while. We aim to make improvements
+ here, for example achieving fully reproducible builds for all the build
+ artifacts we publish - and providing full provenance information for them.


  Maintenance
  -----------

- - ``MaskedArray`` needs to be improved, ideas include:
+ - ``numpy.ma`` is still in poor shape and under-maintained. It needs to be
+   improved, ideas include:

    - Rewrite masked arrays to not be a ndarray subclass -- maybe in a separate project?
    - MaskedArray as a duck-array type, and/or
    - dtypes that support missing values

- - Fortran integration via ``numpy.f2py`` requires a number of improvements, see
-   `this tracking issue <https://github.com/numpy/numpy/issues/14938>`__.
- - A backend system for ``numpy.fft`` (so that e.g. ``fft-mkl`` doesn't need to monkeypatch numpy).
  - Write a strategy on how to deal with overlap between NumPy and SciPy for ``linalg``.
- - Deprecate ``np.matrix`` (very slowly).
+ - Deprecate ``np.matrix`` (very slowly) - this is feasible once the switch-over
+   from sparse matrices to sparse arrays in SciPy is complete.
  - Add new indexing modes for "vectorized indexing" and "outer indexing" (see :ref:`NEP21`).
  - Make the polynomial API easier to use.
- - Integrate an improved text file loader.
- - Ufunc and gufunc improvements, see `gh-8892 <https://github.com/numpy/numpy/issues/8892>`__
-   and `gh-11492 <https://github.com/numpy/numpy/issues/11492>`__.


  .. _`mypy`: https://mypy.readthedocs.io