Commit 2561ef9

Add file and images

MarsBar authored and Leegabalafou committed
1 parent cf3da05

3 files changed: +333, -0 lines

apps/labs/posts/uarray-intro.md

Lines changed: 331 additions & 0 deletions

---
title: "`uarray`: A Generic Override Framework for Methods"
author: hameer-abbasi
published: April 30, 2019
description: 'The problem is, stated simply: How do we use all of the PyData libraries in tandem, moving seamlessly from one to the other, without actually changing the API, or even the imports?'
category: [PyData ecosystem]
featuredImage:
  src: /posts/hello-world-post/blog_hero_var2.svg
  alt: 'An illustration of a brown and a white hand coming towards each other to pass a business card with the logo of Quansight Labs.'
hero:
  imageSrc: /posts/hello-world-post/blog_feature_org/svg
  imageAlt: 'An illustration of a brown hand holding up a microphone, with some graphical elements highlighting the top of the microphone.'
---

`uarray` is an override framework for methods in Python. The
scientific Python ecosystem, like other similar places, has one
recurring problem: several tools exist to do the same job, but they
don't conform to a single, well-defined API. `uarray` tries to solve
this problem in general, and for the scientific Python ecosystem in
particular, by defining APIs independently of their implementations.

## Array Libraries in the Scientific Python Ecosystem

When SciPy was created, and Numeric and Numarray were unified into
NumPy, it jump-started Python's data science community. The ecosystem
grew quickly: academics started moving to SciPy, and the Scikits that
popped up made the transition all the smoother.

However, the scientific Python community also shifted during that time:
GPUs and distributed computing emerged. There were also older ideas,
such as sparse arrays, that couldn't really be used with NumPy's API.
To solve these problems, various libraries emerged:

- Dask, for distributed NumPy
- CuPy, for NumPy on Nvidia-branded GPUs
- PyData/Sparse, a project started to make sparse arrays conform to
  the NumPy API
- Xnd, which extends the type system and the universal function
  concept found in NumPy

Yet other libraries emerged as well: PyTorch, which mimics NumPy to a
certain degree; TensorFlow, which defines its own API; and MXNet,
another deep learning framework that mimics NumPy.

## The Problem

The problem is, stated simply: How do we use all of these libraries in
tandem, moving seamlessly from one to the other, without actually
changing the API, or even the imports? How do we take functions written
for one library and allow them to be used by another, without, as Travis
Oliphant so eloquently puts it, "re-writing the world"?

In my mind, the goals are (stated abstractly):

1.  Methods that are not tied to a specific implementation.

    - For example, `np.arange`.

2.  Backends that implement these methods.

    - NumPy, Dask and PyTorch are all examples of this.

3.  Coercion of objects to other forms to move between backends.

    - This means converting a NumPy array to a Dask array, and vice versa.

In addition, we want to be able to do this for arbitrary objects, so
`dtype`s, `ufunc`s etc. should also be dispatchable and coercible.

## The Solution?

With that said, let's dive into `uarray`. If you're not interested in
the gory details, you can jump down to
[this section](#how-to-use-it).

``` python
import uarray as ua

# Let's ignore this for now
def myfunc_rd(a, kw, d):
    return a, kw

# We define a multimethod
@ua.create_multimethod(myfunc_rd)
def myfunc():
    return () # Let's also ignore this for now

# Now let's define two backends!
be1 = ua.Backend()
be2 = ua.Backend()

# And register their implementations for the method!
@ua.register_implementation(myfunc, backend=be1)
def myfunc_be1(): # Note that it has exactly the same signature
    return "Potato"

@ua.register_implementation(myfunc, backend=be2)
def myfunc_be2(): # Note that it has exactly the same signature
    return "Strawberry"
```

``` python
with ua.set_backend(be1):
    print(myfunc())
```

    Potato

``` python
with ua.set_backend(be2):
    print(myfunc())
```

    Strawberry

As we can clearly see, we have already provided a way to do (1) and (2)
above. But then we run into a problem: How do we decide between these
backends? How do we move between them? Let's go ahead and register both
of these backends for permanent use, and see what happens when we call
the method with both of their implementations available!

``` python
ua.register_backend(be1)
ua.register_backend(be2)
```

``` python
print(myfunc())
```

    Potato

As we see, we get only the first backend's answer. In general, it's
indeterminate which backend will be selected. But this is a special
case: we're not passing any arguments in! What if we change one of the
implementations to return `NotImplemented`?

``` python
# We redefine the multimethod so it's new again
@ua.create_multimethod(myfunc_rd)
def myfunc():
    return ()

# Now let's redefine the two backends!
be1 = ua.Backend()
be2 = ua.Backend()

# And register their implementations for the method!
@ua.register_implementation(myfunc, backend=be1)
def myfunc_be1(): # Note that it has exactly the same signature
    return NotImplemented

@ua.register_implementation(myfunc, backend=be2)
def myfunc_be2(): # Note that it has exactly the same signature
    return "Strawberry"

ua.register_backend(be1)
ua.register_backend(be2)
```

``` python
with ua.set_backend(be1):
    print(myfunc())
```

    Strawberry

Wait... What? Didn't we just set the first `Backend`? Ah, but you
see... it's signalling that it has *no* implementation for `myfunc`.
The same would happen if you simply didn't register one. To force a
`Backend`, we must use `only=True` or `coerce=True`; the difference will
be explained in just a moment.

``` python
with ua.set_backend(be1, only=True):
    print(myfunc())
```

    ---------------------------------------------------------------------------
    BackendNotImplementedError                Traceback (most recent call last)
    <ipython-input-8-ec856cf7c88b> in <module>
          1 with ua.set_backend(be1, only=True):
    ----> 2     print(myfunc())

    ~/Quansight/uarray/uarray/backend.py in __call__(self, *args, **kwargs)
        108
        109         if result is NotImplemented:
    --> 110             raise BackendNotImplementedError('No selected backends had an implementation for this method.')
        111
        112         return result

    BackendNotImplementedError: No selected backends had an implementation for this method.

Now we are told that no backends had an implementation for this function
(which is nice; good error messages are nice!).

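
The fall-through behaviour seen in this section can be modelled in a few
lines of plain Python. This is only a toy sketch of the idea, not
`uarray`'s actual implementation: `toy_dispatch`,
`ToyBackendNotImplementedError` and the `preferred` parameter are
hypothetical names made up for illustration, and backends are reduced to
dictionary keys.

``` python
# Toy model only: these names are hypothetical and are not uarray's API.
class ToyBackendNotImplementedError(NotImplementedError):
    pass


def toy_dispatch(implementations, preferred=None, only=False):
    """Try each backend's implementation in order until one answers.

    implementations: dict mapping a backend name to a zero-argument callable.
    preferred: backend name to try first (the analogue of set_backend).
    only: if True, try the preferred backend and nothing else.
    """
    order = []
    if preferred is not None:
        order.append(preferred)
    if not only:
        order.extend(name for name in implementations if name != preferred)

    for name in order:
        impl = implementations.get(name)
        if impl is None:
            continue  # same effect as never registering an implementation
        result = impl()
        if result is not NotImplemented:
            return result
    raise ToyBackendNotImplementedError(
        "No selected backends had an implementation for this method."
    )


impls = {"be1": lambda: NotImplemented, "be2": lambda: "Strawberry"}

print(toy_dispatch(impls, preferred="be1"))  # be1 declines, be2 answers
try:
    toy_dispatch(impls, preferred="be1", only=True)
except ToyBackendNotImplementedError as exc:
    print("error:", exc)
```

In this model, returning `NotImplemented` is a polite "not me, ask the
next one", and `only=True` simply removes every other backend from the
list of candidates.
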

## Coercion and passing between backends

Let's say we had two `Backend`s. Let's choose the completely useless
example of one storing a number as an `int` and the other as a `str`.

``` python
class Number(ua.DispatchableInstance):
    pass

def myfunc_rd(args, kwargs, dispatchable_args):
    # Here, we're "replacing" the dispatchable args with the ones supplied.
    # In general, this may be more complex, like inserting them in between
    # other args and kwargs.
    return dispatchable_args, kwargs

@ua.create_multimethod(myfunc_rd)
def myfunc(a):
    # Here, we're marking a as a Number, and saying that "we want to dispatch/convert over this"
    # We return as a tuple as there may be more dispatchable arguments
    return (Number(a),)

Number.register_convertor(be1, lambda x: int(x))
Number.register_convertor(be2, lambda x: str(x))
```

Let's also define a "catch-all" method. It handles every method for
which the backend has no specific implementation registered.

``` python
# This can be arbitrarily complex
def gen_impl1(method, args, kwargs, dispatchable_args):
    if not all(isinstance(a, Number) and isinstance(a.value, int) for a in dispatchable_args):
        return NotImplemented

    return args[0]

# This can be arbitrarily complex
def gen_impl2(method, args, kwargs, dispatchable_args):
    if not all(isinstance(a, Number) and isinstance(a.value, str) for a in dispatchable_args):
        return NotImplemented

    return args[0]

be1.register_implementation(None, gen_impl1)
be2.register_implementation(None, gen_impl2)
```
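
One way to picture a catch-all registered under `None` is as a fallback
entry in a per-backend lookup table. The sketch below is a simplified
model under that assumption, not how `uarray` stores implementations:
`ToyBackend` and its `call` method are hypothetical, and methods are
identified by plain strings.

``` python
# Toy model only: ToyBackend is hypothetical, not uarray's Backend class.
class ToyBackend:
    def __init__(self):
        self._impls = {}

    def register_implementation(self, method, impl):
        # Registering under None installs the catch-all.
        self._impls[method] = impl

    def call(self, method, *args, **kwargs):
        # A specific implementation always wins over the catch-all.
        specific = self._impls.get(method) if method is not None else None
        if specific is not None:
            return specific(*args, **kwargs)
        generic = self._impls.get(None)
        if generic is None:
            return NotImplemented
        # The catch-all also receives the method, so it can decide per method.
        return generic(method, args, kwargs)


backend = ToyBackend()
backend.register_implementation("add", lambda a, b: a + b)
backend.register_implementation(
    None, lambda method, args, kwargs: f"generic {method}{args}"
)

print(backend.call("add", 1, 2))  # handled by the specific implementation
print(backend.call("mul", 3, 4))  # handled by the catch-all
```

Passing the method itself to the catch-all is what lets one function
serve many methods, which is why `gen_impl1` and `gen_impl2` take
`method` as their first parameter.
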

``` python
myfunc('1') # This calls the second implementation
```

    '1'

``` python
myfunc(1) # This calls the first implementation
```

    1

``` python
myfunc(1.0) # This fails
```

    ---------------------------------------------------------------------------
    BackendNotImplementedError                Traceback (most recent call last)
    <ipython-input-13-8431c1275db5> in <module>
    ----> 1 myfunc(1.0) # This fails

    ~/Quansight/uarray/uarray/backend.py in __call__(self, *args, **kwargs)
        108
        109         if result is NotImplemented:
    --> 110             raise BackendNotImplementedError('No selected backends had an implementation for this method.')
        111
        112         return result

    BackendNotImplementedError: No selected backends had an implementation for this method.

``` python
# But works if we do this:

with ua.set_backend(be1, coerce=True):
    print(type(myfunc(1.0)))

with ua.set_backend(be2, coerce=True):
    print(type(myfunc(1.0)))
```

    <class 'int'>
    <class 'str'>

This may seem like too much work, but remember that it's broken down
into a lot of small steps:

1. Extract the dispatchable arguments.
2. Realise the types of the dispatchable arguments.
3. Convert them.
4. Place them back into args/kwargs.
5. Call the right function.

Note that `only=True` does not coerce; it just enforces the backend
strictly.

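
The five steps listed above can be traced in a small pure-Python model.
It is a sketch under simplifying assumptions rather than `uarray`'s real
machinery: `Marked`, `extract`, `replace`, `CONVERTORS` and
`call_with_coercion` are all hypothetical names, backends are plain
strings, and the convertors mirror the `int`/`str` example from this
section.

``` python
# Toy model only: every name here is hypothetical, not uarray's API.
class Marked:
    """Wraps a value together with the abstract kind we dispatch over."""

    def __init__(self, kind, value):
        self.kind = kind
        self.value = value


def extract(a):
    # Step 1: pull out the dispatchable arguments and mark them.
    return (Marked("Number", a),)


def replace(args, kwargs, dispatchable_args):
    # Step 4: put the (converted) dispatchable arguments back.
    return dispatchable_args, kwargs


# One convertor per (backend, kind) pair.
CONVERTORS = {
    ("be1", "Number"): int,
    ("be2", "Number"): str,
}


def call_with_coercion(backend, impl, *args, **kwargs):
    marked = extract(*args, **kwargs)  # step 1
    converted = tuple(
        CONVERTORS[(backend, m.kind)](m.value)  # steps 2 and 3
        for m in marked
    )
    new_args, new_kwargs = replace(args, kwargs, converted)  # step 4
    return impl(*new_args, **new_kwargs)  # step 5


def identity(a):
    return a


print(type(call_with_coercion("be1", identity, 1.0)))
print(type(call_with_coercion("be2", identity, 1.0)))
```

Keeping extraction and replacement as separate functions is what allows
the same conversion logic to be reused for any signature: only those two
small functions need to know where the dispatchable arguments live.
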

With this, we have solved problem (3). What remains is the grunt-work of
actually retrofitting the NumPy API into `unumpy` and extracting the
right values from it.

## How To Use It Today

`unumpy` is a set of NumPy-related multimethods built on top of
`uarray`. You can use them as follows:

``` python
import uarray as ua
import unumpy as np # Note the changed import statement
from unumpy.xnd_backend import XndBackend

with ua.set_backend(XndBackend):
    print(type(np.arange(0, 100, 1)))
```

    <class 'xnd.array'>

And, as you can see, we get back an Xnd array when using a NumPy-like
API. Currently, there are three backends: NumPy, Xnd and PyTorch. The
NumPy and Xnd backends have feature parity, while the PyTorch backend is
still being worked on.

We are also working on supporting more of the NumPy API, and dispatching
over dtypes.

Feel free to browse the source and open issues at
<https://github.com/Quansight-Labs/uarray>, or shoot me an email at
<a href="mailto:[email protected]">[email protected]</a>
if you want to contact me directly. You can also find the full
documentation at <https://uarray.readthedocs.io/en/latest/>.

apps/labs/public/posts/uarray-intro/blog_hero_var1.svg

Lines changed: 1 addition & 0 deletions

apps/labs/public/posts/uarray-intro/blog_hero_var2.svg

Lines changed: 1 addition & 0 deletions