Commit fe8dbdf

MarsBarLee and noatamir authored
[BLOG] Add labs-status-update (#552)
Co-authored-by: Noa Tamir <[email protected]>
1 parent 2b2e112 commit fe8dbdf

3 files changed

+189
-0
lines changed

Lines changed: 187 additions & 0 deletions
@@ -0,0 +1,187 @@
---
title: 'Labs update and April highlights'
published: May 3, 2019
author: ralf-gommers
description: "It has been an exciting first month for me at Quansight Labs. It's a good time for a summary of what we worked on in April and what is coming next."
category: [Funding, PyData ecosystem]
featuredImage:
  src: /posts/labs-status-update-2019-04/blog_feature_org.svg
  alt: 'An illustration of a brown and a white hand coming towards each other to pass a business card with the logo of Quansight Labs'
hero:
  imageSrc: /posts/labs-status-update-2019-04/blog_hero_org.svg
  imageAlt: 'An illustration of a brown hand holding up a microphone, with some graphical elements highlighting the top of the microphone.'
---

It has been an exciting first month for me at Quansight Labs. It's a good time
for a summary of what we worked on in April and what is coming next.

## Progress on array computing libraries

Our first bucket of activities I'd call "innovation". The most prominent
projects in this bucket are [XND](https://xnd.io/),
[uarray](https://uarray.readthedocs.io/en/latest/),
[metadsl](https://github.com/Quansight-Labs/metadsl),
[python-moa](https://github.com/Quansight-Labs/python-moa),
[Remote Backend Compiler](https://github.com/xnd-project/rbc) and
[arrayviews](https://github.com/xnd-project/arrayviews).
XND is an umbrella name for a set of related array
computing libraries: `xnd`, `ndtypes`, `gumath`, and `xndtools`.

Hameer Abbasi made some major steps forward with `uarray`: the backend and
coercion semantics are now largely worked out, there is
good [documentation](https://uarray.readthedocs.io/en/latest/), and the
`unumpy` package (which currently has `numpy`, `XND` and `PyTorch` backends)
is progressing well. This [blog post](https://labs.quansight.org/blog/2019/04/uarray-intro/)
gives a good overview of the motivation for `uarray` and its main concepts.

Saul Shanabrook and Chris Ostrouchov worked out how best to put `metadsl`
and `python-moa` together: `metadsl` can be used to create the API for
`python-moa`, which simplifies the code base of the latter a lot. Chris
also wrote an interesting [blog post](https://labs.quansight.org/blog/2019/04/python-moa-tensor-compiler/)
explaining the MoA principles.

The work on XND over the last month consisted mostly of "under the hood"
improvements and fixes in `xnd` and `ndtypes` by Stefan Krah. We did create
a new [xnd-benchmarks](https://github.com/xnd-project/xnd-benchmarks) repository
and had some interesting discussions on performance. One thing I learned is that
XND has automatic multithreading and has very similar performance to NumPy + MKL
for basic arithmetic operations (at least for array sizes above ~1e4 elements; the
overhead for small arrays is larger). The `xnd.array` interface, which is a higher
level interface than `xnd.xnd` and can be used similarly to `numpy`, is taking
shape as well. One user-visible new feature worth mentioning is that xnd containers
can now be serialized and pickled.

## Work on PyData core projects

Most people in the team are maintainers of or contributors to one or more core
projects in the PyData or SciPy stacks. Helping maintain and evolve those
projects is our second bucket of activities.

Aaron Meurer did a lot of work on [SymPy](https://www.sympy.org), both
maintenance on the SymPy internals and managing the SymPy 1.4 release. He
wrote a nice blog post on the highlights in that release
[here](http://labs.quansight.org/blog/2019/04/whats-new-in-sympy-14/).

Gonzalo Pena-Castellanos is working full-time on [Spyder](https://www.spyder-ide.org/),
with guidance from Carlos Cordoba. Together they have been working very hard to get
the first beta of Spyder 4 ready. Some exciting new features are also in the
works; however, Gonzalo will be blogging about those soon so I won't steal his
thunder.

Ivan Ogasawara is spending some time each week on maintenance of
[Ibis](https://docs.ibis-project.org/). If you're a Pandas or scikit-learn user
and need to interact with SQL databases or HDFS/Spark, Ibis is worth looking into.

I myself have enjoyed having a little more bandwidth for NumPy and SciPy.
On the technical front, this allowed me to contribute to the design discussion
about an [addition](https://mail.python.org/pipermail/numpy-discussion/2019-April/079317.html)
to NEP 18 (the `__array_function__` override mechanism),
do the [numpydoc](https://github.com/numpy/numpydoc) 0.9 release, deal
with several build issues, and review a number of PRs (the one that makes it
possible to specify the [BLAS and LAPACK link order](https://github.com/numpy/numpy/pull/13132)
was particularly nice). On the organizational front, I fixed the description
of how donations are handled on numpy.org, finalized the
[Tidelift](https://tidelift.com/) agreement for NumPy (see the
[announcement](https://mail.python.org/pipermail/numpy-discussion/2019-April/079370.html)
for details), helped NumPy and SciPy get accepted for the
[Google Season of Docs](https://developers.google.com/season-of-docs/) program,
and did everything needed to finalize the fiscal sponsorship agreement between
SciPy and NumFOCUS.
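
For readers who have not seen NEP 18 before, here is a minimal sketch of what
the `__array_function__` override mechanism does. It assumes NumPy 1.17 or
later, where the protocol is enabled by default. The `LoggedArray` class is a
hypothetical wrapper made up for this illustration; it is not part of NumPy or
of any project mentioned in this post, and it does not show the addition to
NEP 18 that was under discussion.

```python
import numpy as np


class LoggedArray:
    """Hypothetical wrapper type, used only to illustrate the protocol."""

    def __init__(self, data):
        self.data = np.asarray(data)
        self.calls = []  # names of the NumPy functions that were intercepted

    def __array_function__(self, func, types, args, kwargs):
        # NumPy calls this method instead of running its own implementation
        # whenever a LoggedArray is passed to a function that supports the
        # protocol. Here we record the call, unwrap positional arguments and
        # defer to NumPy. (Keyword arguments are passed through unchanged,
        # which is enough for this sketch.)
        self.calls.append(func.__name__)
        unwrapped = [a.data if isinstance(a, LoggedArray) else a for a in args]
        return func(*unwrapped, **kwargs)


x = LoggedArray([1.0, 2.0, 3.0])
total = np.sum(x)     # dispatched through LoggedArray.__array_function__
average = np.mean(x)  # same mechanism, different NumPy function
print(total, average, x.calls)
```

The point of the design is that libraries with their own array types can reuse
the NumPy API without NumPy having to know about them in advance.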

## Jupyter and JupyterLab improvements

Jupyter is a key part of the PyData ecosystem. It extends well beyond that though, so I'm
giving it its own bucket here. At Quansight we have a number of Jupyter core developers
and contributors. Ian Rose, Saul Shanabrook, Grant Nestor and others have been very busy
with both maintenance tasks and adding new features to Jupyter and JupyterLab.

JupyterLab is about to get support for printing (not inside the notebook, but the old-fashioned
`Ctrl-P` variant). [This pull request](https://github.com/jupyterlab/jupyterlab/pull/5850)
by Saul has nice screenshots showing the feature in action for whole notebooks,
images, the JSON viewer and the inspector.

Ian worked on the third alpha release of JupyterLab 1.0, on testing and CI infrastructure,
and other general maintenance tasks. He also improved PDF preview in JupyterLab, so it
now [works as expected](https://github.com/jupyterlab/jupyterlab/pull/6264) in Firefox
and Chrome (at least).

Saul added support for the [nteract Data Explorer](https://data-explorer.nteract.io/) to the JupyterLab data registry as a plugin.
[This pull request](https://github.com/jupyterlab/jupyterlab-data-explorer/pull/10) shows it
in action on a pandas DataFrame.

Other interesting features are in progress and will make their way into the main
repositories soon.

## Starting to shape Labs

There is a lot of work to do to figure out for ourselves exactly what Labs
will be, and then to communicate that clearly to the outside world. We have
a rough idea (see [my first blog post](https://labs.quansight.org/blog/2019/04/joining-labs/)
and [Travis' blog post](https://www.quansight.com/single-post/2019/04/02/Welcoming-Ralf-Gommers-as-Director-of-Quansight-Labs)), but there's a long way
to go from there to having a compelling elevator pitch, a website that tells
our story well, people and projects organized, a funding stream, and more.

One of the first things we did was start this blog, to begin communicating
about the technical work we're doing. We're also going through the roadmaps
to ensure they're up-to-date and to make clear that those are for _community
driven projects_ that Quansight is aiming to obtain industry support for.

## Funding

We have reached about 20% of our funding goal for 2019 so far, primarily with contributions
from [DE Shaw](https://www.deshaw.com/), [OmniSci](https://www.omnisci.com/) and
[TDK](https://www.tdk.com/).

Both DE Shaw and OmniSci are supporting a significant amount of work on
JupyterLab, which highlights how important Jupyter and JupyterLab have become
in the data science ecosystem. DE Shaw is also supporting work, starting at
the moment, on projects like Dask, Numba and XND. OmniSci supports work
on Ibis and Remote Backend Compiler. Finally, Quansight is working with Cal Poly
(one of the [Jupyter lead institutes](https://calpolynews.calpoly.edu/news_releases/2018/May/Jupyter),
together with UC Berkeley) to execute on the Project Jupyter roadmap for JupyterLab.

TDK is sponsoring the Spyder work I talked about above. Supporting both general
maintenance for the Spyder 4 release and some interesting new features is an
important contribution that helps the many engineers and scientists who use
Spyder as their main development and data science interface.

The above is direct funding from companies for work on open source projects.
Quansight also offers open-source support and consulting, as well as training
around the PyData stack. Those activities also yield funds that we then use to
fund the efforts of Quansight Labs. To learn more about those offerings,
contact Travis (`[email protected]`), myself (`[email protected]`) or

Besides funding from companies, we are also applying for grants. So far we have
submitted two proposals to the NSF and three to NASA, on topics ranging from
JupyterLab extensions for high performance computing to improving Xarray's array
backend system. For most of these proposals we expect the verdict in the next
1-2 months. In April we got a rejection from the NSF for a
[proposal](https://figshare.com/articles/Mid-Scale_Research_Infrastructure_-_The_Scientific_Python_Ecosystem/8009441)
titled "Accelerated Development of the Scientific Python Ecosystem", which we
wrote together with NumFOCUS and Columbia, with the latter as lead
institute (thanks go especially to Andreas Mueller and Andy Terrel for a lot
of the hard work on that proposal). The discussions triggered by that
rejection have been very useful and generated a number of new ideas and
contacts to follow up on in the coming months.

One idea that came up more than once is to clearly express the needs of these
projects in public, ideally in fundable chunks and with an effort estimate attached,
and then to approach both funding bodies and companies with that. This is likely
to be more effective than responding to solicitations that may not be a perfect
match. Quansight Labs is positioned well to either participate in or help lead such
a process, and to work with companies that rely on the PyData stack in particular.

However we look for funding, it will be important to be clear in our messaging
and transparent with the community about the ways we look for funding. I will be
actively soliciting feedback on this as well, both via blog posts like this one
(please email me at `[email protected]` if you have ideas, questions or
concerns!) and in person.

Lastly, we are finalizing and signing a preferred partnership with NumFOCUS,
under which 5% of Quansight Labs funds or projects referred from NumFOCUS will be
provided to NumFOCUS to sustain their efforts. NumFOCUS is an important foundation
of the PyData ecosystem, and we would like to contribute to keeping it on a sound
financial footing and to helping it grow further.

apps/labs/public/posts/labs-status-update-2019-04/blog_feature_org.svg

Lines changed: 1 addition & 0 deletions

apps/labs/public/posts/labs-status-update-2019-04/blog_hero_org.svg

Lines changed: 1 addition & 0 deletions
