| 1 | +--- |
| 2 | +title: 'Labs update and April highlights' |
| 3 | +published: May 3, 2019 |
| 4 | +author: ralf-gommers |
| 5 | +description: "It has been an exciting first month for me at Quansight Labs. It's a good time for a summary of what we worked on in April and what is coming next." |
| 6 | +category: [Funding, PyData ecosystem] |
| 7 | +featuredImage: |
| 8 | + src: /posts/labs-status-update-2019-04/blog_feature_org.svg |
| 9 | + alt: 'An illustration of a brown and a white hand coming towards each other to pass a business card with the logo of Quansight Labs' |
| 10 | +hero: |
| 11 | + imageSrc: /posts/labs-status-update-2019-04/blog_hero_org.svg |
| 12 | + imageAlt: 'An illustration of a brown hand holding up a microphone, with some graphical elements highlighting the top of the microphone.' |
| 13 | +--- |
| 14 | + |
| 15 | +It has been an exciting first month for me at Quansight Labs. It's a good time |
| 16 | +for a summary of what we worked on in April and what is coming next. |
| 17 | + |
| 18 | +## Progress on array computing libraries |
| 19 | + |
| 20 | +Our first bucket of activities I'd call "innovation". The most prominent |
| 21 | +projects in this bucket are [XND](https://xnd.io/), |
| 22 | +[uarray](https://uarray.readthedocs.io/en/latest/), |
| 23 | +[metadsl](https://github.com/Quansight-Labs/metadsl), |
| 24 | +[python-moa](https://github.com/Quansight-Labs/python-moa), |
| 25 | +[Remote Backend Compiler](https://github.com/xnd-project/rbc) and |
| 26 | +[arrayviews](https://github.com/xnd-project/arrayviews). |
| 27 | +XND is an umbrella name for a set of related array |
| 28 | +computing libraries: `xnd`, `ndtypes`, `gumath`, and `xndtools`. |
| 29 | + |
| 30 | +Hameer Abbasi made some major steps forward with `uarray`: the backend and |
| 31 | +coercion semantics are now largely worked out, there is |
| 32 | +good [documentation](https://uarray.readthedocs.io/en/latest/), and the |
| 33 | +`unumpy` package (which currently has `numpy`, `XND` and `PyTorch` backends) |
| 34 | +is progressing well. This [blog post](https://labs.quansight.org/blog/2019/04/uarray-intro/) |
| 35 | +gives a good overview of the motivation for `uarray` and its main concepts. |
| 36 | + |
| 37 | +Saul Shanabrook and Chris Ostrouchov worked out how best to put `metadsl` |
| 38 | +and `python-moa` together: `metadsl` can be used to create the API for |
| 39 | +`python-moa` to simplify the code base of the latter a lot. Chris |
| 40 | +also wrote an interesting [blog post](https://labs.quansight.org/blog/2019/04/python-moa-tensor-compiler/) |
| 41 | +explaining the MoA principles. |
| 42 | + |
| 43 | +The work on XND over the last month consisted mostly of "under the hood" |
| 44 | +improvements and fixes in `xnd` and `ndtypes` by Stefan Krah. We did create |
| 45 | +a new [xnd-benchmarks](https://github.com/xnd-project/xnd-benchmarks) repository |
| 46 | +and had some interesting discussions on performance. One thing I learned is that |
| 47 | +XND has automatic multithreading and has very similar performance to NumPy + MKL |
| 48 | +for basic arithmetic operations (at least for array sizes above ~1e4 elements; the
| 49 | +overhead for small arrays is larger). The `xnd.array` interface, which is a higher |
| 50 | +level interface than `xnd.xnd` and can be used similarly to `numpy`, is taking |
| 51 | +shape as well. One user-visible new feature worth mentioning is that xnd containers |
| 52 | +can now be serialized and pickled. |
| 53 | + |
| 54 | + |
| 55 | +## Work on PyData core projects |
| 56 | + |
| 57 | +Most people in the team are maintainers of or contributors to one or more core |
| 58 | +projects in the PyData or SciPy stacks. Helping maintain and evolve those |
| 59 | +projects is our second bucket of activities. |
| 60 | + |
| 61 | +Aaron Meurer did a lot of work on [SymPy](https://www.sympy.org), both |
| 62 | +maintenance on the SymPy internals and managing the SymPy 1.4 release. He |
| 63 | +wrote a nice blog post on the highlights in that release |
| 64 | +[here](http://labs.quansight.org/blog/2019/04/whats-new-in-sympy-14/). |
| 65 | + |
| 66 | +Gonzalo Pena-Castellanos is working full-time on [Spyder](https://www.spyder-ide.org/), |
| 67 | +with guidance from Carlos Cordoba. Together they have been working very hard to get |
| 68 | +the first beta of Spyder 4 ready. Some exciting new features are also in the |
| 69 | +works, but Gonzalo will be blogging about those soon, so I won't steal his
| 70 | +thunder. |
| 71 | + |
| 72 | +Ivan Ogasawara is spending some time each week on maintenance of |
| 73 | +[Ibis](https://docs.ibis-project.org/). If you're a Pandas or scikit-learn user |
| 74 | +and need to interact with SQL databases or HDFS/Spark, Ibis is worth looking into. |
| 75 | + |
| 76 | +I myself have enjoyed having a little more bandwidth for NumPy and SciPy. |
| 77 | +On the technical front, this allowed me to contribute to the design discussion |
| 78 | +about an [addition](https://mail.python.org/pipermail/numpy-discussion/2019-April/079317.html) |
| 79 | +to NEP 18 (the `__array_function__` override mechanism), |
| 80 | +do the [numpydoc](https://github.com/numpy/numpydoc) 0.9 release, deal |
| 81 | +with several build issues, and review a number of PRs |
| 82 | +(the one that allows
| 83 | +specifying the [BLAS and LAPACK link order](https://github.com/numpy/numpy/pull/13132)
| 84 | +was particularly nice). On the organizational front, I fixed the description |
| 85 | +of how donations are handled on numpy.org, finalized the |
| 86 | +[Tidelift](https://tidelift.com/) agreement for NumPy (see the |
| 87 | +[announcement](https://mail.python.org/pipermail/numpy-discussion/2019-April/079370.html) |
| 88 | +for details), helped NumPy and SciPy get accepted for the |
| 89 | +[Google Season of Docs](https://developers.google.com/season-of-docs/) program, |
| 90 | +and did everything needed to finalize the fiscal sponsorship agreement between |
| 91 | +SciPy and NumFOCUS. |
| 92 | + |
| 93 | +## Jupyter and JupyterLab improvements |
| 94 | + |
| 95 | +Jupyter is a key part of the PyData ecosystem. It extends well beyond that though, so I'm |
| 96 | +giving it its own bucket here. At Quansight we have a number of Jupyter core developers |
| 97 | +and contributors. Ian Rose, Saul Shanabrook, Grant Nestor and others have been very busy |
| 98 | +with both maintenance tasks and adding new features to Jupyter and JupyterLab. |
| 99 | + |
| 100 | +JupyterLab is about to get support for printing (not inside the notebook, but the old-fashioned |
| 101 | +`Ctrl-P` variant). [This pull request](https://github.com/jupyterlab/jupyterlab/pull/5850) |
| 102 | +by Saul has nice screenshots showing the feature in action for whole notebooks, |
| 103 | +images, the JSON viewer and the inspector.
| 104 | + |
| 105 | +Ian worked on the third alpha release of JupyterLab 1.0, on testing and CI infrastructure, |
| 106 | +and other general maintenance tasks. He also improved PDF preview in JupyterLab, so it |
| 107 | +now [works as expected](https://github.com/jupyterlab/jupyterlab/pull/6264) in Firefox |
| 108 | +and Chrome (at least). |
| 109 | + |
| 110 | +Saul added support for the [nteract Data Explorer](https://data-explorer.nteract.io/) to the JupyterLab data registry as a plugin. |
| 111 | +[This pull request](https://github.com/jupyterlab/jupyterlab-data-explorer/pull/10) shows it |
| 112 | +in action on a pandas DataFrame. |
| 113 | + |
| 114 | +Other interesting features are in progress and will make their way into the main |
| 115 | +repositories soon. |
| 116 | + |
| 117 | +## Starting to shape Labs |
| 118 | + |
| 119 | +There is a lot of work to do to figure out for ourselves exactly what Labs |
| 120 | +will be, and then to communicate that clearly to the outside world. We have |
| 121 | +a rough idea (see [my first blog post](https://labs.quansight.org/blog/2019/04/joining-labs/) |
| 122 | +and [Travis' blog post](https://www.quansight.com/single-post/2019/04/02/Welcoming-Ralf-Gommers-as-Director-of-Quansight-Labs)), but there's a long way |
| 123 | +to go from there to having a compelling elevator pitch, a website that tells
| 124 | +our story well, people and projects organized, a funding stream, and more. |
| 125 | + |
| 126 | +One of the first things we did was start this blog, to begin communicating
| 127 | +about the technical work we're doing. We're also going through the roadmaps
| 128 | +to ensure they're up to date and to make clear that those are for _community-driven
| 129 | +projects_ that Quansight is aiming to obtain industry support for.
| 130 | + |
| 131 | +## Funding |
| 132 | + |
| 133 | +We have reached about 20% of our funding goal for 2019 so far, primarily with contributions
| 134 | +from [DE Shaw](https://www.deshaw.com/), [OmniSci](https://www.omnisci.com/) and |
| 135 | +[TDK](https://www.tdk.com/). |
| 136 | + |
| 137 | +Both DE Shaw and OmniSci are supporting a significant amount of work on |
| 138 | +JupyterLab, which highlights how important Jupyter and JupyterLab have become |
| 139 | +in the data science ecosystem. DE Shaw is also supporting work on projects |
| 140 | +like Dask, Numba and XND that is starting at the moment. OmniSci supports work |
| 141 | +on Ibis and Remote Backend Compiler. Finally, Quansight is working with Cal Poly |
| 142 | +(one of the [Jupyter lead institutes](https://calpolynews.calpoly.edu/news_releases/2018/May/Jupyter), |
| 143 | +together with UC Berkeley) to execute on the Project Jupyter roadmap for JupyterLab. |
| 144 | + |
| 145 | +TDK is sponsoring the Spyder work I talked about above. Supporting both general |
| 146 | +maintenance for the Spyder 4 release and some interesting new features is an |
| 147 | +important contribution that helps the many engineers and scientists who use
| 148 | +Spyder as their main development and data science interface. |
| 149 | + |
| 150 | +The above is direct funding from companies for work on open source projects. |
| 151 | +Quansight also offers open-source support and consulting, as well as training |
| 152 | +around the PyData stack. Those activities also yield funds that we then use to |
| 153 | +fund the efforts of Quansight Labs. To learn more about those offerings, |
| 154 | +contact Travis ( `[email protected]`), myself ( `[email protected]`) or |
| 155 | + |
| 156 | + |
| 157 | +Besides funding from companies, we are also applying for grants. So far we have |
| 158 | +submitted two proposals to the NSF and three to NASA, on topics ranging from |
| 159 | +JupyterLab extensions for high performance computing to improving Xarray's array |
| 160 | +backend system. For most of these proposals we expect the verdict in the next |
| 161 | +1-2 months. In April we got a rejection from the NSF for a |
| 162 | +[proposal](https://figshare.com/articles/Mid-Scale_Research_Infrastructure_-_The_Scientific_Python_Ecosystem/8009441) |
| 163 | +titled "Accelerated Development of the Scientific Python Ecosystem", which we |
| 164 | +wrote together with NumFOCUS and Columbia, with the latter as lead |
| 165 | +institute (thanks go especially to Andreas Mueller and Andy Terrel for a lot
| 166 | +of the hard work on that proposal). The discussions triggered by that |
| 167 | +rejection have been very useful and generated a number of new ideas and |
| 168 | +contacts to follow up on in the coming months. |
| 169 | + |
| 170 | +One idea that came up more than once is to clearly express the needs of these |
| 171 | +projects in public, ideally in fundable chunks and with an effort estimate attached, |
| 172 | +and then approach both funding bodies and companies with that. This is likely
| 173 | +to be more effective than responding to solicitations that may not be a perfect |
| 174 | +match. Quansight Labs is positioned well to either participate in or help lead such |
| 175 | +a process, and to work with companies that rely on the PyData stack in particular. |
| 176 | + |
| 177 | +However we look for funding, it will be important to be clear in our messaging |
| 178 | +and transparent with the community about the ways we look for funding. I will be |
| 179 | +actively soliciting feedback on this as well, both via blog posts like this one
| 180 | +(please email me at `[email protected]` if you have ideas, questions or |
| 181 | +concerns!) and in person. |
| 182 | + |
| 183 | +Finally, we are finalizing and signing a preferred partnership with NumFOCUS, |
| 184 | +where 5% of Quansight Labs funds or projects referred from NumFOCUS will be |
| 185 | +provided to NumFOCUS to sustain their efforts. NumFOCUS is an important foundation
| 186 | +of the PyData ecosystem, and we would like to contribute to keeping it on a sound
| 187 | +financial footing and to helping it grow further.