Commit 5a04389

Merge branch 'main' of github.com:scikit-learn/blog into main
2 parents 6e92950 + c248dc6 commit 5a04389

2 files changed: +17 -13 lines changed

_posts/2022-05-18-sprints-value.md renamed to _posts/2022-07-13-sprints-value.md

Lines changed: 17 additions & 13 deletions
@@ -1,13 +1,13 @@
 ---
 title: "The Value of Open Source Sprints, the scikit-learn Experience"
-date: May 18, 2022
+date: July 13, 2022
 categories:
 - Events
 tags:
 - Open Source
 - Sprints
 - Community
-featured-image: sprints-value.png
+featured-image: sprints-value2.png
 
 postauthors:
 - name: Reshama Shaikh
@@ -30,9 +30,9 @@ Sprints are **working sessions to contribute to an open source library**. The go
 
 The [scikit-learn](https://scikit-learn.org/dev/index.html) project has a long and extraordinary legacy of open source sprints. Since 2010, when its [first public version](https://en.wikipedia.org/wiki/Scikit-learn) was released, there have been as many as [45 sprints organized](https://blog.scikit-learn.org/sprints/). The number 45 is a lower bound, since there are likely more sprints that have not been listed.
 
-To date, more than 2300 people have contributed to [scikit-learn](https://github.com/scikit-learn/scikit-learn). The number of contributors to scikit-learn exceeds those of other related libraries such as numpy, scipy and matplotlib, with the exception of the [pandas](https://github.com/pandas-dev/pandas), which has a greater number of contributors (See Appendix A).
+To date, more than 2400 people have contributed to [scikit-learn](https://github.com/scikit-learn/scikit-learn). The number of contributors to scikit-learn exceeds those of other related libraries such as numpy, scipy and matplotlib, with the exception of the [pandas](https://github.com/pandas-dev/pandas), which has a greater number of contributors (See Appendix A).
 
-The public discourse on open source has expanded to explore topics of sustainability, funding models, and diversity and inclusion, to name a few. A *reasonable*, yet difficult to answer question that has been posed is:
+The public discourse on open source has expanded to explore topics of sustainability, funding models, and diversity and inclusion, to name a few. A *reasonable*, yet *difficult to answer* question that has been posed is:
 >*<span style="background-color: #CAE9F5;">
 What is the effectiveness of sprint models and what is the long-term engagement as a result of these sprints?
 </span>*
@@ -144,6 +144,8 @@ There are [other maintainers](https://scikit-learn.org/dev/about.html#people) an
 
 In her PyConDE PyData Berlin keynote from April 2022, [5 Years, 10 Sprints, a scikit-learn Open Source Journey](https://blog.dataumbrella.org/pyconde-keynote-reshama), she shares a history and progression of the Community sprints.
 
+<iframe width="560" height="315" src="https://www.youtube.com/embed/ZUqJaCWPvmk" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>
+
 ### Juan Martín Loyola
 [Juan Martín Loyola](https://github.com/jmloyola) started [contributing to scikit-learn](https://blog.scikit-learn.org/team/jml-interview/) as preparation for the [Data Umbrella Latin America, June 2021](https://blog.dataumbrella.org/data-umbrella-afme2-2021-scikit-learn-sprint-report ) sprint. He continued to contribute prolifically after the sprint, and he was invited to join the team in December 2021. Given his location in Argentina, he will be providing support at the [2022 SciPy Latin America](https://www.scipy.lat/es/scipycon.html) sprint.
 
@@ -189,14 +191,14 @@ The sprints are a forum for users to gain a greater understanding of how an open
 
 **Value of synchronous interaction**
 
-Typically, open source contributions to scikit-learn occur on the GitHub repository in asynchronous fashion. The sprints provide real-time synchronous interaction. This experience provides more direct access to technical assistance and feedback to the contributor, and in a direct, efficient, and time-saving manner.
+Typically, open source contributions to scikit-learn occur on the GitHub repository in asynchronous fashion, over several weeks or months. The sprints provide real-time synchronous interaction. This experience provides more direct access to technical assistance and feedback to the contributor, which is more efficient and engaging.
 
 Julien shares:
 >I think having a setup like this [beginner/community sprint] is valuable for first time contributors because they can synchronously get specific information they would hardly have got otherwise. To me, this allow giving feedback which is immediate, specific and exact, making contributing to open-source enjoyable and preventing frustration: giving such feedback is what we should aim for and in this regard this setup is convenient.
 
 ### Online Sprints
 
-Since the start of the pandemic, Data Umbrella organized [4 online sprints](https://blog.dataumbrella.org/tags/#sprint-report). Additionally, there were 2 online sprints with [SciPy](https://www.scipy2020.scipy.org/sprints-schedule) and [EuroPython](https://wiki.python.org/moin/EuroPython2020/Sprints).
+Since the start of the pandemic, Data Umbrella has organized [4 online sprints](https://blog.dataumbrella.org/tags/#sprint-report). Additionally, there were 2 online sprints with [SciPy](https://www.scipy2020.scipy.org/sprints-schedule) and [EuroPython](https://wiki.python.org/moin/EuroPython2020/Sprints).
 
 These have been the observed benefits of the online sprints, which began in 2020 due to the global pandemic:
 
@@ -223,9 +225,11 @@ For the scikit-learn project itself, it made it possible to "recruit" a couple o
 
 **Office Hours**
 
-Actually the fact that we now have community office hours on Discord is probably a consequence of us attending the Data Umbrella online sprints.
+The scikit-learn project has regular office hours which are hosted on Discord.
 
 Olivier shares:
+>Actually the fact that we now have community office hours on Discord is probably a consequence of us attending the Data Umbrella online sprints.
+
 >I think they [the sprints] were the most interesting online events I attended during
 the COVID-19 crisis when all traditional on-site tech events were canceled. In particular the active planning by the Data Umbrella team for participants to work in pairs with audio rooms on Discord + a central help desk audio room worked really well.
 
@@ -260,7 +264,7 @@ Onboarding a first-time contributor takes time. People who are contributing for
 setup and experience, might get frustrated and or discouraged and might not
 report the problem they are having (thinking it is their fault). Pre-event office hours have been successful at alleviating some of these roadblocks, for those sprint participants who have completed their pre-work.
 
-Here are some adjustments that can be made in the future to reach the goal of recruiting recurring contributors:
+Here are some adjustments that can be made in the future to reach the goal of recruiting recurring contributors:
 - Provide mentoring
 - Improve onboarding process
 - Improve issues definitions
@@ -333,12 +337,12 @@ There are additional resources for contributing:
 
 ## Appendix A: GitHub Contributors Comparison of Libraries
 
-A comparison of the contributor base to other related libraries in the same space (May 2022):
-- [pandas](https://github.com/pandas-dev/pandas): ~2560
-- [scikit-learn](https://github.com/scikit-learn/scikit-learn): ~ 2300 contributors
-- [numpy](https://github.com/numpy/numpy): ~ 1300 contributors
+A comparison of the contributor base to other related libraries in the same space (updated July 2022):
+- [pandas](https://github.com/pandas-dev/pandas): ~2600
+- [scikit-learn](https://github.com/scikit-learn/scikit-learn): ~2400 contributors
+- [numpy](https://github.com/numpy/numpy): ~1300 contributors
 - [matplotlib](https://github.com/matplotlib/matplotlib): ~1150
-- [scipy](https://github.com/scipy/scipy): ~1120
+- [scipy](https://github.com/scipy/scipy): ~1170
 
 ## References
 