_posts/2022-07-13-sprints-value.md (17 additions & 13 deletions)
@@ -1,13 +1,13 @@
 ---
 title: "The Value of Open Source Sprints, the scikit-learn Experience"
-date: May 18, 2022
+date: July 13, 2022
 categories:
 - Events
 tags:
 - Open Source
 - Sprints
 - Community
-featured-image: sprints-value.png
+featured-image: sprints-value2.png

 postauthors:
 - name: Reshama Shaikh
@@ -30,9 +30,9 @@ Sprints are **working sessions to contribute to an open source library**. The go

 The [scikit-learn](https://scikit-learn.org/dev/index.html) project has a long and extraordinary legacy of open source sprints. Since 2010, when its [first public version](https://en.wikipedia.org/wiki/Scikit-learn) was released, there have been as many as [45 sprints organized](https://blog.scikit-learn.org/sprints/). The number 45 is a lower bound, since there are likely more sprints that have not been listed.

-To date, more than 2300 people have contributed to [scikit-learn](https://github.com/scikit-learn/scikit-learn). The number of contributors to scikit-learn exceeds those of other related libraries such as numpy, scipy and matplotlib, with the exception of the [pandas](https://github.com/pandas-dev/pandas), which has a greater number of contributors (See Appendix A).
+To date, more than 2400 people have contributed to [scikit-learn](https://github.com/scikit-learn/scikit-learn). The number of contributors to scikit-learn exceeds those of other related libraries such as numpy, scipy and matplotlib, with the exception of the [pandas](https://github.com/pandas-dev/pandas), which has a greater number of contributors (See Appendix A).

-The public discourse on open source has expanded to explore topics of sustainability, funding models, and diversity and inclusion, to name a few. A *reasonable*, yet ”difficult to answer” question that has been posed is:
+The public discourse on open source has expanded to explore topics of sustainability, funding models, and diversity and inclusion, to name a few. A *reasonable*, yet *difficult to answer* question that has been posed is:
 >*<span style="background-color: #CAE9F5;">
 What is the effectiveness of sprint models and what is the long-term engagement as a result of these sprints?
 </span>*
@@ -144,6 +144,8 @@ There are [other maintainers](https://scikit-learn.org/dev/about.html#people) an

 In her PyConDE PyData Berlin keynote from April 2022, [5 Years, 10 Sprints, a scikit-learn Open Source Journey](https://blog.dataumbrella.org/pyconde-keynote-reshama), she shares a history and progression of the Community sprints.

+<iframe width="560" height="315" src="https://www.youtube.com/embed/ZUqJaCWPvmk" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>
+
 ### Juan Martín Loyola
 [Juan Martín Loyola](https://github.com/jmloyola) started [contributing to scikit-learn](https://blog.scikit-learn.org/team/jml-interview/) as preparation for the [Data Umbrella Latin America, June 2021](https://blog.dataumbrella.org/data-umbrella-afme2-2021-scikit-learn-sprint-report) sprint. He continued to contribute prolifically after the sprint, and he was invited to join the team in December 2021. Given his location in Argentina, he will be providing support at the [2022 SciPy Latin America](https://www.scipy.lat/es/scipycon.html) sprint.

@@ -189,14 +191,14 @@ The sprints are a forum for users to gain a greater understanding of how an open

 **Value of synchronous interaction**

-Typically, open source contributions to scikit-learn occur on the GitHub repository in asynchronous fashion. The sprints provide real-time synchronous interaction. This experience provides more direct access to technical assistance and feedback to the contributor, and in a direct, efficient, and time-saving manner.
+Typically, open source contributions to scikit-learn occur on the GitHub repository in asynchronous fashion, over several weeks or months. The sprints provide real-time synchronous interaction. This experience provides more direct access to technical assistance and feedback to the contributor, which is more efficient and engaging.

 Julien shares:
 >I think having a setup like this [beginner/community sprint] is valuable for first time contributors because they can synchronously get specific information they would hardly have got otherwise. To me, this allow giving feedback which is immediate, specific and exact, making contributing to open-source enjoyable and preventing frustration: giving such feedback is what we should aim for and in this regard this setup is convenient.

 ### Online Sprints

-Since the start of the pandemic, Data Umbrella organized [4 online sprints](https://blog.dataumbrella.org/tags/#sprint-report). Additionally, there were 2 online sprints with [SciPy](https://www.scipy2020.scipy.org/sprints-schedule) and [EuroPython](https://wiki.python.org/moin/EuroPython2020/Sprints).
+Since the start of the pandemic, Data Umbrella has organized [4 online sprints](https://blog.dataumbrella.org/tags/#sprint-report). Additionally, there were 2 online sprints with [SciPy](https://www.scipy2020.scipy.org/sprints-schedule) and [EuroPython](https://wiki.python.org/moin/EuroPython2020/Sprints).

 These have been the observed benefits of the online sprints, which began in 2020 due to the global pandemic:

@@ -223,9 +225,11 @@ For the scikit-learn project itself, it made it possible to "recruit" a couple o

 **Office Hours**

-Actually the fact that we now have community office hours on Discord is probably a consequence of us attending the Data Umbrella online sprints.
+The scikit-learn project has regular office hours which are hosted on Discord.

 Olivier shares:
+>Actually the fact that we now have community office hours on Discord is probably a consequence of us attending the Data Umbrella online sprints.
+
 >I think they [the sprints] were the most interesting online events I attended during
 the COVID-19 crisis when all traditional on-site tech events were canceled. In particular the active planning by the Data Umbrella team for participants to work in pairs with audio rooms on Discord + a central help desk audio room worked really well.

@@ -260,7 +264,7 @@ Onboarding a first-time contributor takes time. People who are contributing for
 setup and experience, might get frustrated and/or discouraged and might not
 report the problem they are having (thinking it is their fault). Pre-event office hours have been successful at alleviating some of these roadblocks, for those sprint participants who have completed their pre-work.

-Here are some adjustments that can be made in the future to reach the goal of recruiting recurring contributors:
+Here are some adjustments that can be made in the future to reach the goal of recruiting recurring contributors:
 - Provide mentoring
 - Improve onboarding process
 - Improve issues definitions
@@ -333,12 +337,12 @@ There are additional resources for contributing:

 ## Appendix A: GitHub Contributors Comparison of Libraries

-A comparison of the contributor base to other related libraries in the same space (May 2022):