Commit 79c60d6

Merge pull request #124 from reshamas/main
minor wording tweaks
2 parents 5d26b92 + bbf0c96 commit 79c60d6

1 file changed

+28
-28
lines changed

_posts/2022-05-18-sprints-value.md

Lines changed: 28 additions & 28 deletions
Original file line number | Diff line number | Diff line change
@@ -28,11 +28,11 @@ Sprints are **working sessions to contribute to an open source library**. The go
2828

2929
## Introduction
3030

31-
The [scikit-learn](https://scikit-learn.org/dev/index.html) project has a long and extraordinary legacy of open source sprints. Since 2010, when its [first public version](https://en.wikipedia.org/wiki/Scikit-learn) was released, there have been as many as [45 sprints organized](https://blog.scikit-learn.org/sprints/). The 45 number is a lower bound, since there are likely more sprints that have not been listed.
31+
The [scikit-learn](https://scikit-learn.org/dev/index.html) project has a long and extraordinary legacy of open source sprints. Since 2010, when its [first public version](https://en.wikipedia.org/wiki/Scikit-learn) was released, there have been as many as [45 sprints organized](https://blog.scikit-learn.org/sprints/). The number 45 is a lower bound, since there are likely more sprints that have not been listed.
3232

3333
To date, more than 2300 people have contributed to [scikit-learn](https://github.com/scikit-learn/scikit-learn). The number of contributors to scikit-learn exceeds that of other related libraries such as numpy, scipy and matplotlib, with the exception of [pandas](https://github.com/pandas-dev/pandas), which has a greater number of contributors (see Appendix A).
3434

35-
The public discourse on open source has expanded to explore topics of sustainability, funding models, and diversity and inclusion, to name a few. A *reasonable*, yet *”difficult to answer”* question that has been posed is:
35+
The public discourse on open source has expanded to explore topics of sustainability, funding models, and diversity and inclusion, to name a few. A *reasonable*, yet ”difficult to answer” question that has been posed is:
3636
>*<span style="background-color: #CAE9F5;">
3737
What is the effectiveness of sprint models and what is the long-term engagement as a result of these sprints?
3838
</span>*
@@ -41,7 +41,7 @@ What is the effectiveness of sprint models and what is the long-term engagement
4141

4242
Due to technological limitations of GitHub and privacy concerns, we do not hold precise data on how many scikit-learn contributors connected to the project via a sprint. We have no formal data collection process which records statistics on how many sprint participants are recurring or information on their contributions to other open source projects or other long term positive ripple effects. A scientific look at the correlation between the number of sprints and contributors is beyond the scope of this article. What we *will examine* in this article are the **objectives, results and aspirations** of running the scikit-learn sprints.
4343

44-
The queries from other open-source projects requesting guidance on sprints and diversity and inclusions have been increasing. We share these experiences and lessons learned with the community, potential funders and open source project maintainers, particularly those projects which are nascent in their quest to build community, sustainability and diversity and inclusion.
44+
<span style="background-color: #CAE9F5;">The queries from other open-source projects requesting guidance on sprints and diversity and inclusions have been increasing.</span> We share these experiences and lessons learned with the community, potential funders and open source project maintainers, particularly those projects which are nascent in their quest to build community, sustainability and diversity and inclusion.
4545

4646
## Outline
4747

@@ -65,42 +65,42 @@ We distinguish between a Developer (Dev) and Community sprint because the goals
6565

6666
**Developer (Dev) Sprint**
6767

68-
A Developer, or “dev”, sprint is one that is typically organized by the maintainers of the library. A dev sprint is one where the developers or maintainers of the library gather to work on issues and to discuss the resolution of ongoing complex issues. This also provides the team an opportunity to focus on tasks related to the long-term roadmap of the project.
69-
70-
For scikit-learn, the early Community sprints were alongside the [SciPy conferences](https://conference.scipy.org) and the practice has continued for over a decade.
68+
A Developer, or “Dev”, sprint is one that is typically organized by the maintainers of the library. A Dev sprint is one where the developers or maintainers of the library gather to work on issues and to discuss the resolution of ongoing complex issues. This also provides the team an opportunity to focus on tasks related to the long-term roadmap of the project.
7169

7270
The first early Dev sprints were organized at Inria. The first [major Dev sprint](https://github.com/scikit-learn/scikit-learn/wiki/Past-sprints#granada-19th-21th-dec-2011) was held in Granada after the NIPS 2011 conference (now renamed NeurIPS). It was the first time that most of the team had met in real life after months or years of online collaboration, and over a dozen developers participated. Later, Dev sprints were often hosted in the offices of partnering tech companies, typically from 3 to 7 days, once a year, in pre-COVID times.
7371

7472
**Community Sprint**
7573

76-
A Community sprint can be a collaboration by individuals, by affinity communities such as Meetup Groups (Data Umbrella, PyLadies, etc.), by conferences (SciPy, PyData Global, JupyterCon, etc.). A Community sprint is one that is with the general public and it may be beginners, experts, or a combination of both.
74+
A Community sprint can be a collaboration by individuals, by affinity communities such as Meetup Groups (Data Umbrella, PyLadies, etc.), by conferences (SciPy, PyCon, PyData Global, JupyterCon, etc.). A Community sprint is one that is with the general public and it may be beginners, experts, or a combination of both.
75+
76+
For scikit-learn, the early Community sprints were alongside the [SciPy conferences](https://conference.scipy.org) and the practice has continued for over a decade.
7777

7878
At a Developer sprint, a contributor may work on a PR that has been ongoing for three months. Conversely, Community sprints require curated issues which newcomers can complete in a shorter period of time (such as 1 day, or 1 day with 1-2 months follow-up).
7979

80-
The landscape of community sprints with other [scientific python](https://scientific-python.org/calendars/) libraries is unknown.
80+
The landscape of Dev and Community sprints with other [scientific python](https://scientific-python.org/calendars/) libraries is unknown.
8181

8282
## Goals of the Sprints
8383

8484
### Goals of Dev Sprints
8585
- To get maintainers in one room to efficiently discuss open issues and pull requests
8686
- To move along contributions in a synchronous fashion
8787
- To foster existing collaborations with external developers synchronously
88-
- To building rapport: Maintainers reside in various continents and the in-person sprints build rapport within the team. Social interactions are critical in having a productive team
88+
- To build rapport: Maintainers reside in various continents and the in-person sprints build rapport within the team. Social interactions are critical in having a productive team.
8989
- To foster collaborations with the project’s corporate sponsors (members of the [scikit-learn Consortium](https://scikit-learn.org/stable/about.html#funding))
9090

9191
### Goals of Community & Beginner Sprints
9292

9393
- To broaden the project’s contributor base
9494
- To build community and connect the project maintainers with its users
95-
- To get interactive feedback from new scikit-learn users and contributors
95+
- To obtain interactive feedback from new scikit-learn users and contributors
9696
- To onboard new contributors to scikit-learn and PyData generally
9797
- To onboard new contributors who would become recurring contributors
9898
- To collaborate with community groups to increase diversity of contributor base with intentional outreach
9999
- To strengthen and support existing contributors in order to maintain recurring community contributors
100100

101101
## scikit-learn Team Members Who Connected to the Project Via a Sprint
102102

103-
It is notable that a number of the current maintainers of the library found their way to the project via a sprint. Additionally, some members of the Contributor Experience Team also connected to the scikit-learn project via the sprints.
103+
It is notable that a number of the current maintainers of the library found their way to the project via a sprint. Additionally, some members of the Contributor Experience Team connected to the scikit-learn project via the sprints.
104104

105105
### Olivier Grisel
106106

@@ -126,7 +126,7 @@ Olivier shares:
126126
He contributed code, reviews, and documentation since March 2021, joined Inria in April 2021 and in October 2021, Julien became a core developer.
127127

128128
### Other Maintainers
129-
There are [other maintainers](https://scikit-learn.org/dev/about.html#people) and emeritus contributors who had participated in a developer or community sprint along their journey with the scikit-learn team, such as Vlad Nicolae (current maintainer), Gilles Loupe (Emeritus), Thouis (Ray) Jones (Emeritus).
129+
There are [other maintainers](https://scikit-learn.org/dev/about.html#people) and emeritus contributors who had participated in a Developer or Community sprint along their journey with the scikit-learn team, such as Vlad Nicolae (current maintainer), Gilles Loupe (Emeritus), Thouis (Ray) Jones (Emeritus).
130130

131131
### Reshama Shaikh
132132
[Reshama Shaikh](https://github.com/reshamas) has organized nine scikit-learn [community sprints](https://www.dataumbrella.org/sprints) from 2017 to 2021. She first contributed code and documentation fixes to scikit-learn in September 2018. In September 2020, she was invited to join the scikit-learn team.
@@ -159,7 +159,7 @@ Users learn a range of tools such as: virtual environment setup, version control
159159

160160
**Overcoming barriers to entry**
161161

162-
The sprints, as a “hands-on working session”, provides an avenue for potential contributors to overcome common barriers to entry, particularly “getting started”, and moving from the *possibility* to an *actuality* stage.
162+
The sprints, as a “hands-on working session”, provide an avenue for potential contributors to overcome common barriers to entry, particularly “getting started”, and moving from the *possibility* to an *actuality* stage.
163163

164164
**Providing an avenue for advanced contributions**
165165

@@ -192,41 +192,41 @@ These have been the observed benefits of the online sprints, which began in 2020
192192

193193
**Networking**
194194

195-
Sprints make it easier to meet new people with different backgrounds, and in particular, online sprints help break geographical barriers.
195+
Online sprints make it easier to meet new people with different backgrounds.
196196

197197
**International collaboration**
198198

199-
Collaborating with affinity communities can attract more candidates from various backgrounds.
199+
Collaborating with affinity communities can attract more candidates from various backgrounds. In particular, online sprints help break geographical barriers.
200200

201201
**Pair programming**
202202

203-
The pairing of contributors seems to work well. Pair programming was consistently ranked as a positive experience by online sprint participants
203+
The pairing of contributors seems to work well. Pair programming was consistently ranked as a positive experience by online sprint participants.
204204

205205
**Increases accessibility**
206206

207-
The use of online tools in particular makes it possible to interact with people
208-
who would not have joined traditional community events organized in
207+
The use of online tools makes it possible to interact with people
208+
who would not have joined community events traditionally organized in
209209
North America or western Europe e.g. because of the travel costs and
210-
complexity to get a visa in time. Attending those online events is probably also less disruptive for people with young children.
210+
complexity of obtaining a visa in time. Attending the online events is probably also less disruptive for people with young children.
211211

212212
For the scikit-learn project itself, it made it possible to "recruit" a couple of new recurring contributors who attend regular office hours after the original sprints.
213213

214214
**Office Hours**
215215

216-
Actually the fact that we now have community office hours on discord is probably a consequence of us attending the Data Umbrella online sprints.
216+
Actually the fact that we now have community office hours on Discord is probably a consequence of us attending the Data Umbrella online sprints.
217217

218218
Olivier shares:
219219
>I think they [the sprints] were the most interesting online events I attended during
220220
the COVID-19 crisis when all traditional on-site tech events were canceled. In particular the active planning by the Data Umbrella team for participants to work in pairs with audio rooms on Discord + a central help desk audio room worked really well.
221221

222222
>The pre-sprint and post-sprint office hours also made it possible to limit the time spent on helping fix setup issues compared to what we experience in traditional sprints. They also forced us as maintainers to review and fix our documentation before the event.
223223
224-
**Creation of supplementary resources in various medium forms**
224+
**Creation of supplementary resources in different media types**
225225

226-
Data Umbrella coordinated the creation of a series of videos and transcripts that provided learning materials for the community to prepare for the sprint. These resources were available to the public and have a wide reach:
226+
Data Umbrella coordinated the creation of a series of videos and transcripts that provided learning materials for the community to prepare for the sprint. These resources are available to the public and have a wide reach:
227227

228228
This is the [Contributing to scikit-learn](https://www.youtube.com/playlist?list=PLBKcU7Ik-ir-b1fwjNabO3b8ebs9ez5ga
229-
) list of videos that were created for the sprints.
229+
) list of videos that were created for the sprints:
230230
- Andreas Mueller: [Crash Course in Contributing to scikit-learn](https://youtu.be/5OL8XoMMOfA)
231231
- Reshama Shaikh: [Example of scikit-learn Pull Request](https://youtu.be/PU1WyDPGePI)
232232
- Andreas Mueller: [Sprint FAQs](https://youtu.be/p_2Uw2BxdhA)
@@ -246,7 +246,7 @@ This is the [Contributing to scikit-learn](https://www.youtube.com/playlist?list
246246
<span style="background-color: #CAE9F5;">
247247
One of the primary goals of the Community sprints was to onboard new contributors who would become recurring contributors. This goal has generally not been realized. scikit-learn is a complex and advanced project, and a one-time sprint does not provide sufficient opportunity and support to sprint participants to become recurring contributors.</span> A few sprint participants have progressed to become returning contributors, and it is a very small number relative to the number of sprint participants.
248248

249-
Onboarding a first-time contributor takes time. People who are contributing for the first time need to go through a lot of information simultaneously regarding both technical and organizational aspects of contributions. People may run into unexpected issues at the really start depending on their
249+
Onboarding a first-time contributor takes time. People who are contributing for the first time need to go through a lot of information simultaneously regarding both technical and organizational aspects of contributions. People may run into unexpected issues at the start depending on their
250250
setup and experience, might get frustrated and/or discouraged and might not
251251
report the problem they are having (thinking it is their fault). Pre-event office hours have been successful at alleviating some of these roadblocks, for those sprint participants who have completed their pre-work.
252252

@@ -259,7 +259,7 @@ Here are some adjustments that can be made in the future to reach the goal of re
259259
- Have smaller sprint events
260260

261261
**Mentoring**
262-
Sprints may not be sufficient for onboarding people. Mentoring is needed to take to the next level. Mentoring relationships can be established during sprint events.
262+
Sprints may not be sufficient for onboarding people. Mentoring is needed to take to the next level, and mentoring relationships can be established during sprint events.
263263

264264
**Improve the onboarding process**
265265

@@ -332,8 +332,8 @@ A comparison of the contributor base to other related libraries in the same spac
332332

333333
## References
334334

335-
- [Interview with Maren Westermann: Extending the Impact of the scikit-learn Sprints to the Community](https://blog.dataumbrella.org/mwestermann-sprints-experience)
336-
- [Interview with scikit-learn Triage Team Member: Juan Martín Loyola](https://blog.dataumbrella.org/jmloyola-opensource-experience)
335+
- [Behind the Scenes: What It Takes to Run Data Umbrella’s scikit-learn Open Source Sprints](https://eventfund.codeforscience.org/behind-the-scenes-what-it-takes-to-run-data-umbrellas-scikit-learn-open-source-sprints/)
337336
- Data Umbrella [sprint reports](https://blog.dataumbrella.org/tags/#sprint-report)
338337
- Data Umbrella community [sprint blogs](https://blog.dataumbrella.org/tags/#sprint-blog)
339-
338+
- [Interview with Maren Westermann: Extending the Impact of the scikit-learn Sprints to the Community](https://blog.dataumbrella.org/mwestermann-sprints-experience)
339+
- [Interview with scikit-learn Triage Team Member: Juan Martín Loyola](https://blog.dataumbrella.org/jmloyola-opensource-experience)
