- Paris: [core sprint, for advanced contributors](https://scikit-learn.fondation-inria.fr/en/scikit-learn-sprint-in-paris/) (Feb)
37
+
- Paris, France: [core sprint, for advanced contributors](https://scikit-learn.fondation-inria.fr/en/scikit-learn-sprint-in-paris/) (Feb)
35
38
- 2018
36
-
- WiMLDS: [New York City](https://reshamas.github.io/highlights-from-the-2018-NYC-WiMLDS-scikit-sprint) (Sep)
37
-
- SciPy: [Austin](http://gael-varoquaux.info/programming/sprint-on-scikit-learn-in-paris-and-austin.html) (open sprint, for new contributors) (Jul)
38
-
- Paris: core sprint, for advanced contributors (Jul)
39
-
- Two Sigma: [New York City](https://twitter.com/amuellerml/status/1007670849774784512) (Jun)
40
-
- UC Berkeley: [Berkeley](https://github.com/scikit-image/scikit-image/wiki/UC-Berkeley-(BIDS)-sprint,-May-28-Jun-2-2018) (May 28 to Jun 2)
41
-
- ManAHL: London (April 21-22, 2018)
39
+
- New York, NY: [NYC WiMLDS](https://reshamas.github.io/highlights-from-the-2018-NYC-WiMLDS-scikit-sprint) (Sep 2018)
40
+
- Austin, TX: [SciPy](http://gael-varoquaux.info/programming/sprint-on-scikit-learn-in-paris-and-austin.html) (open sprint, for new contributors) (Jul 2018)
41
+
- Paris, France: core sprint, for advanced contributors (Jul 2018)
42
+
- New York, NY: [Two Sigma](https://twitter.com/amuellerml/status/1007670849774784512) (Jun 2018)
43
+
- Berkeley, CA: [UC Berkeley](https://github.com/scikit-image/scikit-image/wiki/UC-Berkeley-(BIDS)-sprint,-May-28-Jun-2-2018) (May 28 to Jun 2)
_posts/2022-07-13-sprints-value.md (2 additions, 2 deletions)
@@ -30,7 +30,7 @@ Sprints are **working sessions to contribute to an open source library**. The go
30
30
31
31
The [scikit-learn](https://scikit-learn.org/dev/index.html) project has a long and extraordinary legacy of open source sprints. Since 2010, when its [first public version](https://en.wikipedia.org/wiki/Scikit-learn) was released, there have been at least [45 sprints organized](https://blog.scikit-learn.org/sprints/). The number 45 is a lower bound, since there are likely more sprints that have not been listed.
32
32
33
-
To date, more than 2400 people have contributed to [scikit-learn](https://github.com/scikit-learn/scikit-learn). The number of contributors to scikit-learn exceeds those of other related libraries such as numpy, scipy and matplotlib, with the exception of the [pandas](https://github.com/pandas-dev/pandas), which has a greater number of contributors (See Appendix A).
33
+
To date, more than 2400 people have contributed to [scikit-learn](https://github.com/scikit-learn/scikit-learn). The number of contributors to scikit-learn exceeds those of other related libraries such as numpy, scipy and matplotlib, with the exception of [pandas](https://github.com/pandas-dev/pandas), which has a greater number of contributors (See Appendix A).
34
34
35
35
The public discourse on open source has expanded to explore topics of sustainability, funding models, and diversity and inclusion, to name a few. A *reasonable*, yet *difficult to answer* question that has been posed is:
36
36
>*<span style="background-color: #CAE9F5;">
@@ -41,7 +41,7 @@ What is the effectiveness of sprint models and what is the long-term engagement
41
41
42
42
Due to technological limitations of GitHub and privacy concerns, we do not hold precise data on how many scikit-learn contributors connected to the project via a sprint. We have no formal data collection process which records statistics on how many sprint participants are recurring or information on their contributions to other open source projects or other long term positive ripple effects. A scientific look at the correlation between the number of sprints and contributors is beyond the scope of this article. What we *will examine* in this article are the **objectives, results and aspirations** of running the scikit-learn sprints.
43
43
44
-
<span style="background-color: #CAE9F5;">The queries from other open-source projects requesting guidance on sprints and diversity and inclusions have been increasing.</span> We share these experiences and lessons learned with the community, potential funders and open source project maintainers, particularly those projects which are nascent in their quest to build community, sustainability and diversity and inclusion.
44
+
<span style="background-color: #CAE9F5;">The queries from other open-source projects requesting guidance on sprints and diversity and inclusion have been increasing.</span> We share these experiences and lessons learned with the community, potential funders and open source project maintainers, particularly those projects which are nascent in their quest to build community, sustainability and diversity and inclusion.
- September 27, 2022 - **Pre-sprint** - 10:00 to 12:00 (UTC-3)
29
+
- September 28, 2022 - **Sprint** - 10:00 to 17:00 (UTC-3)
30
+
31
+
## Repository
32
+
For more information in Spanish, [check this repository](https://github.com/jmloyola/sklearn-sprint-argentina-2022).
33
+
You will find details about the event, instructions to set up the development environment, links with further information and tutorials, and an example git workflow to make a pull request for the project.
34
+
35
+
## Photos
36
+
<figure>
37
+
<img src="/assets/images/posts_images/sprint-salta-2022-1.jpg" alt="11 people standing behind some computers and 2 people projected on the screen" max-width="20%" max-height="20%" />
38
+
<figcaption>
39
+
Group photo of the SciPy Latin America sprint, Salta, Argentina, 2022. Sandra Meneses and Juan Martín Loyola are projected on the screen from a Zoom call. Photo credit: Lucía Torres.
40
+
</figcaption>
41
+
</figure>
42
+
43
+
<figure>
44
+
<img src="/assets/images/posts_images/sprint-salta-2022-2.jpeg" alt="11 people coding on their computers" max-width="20%" max-height="20%" />
45
+
<figcaption>
46
+
Participants of the SciPy Latin America sprint working on their computers. Photo credit: Ariel Silvio Norberto Ramos.
47
+
</figcaption>
48
+
</figure>
49
+
50
+
## Acknowledgment
51
+
These people made this sprint possible:
52
+
- Ariel Silvio Norberto Ramos, one of the organizers of SciPy Latin America,
53
+
- [Data Umbrella](https://www.dataumbrella.org/), [one of the community partners of the event](https://twitter.com/ScipyLA/status/1573710649963724802), especially Sandra Meneses and Reshama Shaikh,
[Hugging Face](https://hf.co) is happy to announce that we're partnering with [scikit-learn](https://scikit-learn.org/stable/index.html) to further our support of the machine learning tools and ecosystem.
30
+
31
+
At Hugging Face, we've been putting a lot of effort into supporting deep learning, but we believe that machine learning as a whole can benefit from the tools we release. With statistical machine learning being essential in this field and scikit-learn dominating statistical ML, we're excited to partner and move forward together.
32
+
33
+
As of September 2022, the Hugging Face Hub already hosts nearly 4,000 tabular classification and tabular regression model checkpoints, and we strive for this trend to continue.
Since June 2022, Hugging Face has been an official sponsor of the scikit-learn consortium. Through this support, Hugging Face actively promotes the development and sustainability of sklearn. As a sponsor of the scikit-learn consortium hosted at the Inria foundation, we'll now participate in the scikit-learn consortium technical committee.
44
+
45
+
## Development support
46
+
To help sustain the development of the library, we're happy to welcome Adrin Jalali and Benjamin Bossan to the Hugging Face team. Adrin is a core developer of scikit-learn as well as [fairlearn](https://fairlearn.org), while Benjamin is the author of the [skorch](https://github.com/skorch-dev/skorch) library and is now a contributor to scikit-learn.
47
+
48
+
Hugging Face is happy to support the development of scikit-learn through code contributions, issues, pull requests, reviews, and discussions.
49
+
50
+
## Integration to and from the Hugging Face Hub
51
+
52
+
["Skops"](https://github.com/skops-dev/skops) is the name of the framework being actively developed as the link between the scikit-learn and the Hugging Face ecosystems. With Skops, we hope to facilitate essential workflows:
53
+
54
+
- The ability to push scikit-learn models to the Hugging Face Hub
55
+
- The possibility to try out models directly in the browser
56
+
- The automatic creation of model cards, to improve model documentation and understanding
57
+
- The ability to collaborate with others on machine learning projects
58
+
59
+
### Snapshot of your work
60
+
61
+
Working at the intersection of scikit-learn and the Hub offers challenges linked to the two platforms. One of these challenges is secure persistence: the ability to serialize models in a secure, safe manner.
62
+
63
+
scikit-learn models (estimators, predictors, ...) are usually saved using pickle, which is notorious for not being a secure format. Sharing scikit-learn models in this format exposes receivers to potentially malicious data, which can execute arbitrary code as soon as the file is loaded.
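To make the risk concrete, here is a minimal, harmless sketch using only the Python standard library. The `Payload` class is hypothetical and exists only for illustration; it uses `__reduce__` to tell pickle which callable to run at load time. Here that callable is `eval` on an innocuous expression, but an attacker could substitute anything.

```python
import pickle


class Payload:
    """Hypothetical object whose pickle runs a callable when loaded."""

    def __reduce__(self):
        # pickle stores "call eval('6 * 7')" instead of the object's state.
        return (eval, ("6 * 7",))


blob = pickle.dumps(Payload())

# Loading the bytes executes eval(...) and returns its result,
# not a Payload instance.
loaded = pickle.loads(blob)
print(type(loaded).__name__, loaded)
```

The receiver never asked for code to run; merely calling `pickle.loads` on untrusted bytes was enough.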
64
+
65
+
That's where secure persistence comes in: as the Hugging Face Hub aims to provide a platform for models, the ability to share safe, secure objects is essential. We've been working on adding secure persistence for scikit-learn models in [skops#128](https://github.com/skops-dev/skops/pull/128) and [skops#145](https://github.com/skops-dev/skops/pull/145) ([doc preview](https://skops--145.org.readthedocs.build/en/145/persistence.html)). Instead of serializing using pickle, the object's contents are put into a zip file with an accompanying schema JSON file.
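The general idea of a zip archive plus a JSON schema can be sketched with the standard library alone. This is a toy illustration under stated assumptions, not the actual Skops file format or API: `LinearModel`, `dump_model`, `load_model`, `TRUSTED_TYPES`, and the file names inside the archive are all hypothetical.

```python
import io
import json
import zipfile

# Hypothetical allowlist: only these type names may be rebuilt on load.
TRUSTED_TYPES = {"LinearModel"}


class LinearModel:
    """Toy stand-in for a fitted estimator (not a scikit-learn class)."""

    def __init__(self, coef, intercept):
        self.coef = list(coef)
        self.intercept = float(intercept)

    def predict(self, row):
        return sum(c * x for c, x in zip(self.coef, row)) + self.intercept


def dump_model(model):
    """Write the model state as plain JSON inside an in-memory zip."""
    schema = {"type": type(model).__name__, "content": "state.json"}
    state = {"coef": model.coef, "intercept": model.intercept}
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("schema.json", json.dumps(schema))
        archive.writestr("state.json", json.dumps(state))
    return buffer.getvalue()


def load_model(data):
    """Rebuild the model from data only; reject types not in the allowlist."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        schema = json.loads(archive.read("schema.json"))
        if schema["type"] not in TRUSTED_TYPES:
            raise ValueError(f"untrusted type: {schema['type']!r}")
        state = json.loads(archive.read(schema["content"]))
    return LinearModel(state["coef"], state["intercept"])


restored = load_model(dump_model(LinearModel([2.0, -1.0], 0.5)))
print(restored.predict([3.0, 1.0]))
```

Because the loader only parses JSON and constructs types it explicitly trusts, the file itself cannot choose which code runs, which is the property pickle lacks.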
66
+
67
+
Read about the Skops library in the following blog post: [Introducing Skops](https://huggingface.co/blog/skops).
68
+
69
+
## Improving interoperability
70
+
71
+
Skops is an example of an integration of scikit-learn within our tools, but it is not the only example! We will strive to integrate with the rest of our ecosystem so that Hugging Face users may benefit from using scikit-learn tools and vice-versa.
72
+
73
+
An example is the `evaluate` library, dedicated to efficiently evaluating machine learning models and datasets. We aim for this tool to natively support [scikit-learn metrics](https://github.com/huggingface/evaluate/issues/297) in its API.
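One way to picture such a bridge: scikit-learn metrics are functions called as `metric(y_true, y_pred)`, while `evaluate` metrics expose `compute(predictions=..., references=...)` and return a dictionary. The sketch below is purely illustrative; `MetricAdapter` is a hypothetical class, not part of either library, and `accuracy` is a plain-Python stand-in for a scikit-learn style metric so the example runs without dependencies.

```python
from typing import Callable, Dict, Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Plain-Python stand-in with a scikit-learn style signature."""
    matches = sum(t == p for t, p in zip(y_true, y_pred))
    return matches / len(y_true)


class MetricAdapter:
    """Hypothetical wrapper exposing a compute(predictions, references) call."""

    def __init__(self, name: str, metric_fn: Callable[..., float]):
        self.name = name
        self.metric_fn = metric_fn

    def compute(self, predictions, references) -> Dict[str, float]:
        # scikit-learn style metrics take the ground truth first.
        return {self.name: self.metric_fn(references, predictions)}


scores = MetricAdapter("accuracy", accuracy).compute(
    predictions=[1, 0, 1, 1], references=[1, 0, 0, 1]
)
print(scores)
```

The adapter only reorders arguments and wraps the result, so any function following the `(y_true, y_pred)` convention could be plugged in the same way.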
74
+
75
+
---
76
+
77
+
Through these efforts, we hope to kickstart a lasting relationship between the two ecosystems and provide simple, efficient bridges to lower the barrier of entry. We believe that educating and sharing models is the best way to foster inclusive machine learning from which all can benefit. We're excited to partner with scikit-learn for this endeavor.