Commit 899d986

Fix: feedback from @NickleDave and @billbrod
1 parent 787befc commit 899d986

1 file changed

+59
-26
lines changed

tutorials/intro.md

Lines changed: 59 additions & 26 deletions
@@ -161,7 +161,10 @@ GitHub and GitLab also provide continuous integration and continuous deployment
161161
**An example of Continuous deployment:**
162162
* When you are ready to release your package to PyPI, a continuous deployment operation might be triggered on release to publish your package to PyPI.
163163

164-
Integrated CI/CD will help you maintain your software ensuing that changes to the code don't break things unexpectedly and also maintain a style and format consistency.
164+
Integrated CI/CD will help you maintain your software, ensuring that
165+
changes to the code don't break things unexpectedly. It can also
166+
help you maintain code style and format consistency for every new
167+
change to your code.
165168

166169
:::{figure-md} packaging-workflow
167170

@@ -170,43 +173,36 @@ Integrated CI/CD will help you maintain your software ensuing that changes to th
170173
The lifecycle of a scientific Python package.
171174
:::
172175

173-
## What should code in a Python package look like?
176+
## When should you turn your code into a Python package?
174177

175-
Ideally the code in your Python package is general. This means it
176-
can be used on different data or for different scientific applications. An example
177-
of a package that is written in a generalized way is matplotlib.
178+
You may be wondering what types of code should become a Python package that is both on GitHub and published to PyPI and/or conda-forge.
178179

179-
matplotlib does
180-
one (big important) thing really well:
180+
There are a few use cases to consider:
181181

182-
*It creates visual plots of data.*
182+
1. **Creating a basic package for yourself:** Sometimes you want to create a package for your own personal use. This might mean making your code locally pip installable, and you may also want to publish it to GitHub. In that case you don't expect others to use your code, so you may only have documentation for you and your future self in case you need to update the package.
183183

184-
Matplotlib is used by thousands of users for different plotting applications
185-
using different types of data. While few scientific packages will have the same
186-
broad application as tools like matplotlib or NumPy, the
187-
idea of code being used for something more than a single workflow still applies
188-
to package development if you want other people to use your package.
184+
> An example of this type of package might be a set of functions that you write that are useful across several of your projects. It could be useful to have those functions available to all of your projects.
189185
190-
### Code should also be clean & readable & documented
186+
:::{todo}
187+
LINK to pip installable lesson when it's published - it's in review now
188+
:::
191189

192-
The code in your package should also be clean, readable and well documented.
190+
2. In other cases, you may create some code that you soon realize might be useful not just to you, but to other people as well.
191+
In that case, you might consider creating the package and publishing it on GitHub. Because other users may be using it, you may also make use of GitHub's infrastructure, including CI/CD pipelines and issue trackers. Because you want other people to use your package, you will also want to include LICENSE information, documentation for users and contributors, and tests. This type of package is most often published to PyPI.
193192

194-
**Clean code:** Clean code refers to code that uses expressive variable names,
195-
is concise and does not repeat itself. We will dive deeper into best practices
196-
for clean code in future pyOpenSci tutorials.
193+
For example, all of the [pyOpenSci packages](https://www.pyopensci.org/python-packages.html) are public facing with an intended audience beyond just the maintainers.
197194

198-
**Readable code:** Readable code is code written with a consistent style.
199-
You can use linters and code formatters such as black and flake8 to ensure
200-
this consistency throughout your entire package. [Learn more about code formatters here.](../package-structure-code/code-style-linting-format.html)
195+
### Packages that you expect others to use should be well-scoped
196+
197+
Ideally the code in your Python package is focused on a specific theme or use case. This theme is important as it's a way to scope the content of your package.
198+
199+
It can be tricky to decide when your code becomes something that might be more broadly useful to others. One question you can ask yourself is: is your code written specifically for a single research project, or could it have a broader application across multiple projects in your domain?
201200

202-
**Documented code:** documented code is written using docstrings that help a
203-
user understand both what the functions and methods in your code does and also
204-
what the input and output elements of each function is. [You can learn more about docstrings in our guide, here.](../documentation/write-user-documentation/document-your-code-api-docstrings)
205201

206-
:::{admonition} Where do research compendia fit in?
202+
:::{admonition} How does this relate to code for a research project?
207203
:class: note
208204

209-
A Research Compendium is an organized set of code, data and documentation that
205+
A [Research Compendium](https://the-turing-way.netlify.app/reproducible-research/compendia.html#basic-compendium) is an organized set of code, data and documentation that
210206
supports a specific research project. It aims to enhance the reproducibility and
211207
transparency of research by providing a comprehensive record of the methods,
212208
data, and analyses used in a study.
@@ -216,8 +212,45 @@ specific set of tasks that can be applied across numerous research projects.
216212
As such a Python package is more generalizable than a Research Compendium
217213
which supports a specific project.
218214

215+
* [Read about `Good enough practices in scientific computing`](https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1005510)
216+
* [Learn more about research compendia (also called repro-packs) in this blog post.](https://lorenabarba.com/blog/how-repro-packs-can-save-your-future-self/)
219217
:::
220218

219+
220+
Below are a few examples of well-scoped pyOpenSci packages:
221+
222+
* [Crowsetta](https://crowsetta.readthedocs.io/en/latest/) is a package designed for working with annotations of animal vocalizations and bioacoustics data. It helps scientists process different types of bioacoustic data rather than focusing on a single research application tied to one user's workflow.
223+
* [pandera](https://www.union.ai/pandera) is another more broadly used Python package. Pandera supports data testing and thus also has a broader research application.
224+
225+
:::{admonition} Matplotlib as an example
226+
227+
At the larger end of the user spectrum, Matplotlib is a great example.
228+
Matplotlib does one (big important) thing really well:
229+
230+
*It creates visual plots of data.*
231+
232+
Matplotlib is used by thousands of users for different plotting applications
233+
using different types of data. While few scientific packages will have the same
234+
broad application and large user base as tools like Matplotlib, the
235+
idea of scoping out what your package does is still important.
236+
:::
237+
238+
### Code should also be clean & readable & documented
239+
240+
The code in your package should also be clean, readable and well documented.
241+
242+
**Clean code:** Clean code refers to code that uses expressive variable names,
243+
is concise and does not repeat itself. We will dive deeper into best practices
244+
for clean code in future pyOpenSci tutorials.
245+
246+
**Readable code:** Readable code is code written with a consistent style.
247+
You can use linters and code formatters such as black and flake8 to ensure
248+
this consistency throughout your entire package. [Learn more about code formatters here.](../package-structure-code/code-style-linting-format.html)
249+
250+
**Documented code:** documented code is written using docstrings that help a
251+
user understand both what the functions and methods in your code do and also
252+
what the input and output elements of each function are. [You can learn more about docstrings in our guide, here.](../documentation/write-user-documentation/document-your-code-api-docstrings)
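
As a rough illustration of the three points above, here is a minimal sketch of a small function with an expressive name and a docstring that describes its input and output. The function `celsius_to_fahrenheit` is a hypothetical example made up for this sketch; it is not taken from any pyOpenSci package. The NumPy-style docstring layout is one common convention, and your project may use a different one.

```python
def celsius_to_fahrenheit(temp_celsius):
    """Convert a temperature from Celsius to Fahrenheit.

    Parameters
    ----------
    temp_celsius : float
        Temperature in degrees Celsius.

    Returns
    -------
    float
        Temperature in degrees Fahrenheit.
    """
    # Hypothetical example function, used only to show a docstring.
    return temp_celsius * 9 / 5 + 32


# The docstring is stored on the function and is what help() displays.
print(celsius_to_fahrenheit.__doc__)
```

A user who runs `help(celsius_to_fahrenheit)` sees the same text, which is why it is worth describing each input and output there.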
253+
221254
## Making your package installable - publishing to PyPI & conda-forge
222255

223256
### Python packages and environments
