Commit 1e0ab3d

Merge pull request #3 from csiro-data-school/spr-0124
Updates to chapters 4-9 ahead of CSIRO Data School Jan '24 inc. new page on Agile methodology
2 parents a36ff42 + 5630112 commit 1e0ab3d

6 files changed: 74 additions and 28 deletions

config.yaml

Lines changed: 1 addition & 0 deletions
@@ -53,6 +53,7 @@ episodes:
5353
- 06-track_changes.md
5454
- 07-manuscripts.md
5555
- 08-what_next.md
56+
- 09-agile.md
5657

5758
# Information for Learners
5859
learners:

episodes/04-collaboration.md

Lines changed: 1 addition & 1 deletion
@@ -198,7 +198,7 @@ tasks in various ways.
198198
![](fig/ms-tasks-list-view.png){alt="An example of Teams Tasks list view"}
199199

200200
- [Jira](https://jira.csiro.au/) is another tool supported and deployed in CSIRO. Developed by
201-
Australian software company [Atlassian](https://www.atlassian.com/software/jira, it allows
201+
Australian software company [Atlassian](https://www.atlassian.com/software/jira), it allows
202202
tracking of to-do tasks/issues and sub-tasks, lets you assign tasks to people, and lets
203203
you track and view tasks in the context of workflows, timelines, and "board" visualisations,
204204
such as the "Kanban board". Jira can directly integrate with both BitBucket

episodes/05-project_organization.md

Lines changed: 19 additions & 7 deletions
@@ -113,14 +113,14 @@ files that perform the core analysis of the research, such as data
113113
cleaning or statistical analyses. These files can be thought of as
114114
the "scientific guts" of the project.
115115

116-
The second type of file in `src` is controller or driver scripts
117-
that contains all the analysis steps for the entire project
116+
Another type of file that might go in `src` is controller/driver/workflow scripts
117+
that contain all the analysis steps of a project
118118
from start to finish, with particular parameters and data
119119
input/output commands. A controller script for a simple project, for
120120
example, may read a raw data table, import and apply several cleanup
121121
and analysis functions from the other files in this directory, and
122122
create and save a numeric result. For a small project with one main
123-
output, a single controller script should be placed in the main
123+
output, a single controller script could be placed in the main
124124
`src` directory and distinguished clearly by a name such as
125125
"runall". The short example below is typical of
126126
scripts of this kind; note how it uses one variable, `TEMP_DIR`, to
@@ -140,9 +140,21 @@ avoid repeating the name of a particular directory four times.
140140
rm -rf $(TEMP_DIR)
141141
```
142142

143+
::::::::::::::::::::::::::::::::::::::::: callout
144+
145+
**Important note:** Don't place information specific to your own computer/system
146+
or self in these types of files, especially if they are being Git-tracked. Use
147+
relative paths instead of full paths where possible (e.g. input as `../data/` rather
148+
than `/home/xyz123/project/data`). Don't include any passwords or keys.
149+
If personal or system-specific information is required for your workflow, then make
150+
use of locally set environment variables and/or git-ignored files and then document
151+
how to set up these inputs again for anyone (or future self) re-using your work.
152+
153+
::::::::::::::::::::::::::::::::::::::::::::::::::
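
The advice in this callout can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not part of the lesson itself: the variable name `PROJECT_DATA_DIR` and the helper functions `resolve_data_dir` and `read_secret` are hypothetical, and the layout assumes a `data` directory sitting next to `src`.

```python
import os
from pathlib import Path

# Hypothetical variable name: set it locally (e.g. in a shell profile or a
# git-ignored file that you source), never in a Git-tracked script.
DATA_DIR_ENV = "PROJECT_DATA_DIR"


def resolve_data_dir(script_dir: Path, environ=os.environ) -> Path:
    """Return the data directory without hard-coding a machine-specific path."""
    override = environ.get(DATA_DIR_ENV)
    if override:
        # A locally set environment variable takes priority.
        return Path(override)
    # Default: the `data` directory next to `src`, relative to the script.
    return script_dir.parent / "data"


def read_secret(name: str, environ=os.environ) -> str:
    """Fetch a password or key from the environment; fail loudly if unset."""
    try:
        return environ[name]
    except KeyError:
        raise RuntimeError(
            f"Set the {name} environment variable (see the README)"
        ) from None
```

A controller script in `src` might call `resolve_data_dir(Path(__file__).resolve().parent)`, so the same tracked file works on any machine, while the README documents which variables a new user needs to set.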
154+
143155
## Put compiled programs in the `bin` directory
144156

145-
`bin` contains
157+
A directory named `bin` is usually used to contain
146158
executable programs compiled from code in the `src` directory.
147159
Projects that do not have any will not require `bin`.
148160

@@ -193,9 +205,9 @@ simple project might be organized following these recommendations:
193205

194206
```
195207
.
196-
|-- CITATION
197-
|-- README
198-
|-- LICENSE
208+
|-- CITATION.cff
209+
|-- README.md
210+
|-- LICENSE.md
199211
|-- requirements.txt
200212
|-- data
201213
| -- birds_count_table.csv

episodes/06-track_changes.md

Lines changed: 4 additions & 9 deletions
@@ -217,12 +217,7 @@ approach—the one we use in our own projects–don't just accelerate the
217217
manual process: they also automate some steps while enforcing others,
218218
and thereby require less self-discipline for more reliable results.
219219

220-
1. ***Use a version control
221-
system***, to manage changes to a
222-
project.
223-
224-
Box 2 briefly explains how version control systems work. It's hard to
225-
know what version control tool is most widely used in research today,
220+
It's hard to know what version control tool is most widely used in research today,
226221
but the one that's most talked about is undoubtedly Git. This is largely because of
227222
GitHub, a popular hosting site that combines the technical infrastructure for collaboration via Git with a
228223
modern web interface. GitHub is free for public and open source projects
@@ -231,11 +226,11 @@ GitLab is a well-regarded alternative
231226
that some prefer, because the GitLab platform itself is free and open
232227
source. Bitbucket provides free hosting
233228
for both Git and Mercurial repositories, but does not have nearly as
234-
many scientific users.
229+
many scientific users. CSIRO hosts its own instance of Bitbucket for employee use.
235230

236231
::::::::::::::::::::::::::::::::::::::::: callout
237232

238-
## Box 2: How Version Control Systems Work
233+
## How Version Control Systems Work
239234

240235
A version control system stores snapshots of a project's files in a
241236
repository. Users modify their working copy of the project, and then
@@ -244,7 +239,7 @@ and/or share their work with colleagues. The version control system
244239
automatically records when the change was made and by whom along with
245240
the changes themselves.
246241

247-
Crucially, if several people have edited files simultaneously, the
242+
Crucially for collaboration, if several people have edited files simultaneously, the
248243
version control system will detect the collision and require them to
249244
resolve any conflicts before recording the changes. Modern version
250245
control systems also allow repositories to be synchronized with each

episodes/07-manuscripts.md

Lines changed: 3 additions & 11 deletions
@@ -117,15 +117,9 @@ Our first alternative will already be familiar to many researchers:
117117
With the document online, everyone's changes are in one place, and
118118
hence don't need to be merged manually.
119119

120-
We realize that in many cases, even this solution is asking too much
121-
from collaborators who see no reason to move forward from desktop GUI
122-
tools. To satisfy them, the manuscript can be converted to a desktop
123-
editor file format (e.g., Microsoft Word `.docx` or LibreOffice
124-
`.odt`) after major changes, then downloaded and saved in the `doc`
125-
folder. Unfortunately, this means merging some changes and suggestions
126-
manually, as existing tools cannot always do this automatically when
127-
switching from a desktop file format to text and back (although
128-
[Pandoc](https://pandoc.org/) can go a long way).
120+
This is easy under our current Microsoft Office organisational setup,
121+
where Word documents (and others) may be converted to shared online
122+
documents automatically when sharing through Outlook or Teams.
129123

130124
## Text-based Documents Under Version Control
131125

@@ -193,8 +187,6 @@ In groups, discuss:
193187

194188
## Getting started writing text-based version control
195189

196-
[Version Control with Git](https://swcarpentry.github.io/git-novice/) Carpentries lesson introduces text-based version control, that you could use for a collaborative manuscript.
197-
198190
[Manubot](https://manubot.org) is an open-source system for writing scholarly manuscripts via GitHub, with tutorials.
199191

200192

episodes/09-agile.md

Lines changed: 46 additions & 0 deletions
@@ -0,0 +1,46 @@
1+
---
2+
title: 'Agile'
3+
teaching: 60
4+
exercises: 0
5+
---
6+
7+
::::::::::::::::::::::::::::::::::::::: objectives
8+
9+
- Learn some basic concepts of the 'Agile' methodology
10+
11+
::::::::::::::::::::::::::::::::::::::::::::::::::
12+
13+
## What is 'Agile'
14+
15+
'Agile' is a project management methodology, particularly for software development,
16+
built around a 4 point philosophical [manifesto](https://agilemanifesto.org/)
17+
and a 12 point set of [principles](https://agilemanifesto.org/principles.html).
18+
19+
Agile is typified by small teams that self-organise ('scrum') on how they will
20+
address a backlog of requested work, in short cycles ('sprints'), by breaking
21+
problems into small tasks, with frequent feedback and result delivery. It is a
22+
highly iterative approach to planning, that allows for high flexibility and less
23+
forward planning. A sprint may last 1-4 weeks, in which time an entire cycle of
24+
planning, designing, implementing, testing and delivering takes place, with small
25+
tasks hopefully addressed to completion, followed by a review and retrospective
26+
that may or may not end up influencing the next sprint cycle.
27+
28+
[Framework at a glance diagram](https://www.planview.com/resources/guide/agile-methodologies-a-beginners-guide/basics-benefits-agile-method/)
29+
30+
[Contrast to waterfall model](https://www.guru99.com/agile-methodology-in-software-testing.html)
31+
32+
[Roles and user stories](https://www.tutorialspoint.com/agile/agile_primer.htm)
33+
34+
[Atlassian on scrums, Kanban and Jira visualisations](https://www.atlassian.com/agile/project-management)
35+
36+
37+
:::::::::::::::::::::::::::::::::::::::: keypoints
38+
39+
- The Agile approach is to break problems into smaller tasks and fully address them
40+
in short, iterative work cycles (sprints), with each cycle ending in review and discussion
41+
before planning the next cycle.
42+
- Aspects of this approach may be useful in data science work.
43+
44+
::::::::::::::::::::::::::::::::::::::::::::::::::
45+
46+
