Commit 86bffdc

Merge pull request #89 from pythonhealthdatascience/dev
Dev
2 parents d181293 + d1a289a commit 86bffdc

42 files changed: 5763 additions and 436 deletions

README.md

Lines changed: 2 additions & 0 deletions
@@ -138,6 +138,8 @@ RETICULATE_CONDA=/home/amy/mambaforge/bin/conda
 
 To cite this work, see the `CITATION.cff` file in this repository or use the "Cite this repository" button on GitHub.
 
+You can also cite the archived version of this work on Zenodo: https://doi.org/10.5281/zenodo.17094155.
+
 <br>
 
 ## Linting

_quarto.yml

Lines changed: 16 additions & 3 deletions
@@ -47,8 +47,8 @@ website:
 - pages/model/process.qmd
 - section: "Output analysis"
 contents:
-- pages/output_analysis/outputs.qmd
 - pages/output_analysis/warmup.qmd
+- pages/output_analysis/outputs.qmd
 - pages/output_analysis/replications.qmd
 - pages/output_analysis/n_reps.qmd
 - pages/output_analysis/parallel.qmd
@@ -103,6 +103,19 @@ website:
 Code licence: <a href="https://opensource.org/license/mit" target="_blank" rel="noopener">MIT</a>.
 Text licence: <a href="https://creativecommons.org/licenses/by-sa/4.0/" target="_blank" rel="noopener">CC-BY-SA 4.0</a>.
 
+comments:
+giscus:
+repo: "pythonhealthdatascience/des_rap_book"
+repo-id: "R_kgDOOXKhOA"
+category: "Announcements"
+category-id: "DIC_kwDOOXKhOM4CuWAj"
+mapping: "pathname"
+reactions-enabled: true
+loading: "lazy"
+input-position: "bottom"
+theme: "light"
+language: "en"
+
 format:
 html:
 theme: cosmo
@@ -121,8 +134,8 @@ format:
 filters:
 - filters/guidelines-filter.lua
 include-after-body:
-text: |
-<script type="application/javascript" src="../../scripts/language-selector.js"></script>
+- scripts/webex.js
+- scripts/language-selector.js
 
 params:
 language: "python" # Default language parameter

index.qmd

Lines changed: 3 additions & 1 deletion
@@ -23,4 +23,6 @@ This practical guide shows you how to build **reproducible** discrete-event simu
 
 This resource is an output of **STARS**, a research project led by Associate Prof. **Tom Monks** [![ORCID](images/orcid.png)](https://orcid.org/0000-0003-2631-4481). The book is written by **Amy Heather** [![ORCID](images/orcid.png)](https://orcid.org/0000-0002-6596-3479) and reviewed by Prof. **Nav Mustafee** [![ORCID](images/orcid.png)](https://orcid.org/0000-0002-2204-8924), Dr. **Alison Harper** [![ORCID](images/orcid.png)](https://orcid.org/0000-0001-5274-5037), and Associate Prof. **Tom Monks** [![ORCID](images/orcid.png)](https://orcid.org/0000-0003-2631-4481). The STARS project is supported by the Medical Research Council [grant number MR/Z503915/1]. The listed researchers are associated with the **University of Exeter** Medical and Business Schools.
 
-> *Please **cite** us if you use this resource!* <!--TODO: Add Zenodo DOI once archive-->
+> *Please **cite** us if you use this resource!*
+>
+> Heather, A., Monks, T., Mustafee, N., & Harper, A. (2025). DES RAP Book: Reproducible Discrete-Event Simulation in Python and R. https://github.com/pythonhealthdatascience/des_rap_book. https://doi.org/10.5281/zenodo.17094155.

pages/_metadata.yml

Lines changed: 0 additions & 17 deletions
This file was deleted.

pages/inputs/input_data.qmd

Lines changed: 15 additions & 17 deletions
@@ -11,20 +11,11 @@ title: Input data management
 
 ::: {.pale-blue}
 
-🎯 **Objectives**
+**Learning objectives:**
 
-This page provides guidance on managing data and parameters for simulation projects.
-
-* **🧾 Input data:** Understand the types of input data.
-* **📦 What is included in a RAP?** Advice on which data should be shared to ensure a reproducible analytical pipeline (RAP).
-* **🗃️ Raw data:** Recommendations on storage and sharing.
-* **📜 Input modelling code:** Recommendations on storage and sharing.
-* **⚙️ Parameters:** Recommendations on storage and sharing.
-* **🔐 Maintaining a private and public version of your model:** Advice for projects with sensitive data.
-
-[🔗](../intro/guidelines.qmd) **Reproducibility guidelines**
-
-While not directly meeting specific criteria, this page explains the importance of sharing input data for a RAP, and how this can be managed when there is sensitive data.
+* Recognise where a **reproducible analytical pipeline** begins, and what data is included.
+* Learn recommended practices for **storing and sharing raw data, input modelling code, and parameters**.
+* Understand how **private and public versions** of a model could be maintained when there is sensitive data.
 
 :::

@@ -246,15 +237,18 @@ The way you might set these up depends on whether you are allowed to share the r
 
 ## 🧪 Test yourself
 
-```{r, echo = FALSE}
+```{r}
+#| echo: false
 library(webexercises) # nolint: library_call_linter
 ```
 
 :::{.callout-note}
 
 ## If your raw (e.g. patient-level) data cannot be shared, which of the following is recommended?
 
-```{r, results="asis", echo = FALSE}
+```{r}
+#| output: asis
+#| echo: false
 cat(longmcq(c(
 "Do not share or describe anything.",
 answer = paste0(
@@ -270,7 +264,9 @@ cat(longmcq(c(
 
 ## Even if it cannot be shared publicly, input modelling code be retained so parameters can be re-estimated if new data or assumptions arise.
 
-```{r, results="asis", echo = FALSE}
+```{r}
+#| output: asis
+#| echo: false
 cat(longmcq(c(
 answer = "True",
 "False"
@@ -283,7 +279,9 @@ cat(longmcq(c(
 
 ## What is a good strategy when maintaining a public and private version of a model?
 
-```{r, results="asis", echo = FALSE}
+```{r}
+#| output: asis
+#| echo: false
 cat(longmcq(c(
 "Duplicate all simulation code across both repositories.",
 answer = paste(

pages/inputs/input_modelling.qmd

Lines changed: 45 additions & 54 deletions
@@ -6,29 +6,57 @@ bibliography: input_modelling_resources/references.bib
 
 {{< include ../../scripts/_reticulate-setup.md >}}
 
-::: {.pale-blue}
+:::: {.pale-blue}
 
-🎯 **Objectives**
+**Learning objectives:**
 
-This page has step-by-step instructions for input modelling in Python or R, with inspiration from @Robinson2007 and @Monks2024.
+* Identify what **data** is needed for input modelling and why **quality** matters.
+* Understand how input data forms the basis for **randomness** in simulated systems.
+* **Inspect, fit, and select probability distributions** for your model using both **targeted and comprehensive** approaches.
 
-* **📂 Data:** Identify what data is needed for input modelling and why quality matters.
-* **➡️ How is this data used in the model?** Understand how input data forms the basis for randomness in simulated systems.
-* **📈 Input modelling:** Inspect, fit, and select probability distributions for your model using both targeted and comprehensive approaches, with steps:
-* **🔧 Set-up**
-* **🔍 Step 1. Identify possible distributions**
-* **📊 Step 2. Fit distributions and compare goodness-of fit**
-* **✅ Step 3. Choose distributions**
-* **⚙️ Parameters**
+**Required packages:**
 
-For advice on making your input modelling workflow **reproducible** and **sharing data or scripts with sensitive content**, see the [page on input data management](input_data.qmd).
+This should be available from environment setup in the "🧪 Test yourself" section of [Environments](../setup/environment.qmd).
 
-[🔗](../intro/guidelines.qmd) **Reproducibility guidelines**
+::: {.python-content}
+
+```{python}
+from distfit import distfit
+import numpy as np
+import pandas as pd
+import plotly.express as px
+import plotly.graph_objects as go
+from scipy import stats
+```
+
+```{python}
+#| echo: false
+# To ensure renders correctly in quarto
+import plotly.io as pio
+pio.renderers.default = "plotly_mimetype"
+```
+
+:::
+
+::: {.r-content}
 
-While not directly meeting specific criteria, this page encourages recording clear, reproducible input modelling processes to improve transparency and verification in your simulation work.
+```{r}
+#| output: false
+library(dplyr)
+library(fitdistrplus)
+library(ggplot2)
+library(lubridate)
+library(plotly)
+library(readr)
+library(tidyr)
+```
 
 :::
 
+**Acknowledgements:** Inspired by @Robinson2007 and @Monks2024.
+
+::::
+
 ## 📂 Data
 
 To build a DES model, you first need **data** that reflects the system you want to model. In healthcare, this might mean you need to access healthcare records with patient arrival, service and departure times, for example. The quality of your simulation depends directly on the quality of your data. Key considerations include:
@@ -102,44 +130,6 @@ touch rmarkdown/input_modelling.Rmd
 
 :::
 
-Before you begin, ensure the following packages are available. They should already be installed if you set up the environment in the "*🧪 Test yourself*" section of the [Environments](../setup/environment.qmd) page.
-
-::: {.python-content}
-
-```{python}
-# Import required packages
-from distfit import distfit
-import numpy as np
-import pandas as pd
-import plotly.express as px
-import plotly.graph_objects as go
-from scipy import stats
-```
-
-```{python, echo=FALSE}
-import plotly.io as pio
-pio.renderers.default = "plotly_mimetype"
-```
-
-:::
-
-::: {.r-content}
-
-```{r, message=FALSE}
-# nolint start: undesirable_function_linter.
-# Import required packages
-library(dplyr)
-library(fitdistrplus)
-library(ggplot2)
-library(lubridate)
-library(plotly)
-library(readr)
-library(tidyr)
-# nolint end
-```
-
-:::
-
 ## 🔍 Step 1. Identify possible distributions
 
 You first need to select which distributions to fit to your data. You should both:
@@ -522,7 +512,8 @@ inspect_histogram <- function(
 }
 ```
 
-```{r, warning=FALSE}
+```{r}
+#| warning: false
 # Plot histogram of inter-arrival times
 inspect_histogram(
 data = data, var = "iat_mins", x_lab = "Inter-arrival time (min)",
@@ -1007,7 +998,7 @@ If you haven't already followed along, **now's the time to put everything from t
 ::: {.python-content}
 * Download the arrival data, and create a Jupyter notebook for your analysis.
 :::
-::: {.python-content}
+::: {.r-content}
 * Download the arrival data, and create an R markdown file for your analysis.
 :::
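
Reviewer note on `pages/inputs/input_modelling.qmd`: the new learning objectives mention fitting and selecting probability distributions, and the relocated imports include `scipy.stats`. The following is a minimal sketch of that idea and is not code from this commit. It assumes synthetic inter-arrival times (the name `iat_mins` mirrors the variable seen in the diff, but the data, the candidate list, and the use of a Kolmogorov-Smirnov test here are illustrative assumptions).

```python
import numpy as np
from scipy import stats

# Synthetic inter-arrival times in minutes (illustrative data only,
# not the arrival data used in the book)
rng = np.random.default_rng(42)
iat_mins = rng.exponential(scale=4.0, size=500)

# Candidate distributions to compare (an assumed, illustrative shortlist)
candidates = {"expon": stats.expon, "gamma": stats.gamma}

results = {}
for name, dist in candidates.items():
    # Fit by maximum likelihood with the location fixed at zero
    params = dist.fit(iat_mins, floc=0)
    # Kolmogorov-Smirnov test of the data against the fitted distribution
    ks_stat, p_value = stats.kstest(iat_mins, name, args=params)
    results[name] = {"params": params, "ks_stat": ks_stat, "p_value": p_value}

for name, res in results.items():
    print(f"{name}: KS statistic={res['ks_stat']:.3f}, p={res['p_value']:.3f}")
```

A smaller KS statistic suggests a closer match between the data and the fitted distribution, although p-values computed with parameters estimated from the same data tend to be optimistic, so they are best read as a rough guide.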

pages/inputs/parameters_file.qmd

Lines changed: 37 additions & 41 deletions
@@ -4,30 +4,53 @@ title: Parameters from file
 
 {{< include ../../scripts/_reticulate-setup.md >}}
 
-::: {.pale-blue}
+```{python}
+#| echo: false
+# pylint: disable=wrong-import-position,reimported,wrong-import-order
+# pylint: disable=ungrouped-imports,too-many-instance-attributes
+# pylint: disable=too-many-arguments, too-many-positional-arguments
+# pylint: disable=too-few-public-methods,redefined-outer-name
+# pylint: disable=function-redefined
+```
 
-🎯 **Objectives**
+:::: {.pale-blue}
 
-Discrete-event simulations (DES) require many parameters - like arrival rates, resource times, and probabilities - which often need to be changed for different scenarios and analyses. Managing these parameters well makes your simulations easier to update, track, and reuse.
+**Learning objectives:**
 
-This page focuses on the storage of parameters within a file.
+* Know the **advantages** of using external parameter files in simulation workflows
+* Learn how to create a **parameter file** and **data dictionary**.
+* Understand methods for **importing** parameters.
 
-* **❓ Why use external parameter files?**
-* **📝 Create parameter file**
-* **📖 Create data dictionary**
-* **📥 Methods for importing parameters**
+**Relevant reproducibility guidelines:**
 
-If you want to see how to store parameters within a script, see the [parameters from script](parameters_script.qmd) page.
+* STARS Reproducibility Recommendations: Avoid hard-coded parameters.
+* NHS Levels of RAP (🥈): Data is handled and output in a Tidy data format.
 
-[🔗](../intro/guidelines.qmd) **Reproducibility guidelines**
+**Required packages:**
 
-This page helps you meet reproducibility criteria from:
+This should be available from environment setup in the "🧪 Test yourself" section of [Environments](../setup/environment.qmd).
 
-* STARS Reproducibility Recommendations: Avoid hard-coded parameters.
-* NHS Levels of RAP (🥈): Data is handled and output in a Tidy data format.
+::: {.python-content}
+
+```{python}
+import pandas as pd
+import json
+from collections import defaultdict
+```
+
+:::
+
+::: {.r-content}
+
+```{r}
+library(jsonlite)
+library(R6)
+```
 
 :::
 
+::::
+
 :::: {.python-content}
 
 ::: {.callout-note title="Utility function" collapse="true"}
@@ -90,33 +113,8 @@ def print_dict(dictionary, max_items_per_level):
 
 :::
 
-```{python, echo=FALSE}
-# pylint: disable=wrong-import-position,reimported,wrong-import-order
-# pylint: disable=ungrouped-imports,too-many-instance-attributes
-# pylint: disable=too-many-arguments, too-many-positional-arguments
-# pylint: disable=too-few-public-methods,redefined-outer-name
-# pylint: disable=function-redefined
-```
-
-```{python}
-import pandas as pd
-import json
-from collections import defaultdict
-```
-
 ::::
 
-::: {.r-content}
-
-```{r}
-# nolint start: undesirable_function_linter
-library(jsonlite)
-library(R6)
-# nolint end
-```
-
-:::
-
 ## ❓ Why use external parameter files?
 
 External parameter files can offer some advantages over storing parameters directly in scripts:
@@ -204,10 +202,8 @@ As mentioned in the [checklist for managing parameter data](input_data.qmd#check
 
 The examples below were created using markdown (converted to PDF using `pandoc`), but you can use any suitable format (e.g. CSV, YAML, etc.) - as long as it is clear, consistent and accessible.
 
-```{python}
+```{.python}
 #| echo: false
-#| output: false
-
 import subprocess
 
 subprocess.run([
95.5 KB
Binary file not shown.
Binary file not shown.
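
Reviewer note on `pages/inputs/parameters_file.qmd`: the page imports `json` and lists "methods for importing parameters" as a learning objective. The following is a minimal sketch of one such method and is not code from this commit. The parameter names, values, file name, and the `load_parameters` helper are all hypothetical, and only the Python standard library is used.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical parameter values and layout, for illustration only
example_params = {
    "interarrival_time": {"mean": 5.0},
    "consultation_time": {"mean": 10.0},
    "number_of_doctors": 3,
}


def load_parameters(path):
    """Read simulation parameters from a JSON file into a dictionary."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


# Write the example parameters to a temporary file, then read them back
with tempfile.TemporaryDirectory() as tmp_dir:
    param_path = Path(tmp_dir) / "parameters.json"
    param_path.write_text(json.dumps(example_params, indent=2), encoding="utf-8")
    params = load_parameters(param_path)

print(params["number_of_doctors"])
```

Keeping the values in a file like this means a scenario can be changed by editing or swapping the file, without touching the model code.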
