|
3 | 3 | { |
4 | 4 | "cell_type": "markdown", |
5 | 5 | "metadata": {}, |
6 | | - "source": [ |
7 | | - "# Fit growth models and extract growth statistics\n", |
8 | | - "\n", |
9 | | - "This tutorial demonstrates how to fit growth models and extract growth statistics\n", |
10 | | - "using the growthcurves package.\n", |
11 | | - "\n", |
12 | | - "The analysis workflow includes:\n", |
13 | | - "1. Generating or loading growth data\n", |
14 | | - "2. Fitting **mechanistic** models (ODE-based, parametric)\n", |
15 | | - "3. Fitting **phenomenological** models (parametric and non-parametric)\n", |
16 | | - "4. Extracting growth statistics from all fits\n", |
17 | | - "5. Saving results for visualization\n", |
18 | | - "\n", |
19 | | - "For visualization of the results, see the companion notebook:\n", |
20 | | - "[`plotting.ipynb`](plotting.ipynb) (Visualize fitted growth curves, derivatives,\n", |
21 | | - " and growth statistics)" |
22 | | - ] |
| 6 | + "source": "# Fit growth models and extract growth statistics\n\nThis tutorial demonstrates how to fit growth models and extract growth statistics\nusing the growthcurves package.\n\nThe analysis workflow includes:\n1. Generating or loading growth data\n2. Fitting **mechanistic** models (ODE-based, parametric)\n3. Fitting **phenomenological** models (parametric and non-parametric)\n4. Extracting growth statistics from all fits\n5. Saving results for visualization\n\n\nFor preprocessing examples (blank subtraction, outlier detection, path length correction), see the companion notebook:\n[`preprocessing.ipynb`](preprocessing.ipynb).\n\nFor visualization of the results, see the companion notebook:\n[`plotting.ipynb`](plotting.ipynb) (Visualize fitted growth curves, derivatives,\n and growth statistics)" |
23 | 7 | }, |
24 | 8 | { |
25 | 9 | "cell_type": "code", |
|
35 | 19 | "import growthcurves as gc" |
36 | 20 | ] |
37 | 21 | }, |
38 | | - { |
39 | | - "cell_type": "markdown", |
40 | | - "metadata": {}, |
41 | | - "source": [ |
42 | | - "## Data Preprocessing Functions\n", |
43 | | - "\n", |
44 | | - "The growthcurves package provides preprocessing utilities for common data corrections:\n", |
45 | | - "\n", |
46 | | - "- **`path_correct(N, path_length_cm)`**: Normalize OD measurements to 1 cm path length\n", |
47 | | - "- **`blank_subtraction(N, blank)`**: Subtract blank/background measurements from data" |
48 | | - ] |
49 | | - }, |
50 | | - { |
51 | | - "cell_type": "code", |
52 | | - "execution_count": 2, |
53 | | - "metadata": { |
54 | | - "tags": [ |
55 | | - "hide-input" |
56 | | - ] |
57 | | - }, |
58 | | - "outputs": [ |
59 | | - { |
60 | | - "name": "stdout", |
61 | | - "output_type": "stream", |
62 | | - "text": [ |
63 | | - "Path Length Correction Example:\n", |
64 | | - " Raw OD (0.5 cm path): [0.25 0.3 0.35 0.4 ]\n", |
65 | | - " Corrected OD (1 cm path): [0.5 0.6 0.7 0.8]\n", |
66 | | - "\n", |
67 | | - "Blank Subtraction Example:\n", |
68 | | - " Sample OD: [0.5 0.6 0.7 0.8]\n", |
69 | | - " Blank OD: [0.05 0.052 0.048 0.051]\n", |
70 | | - " Corrected: [0.45 0.548 0.652 0.749]\n", |
71 | | - "\n", |
72 | | - "Combined Preprocessing Pipeline:\n", |
73 | | - " Raw measurements (0.5 cm): [0.125 0.15 0.175 0.2 ]\n", |
74 | | - " After path correction (1 cm): [0.25 0.3 0.35 0.4 ]\n", |
75 | | - " Blank (corrected to 1 cm):[0.05 0.05 0.05 0.05]\n", |
76 | | - " Final corrected OD: [0.2 0.25 0.3 0.35]\n", |
77 | | - "\n" |
78 | | - ] |
79 | | - } |
80 | | - ], |
81 | | - "source": [ |
82 | | - "# Example 1: Path length correction\n", |
83 | | - "# Measurements taken with a 0.5 cm path length, normalized to 1 cm\n", |
84 | | - "raw_od_measurements = np.array([0.25, 0.30, 0.35, 0.40])\n", |
85 | | - "path_length = 0.5 # cm\n", |
86 | | - "\n", |
87 | | - "od_corrected = gc.path_correct(raw_od_measurements, path_length)\n", |
88 | | - "\n", |
89 | | - "print(\"Path Length Correction Example:\")\n", |
90 | | - "print(f\" Raw OD (0.5 cm path): {raw_od_measurements}\")\n", |
91 | | - "print(f\" Corrected OD (1 cm path): {od_corrected}\")\n", |
92 | | - "print()\n", |
93 | | - "\n", |
94 | | - "# Example 2: Blank subtraction\n", |
95 | | - "# Typical workflow: subtract blank measurements from sample data\n", |
96 | | - "sample_data = np.array([0.500, 0.600, 0.700, 0.800])\n", |
97 | | - "blank_data = np.array([0.050, 0.052, 0.048, 0.051])\n", |
98 | | - "\n", |
99 | | - "corrected_data = gc.blank_subtraction(sample_data, blank_data)\n", |
100 | | - "\n", |
101 | | - "print(\"Blank Subtraction Example:\")\n", |
102 | | - "print(f\" Sample OD: {sample_data}\")\n", |
103 | | - "print(f\" Blank OD: {blank_data}\")\n", |
104 | | - "print(f\" Corrected: {corrected_data}\")\n", |
105 | | - "print()\n", |
106 | | - "\n", |
107 | | - "# Example 3: Combined preprocessing workflow\n", |
108 | | - "# Simulate a typical preprocessing pipeline\n", |
109 | | - "raw_measurements = np.array([0.125, 0.150, 0.175, 0.200])\n", |
110 | | - "blank_measurements = np.array([0.025, 0.025, 0.025, 0.025])\n", |
111 | | - "path_length_cm = 0.5\n", |
112 | | - "\n", |
113 | | - "# Step 1: Path correction\n", |
114 | | - "od_1cm = gc.path_correct(raw_measurements, path_length_cm)\n", |
115 | | - "\n", |
116 | | - "# Step 2: Blank subtraction\n", |
117 | | - "od_corrected = gc.blank_subtraction(\n", |
118 | | - " od_1cm, gc.path_correct(blank_measurements, path_length_cm)\n", |
119 | | - ")\n", |
120 | | - "\n", |
121 | | - "print(\"Combined Preprocessing Pipeline:\")\n", |
122 | | - "print(f\" Raw measurements (0.5 cm): {raw_measurements}\")\n", |
123 | | - "print(f\" After path correction (1 cm): {od_1cm}\")\n", |
124 | | - "print(\n", |
125 | | - " f\" Blank (corrected to 1 cm):{gc.path_correct(blank_measurements, path_length_cm)}\"\n", |
126 | | - ")\n", |
127 | | - "print(f\" Final corrected OD: {od_corrected}\")\n", |
128 | | - "print()" |
129 | | - ] |
130 | | - }, |
131 | 22 | { |
132 | 23 | "cell_type": "markdown", |
133 | 24 | "metadata": {}, |
|
137 | 28 | "This cell generates synthetic growth data from a clean logistic function.\n", |
138 | 29 | "- time is modeled in hours, with measurements every 12 minutes (0.2 hours) for\n", |
139 | 30 | " a total of 440 points (88 hours).\n", |
140 | | - "- We assume a lag of 30 hours, an intrinsic growth rate of 0.15 hour⁻¹,\n", |
| 31 | + "- We assume a lag of 30 hours, an intrinsic growth rate of 0.15 hour\u207b\u00b9,\n", |
141 | 32 | " and a carrying capacity of 0.45 OD." |
142 | 33 | ] |
143 | 34 | }, |
|
297 | 188 | "| Output key | Meaning | How it is calculated |\n", |
298 | 189 | "|---|---|---|\n", |
299 | 190 | "| `max_od` | Maximum observed/fitted OD | Maximum OD over the valid data range |\n", |
300 | | - "| `mu_max` | Maximum specific growth rate (μ_max) | Maximum of `d(ln N)/dt` from the fitted model (or local fit for non-parametric) |\n", |
301 | | - "| `intrinsic_growth_rate` | Intrinsic model rate parameter | For mechanistic models: fitted intrinsic `μ`; for phenomenological/non-parametric: `None` |\n", |
| 191 | + "| `mu_max` | Maximum specific growth rate (\u03bc_max) | Maximum of `d(ln N)/dt` from the fitted model (or local fit for non-parametric) |\n", |
| 192 | + "| `intrinsic_growth_rate` | Intrinsic model rate parameter | For mechanistic models: fitted intrinsic `\u03bc`; for phenomenological/non-parametric: `None` |\n", |
302 | 193 | "| `doubling_time` | Doubling time in hours | `ln(2) / mu_max` |\n", |
303 | 194 | "| `time_at_umax` | Time at maximum specific growth | Time where `mu_max` reaches its maximum |\n", |
304 | | - "| `od_at_umax` | OD at time of μ_max | Model-predicted OD at `time_at_umax` |\n", |
| 195 | + "| `od_at_umax` | OD at time of \u03bc_max | Model-predicted OD at `time_at_umax` |\n", |
305 | 196 | "| `exp_phase_start`, `exp_phase_end` | Exponential phase boundaries | From threshold or tangent phase-boundary method in `extract_stats()` |\n", |
306 | 197 | "| `model_rmse` | Fit error | RMSE between observed OD and model-predicted OD over the model fit window |\n", |
307 | 198 | "\n", |
|
319 | 210 | "The `extract_stats_from_fit()` function calculates these key metrics:\n", |
320 | 211 | "\n", |
321 | 212 | "- `max_od`: Maximum OD value within the fitted window\n", |
322 | | - "- `mu_max`: **Observed** maximum specific growth rate μ_max (hour⁻¹) - calculated\n", |
| 213 | + "- `mu_max`: **Observed** maximum specific growth rate \u03bc_max (hour\u207b\u00b9) - calculated\n", |
323 | 214 | " from the fitted curve\n", |
324 | 215 | "- `intrinsic_growth_rate`: **Model parameter** for intrinsic growth rate\n", |
325 | 216 | " (parametric models only, `None` for non-parametric)\n", |
326 | 217 | "- `doubling_time`: Time to double the population at peak growth (hours)\n", |
327 | 218 | "- `exp_phase_start`: When exponential phase begins (hours)\n", |
328 | 219 | "- `exp_phase_end`: When exponential phase ends (hours)\n", |
329 | | - "- `time_at_umax`: Time when μ reaches its maximum (hours)\n", |
330 | | - "- `od_at_umax`: OD value at time of maximum μ\n", |
| 220 | + "- `time_at_umax`: Time when \u03bc reaches its maximum (hours)\n", |
| 221 | + "- `od_at_umax`: OD value at time of maximum \u03bc\n", |
331 | 222 | "- `fit_t_min`: Start of fitting window (hours)\n", |
332 | 223 | "- `fit_t_max`: End of fitting window (hours)\n", |
333 | 224 | "- `fit_method`: Identifier for the method used\n", |
|
339 | 230 | "\n", |
340 | 231 | "### MECHANISTIC MODELS\n", |
341 | 232 | "\n", |
342 | | - "| Name | Model | Equation | Exp Start | Exp End | Intrinsic μ | μ max | Carrying Capacity | Fit |\n", |
| 233 | + "| Name | Model | Equation | Exp Start | Exp End | Intrinsic \u03bc | \u03bc max | Carrying Capacity | Fit |\n", |
343 | 234 | "|------|-------|----------|-----------|---------|-------------|-------|-------------------|-----|\n", |
344 | | - "| Logistic | parametric | `dN/dt = μ * (1 - N(t) / K) * N(t)` | threshold/<br>tangent | threshold/<br>tangent | μ | max dln(N)/dt | K | entire curve |\n", |
345 | | - "| Gompertz | parametric | `dN/dt = μ * math.log(K / N(t)) * N(t)` | threshold/<br>tangent | threshold/<br>tangent | μ | max dln(N)/dt | K | entire curve |\n", |
346 | | - "| Richards | parametric | `dN/dt = μ * (1 - (N(t) / K)**beta) * N(t)` | threshold/<br>tangent | threshold/<br>tangent | μ | max dln(N)/dt | A | entire curve |\n", |
347 | | - "| Baranyi | parametric | `dN/dt= μ * math.exp(μ * t) / (math.exp(h0) - 1 + math.exp(μ * t)) * (1 - N(t) / K) * N(t)` | threshold/<br>tangent | threshold/<br>tangent | μ | max dln(N)/dt | K | entire curve |\n", |
| 235 | + "| Logistic | parametric | `dN/dt = \u03bc * (1 - N(t) / K) * N(t)` | threshold/<br>tangent | threshold/<br>tangent | \u03bc | max dln(N)/dt | K | entire curve |\n", |
| 236 | + "| Gompertz | parametric | `dN/dt = \u03bc * math.log(K / N(t)) * N(t)` | threshold/<br>tangent | threshold/<br>tangent | \u03bc | max dln(N)/dt | K | entire curve |\n", |
| 237 | + "| Richards | parametric | `dN/dt = \u03bc * (1 - (N(t) / K)**beta) * N(t)` | threshold/<br>tangent | threshold/<br>tangent | \u03bc | max dln(N)/dt | A | entire curve |\n", |
| 238 | + "| Baranyi | parametric | `dN/dt= \u03bc * math.exp(\u03bc * t) / (math.exp(h0) - 1 + math.exp(\u03bc * t)) * (1 - N(t) / K) * N(t)` | threshold/<br>tangent | threshold/<br>tangent | \u03bc | max dln(N)/dt | K | entire curve |\n", |
348 | 239 | "\n", |
349 | 240 | "### PHENOMENOLOGICAL MODELS\n", |
350 | 241 | "\n", |
351 | | - "| Name | Model | Equation | Exp Start | Exp End | Intrinsic μ | μ max | Max OD | Fit |\n", |
| 242 | + "| Name | Model | Equation | Exp Start | Exp End | Intrinsic \u03bc | \u03bc max | Max OD | Fit |\n", |
352 | 243 | "|------|-------|----------|-----------|---------|-------------|-------|--------|-----|\n", |
353 | 244 | "| Linear | non-parametric | `ln(N(t)) = N0 + b * t` | threshold/<br>tangent | threshold/<br>tangent | n.a. | b | max OD raw | only window |\n", |
354 | 245 | "| Spline | non-parametric | `ln(N(t)) = spline(t)` | threshold/<br>tangent | threshold/<br>tangent | n.a. | max of derivative of spline | max OD raw | only log phase |\n", |
355 | | - "| Logistic (phenom) | parametric | `ln(N(t)/N0) = A / (1 + exp(4 * μ_max * (λ - t) / A + 2))` | λ | threshold/<br>tangent | n.a. | μ_max | K | entire curve |\n", |
356 | | - "| Gompertz (phenom) | parametric | `ln(N(t)/N0) = A * exp(-exp(μ_max * exp(1) * (λ - t) / A + 1))` | λ | threshold/<br>tangent | n.a. | μ_max | K | entire curve |\n", |
357 | | - "| Gompertz (modified) | parametric | `ln(N(t)/N0) = A * exp(-exp(μ_max * exp(1) * (λ - t) / A + 1)) + A * exp(α * (t - t_shift))` | λ | threshold/<br>tangent | n.a. | μ_max | K | entire curve |\n", |
358 | | - "| Richards (phenom) | parametric | `ln(N(t)/N0) = A * (1 + ν * exp(1 + ν + μ_max * (1 + ν)**(1/ν) * (λ - t) / A))**(-1/ν)` | λ | threshold/<br>tangent | n.a. | μ_max | K | entire curve |\n", |
| 246 | + "| Logistic (phenom) | parametric | `ln(N(t)/N0) = A / (1 + exp(4 * \u03bc_max * (\u03bb - t) / A + 2))` | \u03bb | threshold/<br>tangent | n.a. | \u03bc_max | K | entire curve |\n", |
| 247 | + "| Gompertz (phenom) | parametric | `ln(N(t)/N0) = A * exp(-exp(\u03bc_max * exp(1) * (\u03bb - t) / A + 1))` | \u03bb | threshold/<br>tangent | n.a. | \u03bc_max | K | entire curve |\n", |
| 248 | + "| Gompertz (modified) | parametric | `ln(N(t)/N0) = A * exp(-exp(\u03bc_max * exp(1) * (\u03bb - t) / A + 1)) + A * exp(\u03b1 * (t - t_shift))` | \u03bb | threshold/<br>tangent | n.a. | \u03bc_max | K | entire curve |\n", |
| 249 | + "| Richards (phenom) | parametric | `ln(N(t)/N0) = A * (1 + \u03bd * exp(1 + \u03bd + \u03bc_max * (1 + \u03bd)**(1/\u03bd) * (\u03bb - t) / A))**(-1/\u03bd)` | \u03bb | threshold/<br>tangent | n.a. | \u03bc_max | K | entire curve |\n", |
359 | 250 | "\n", |
360 | 251 | "### Understanding Growth Rates: Intrinsic vs. Observed\n", |
361 | 252 | "\n", |
362 | 253 | "**Important distinction:**\n", |
363 | 254 | "\n", |
364 | | - "- **`mu_max`** (μ_max): The **observed** maximum specific growth rate calculated\n", |
| 255 | + "- **`mu_max`** (\u03bc_max): The **observed** maximum specific growth rate calculated\n", |
365 | 256 | " from the fitted curve as max(d(ln N)/dt). This is what you measure from the data.\n", |
366 | 257 | "\n", |
367 | 258 | "- **`intrinsic_growth_rate`**: The **model parameter** representing intrinsic growth\n", |
|
1014 | 905 | "Two methods are available for determining exponential phase boundaries:\n", |
1015 | 906 | "\n", |
1016 | 907 | "### 1. **Threshold Method**\n", |
1017 | | - "- Tracks the instantaneous specific growth rate μ(t)\n", |
1018 | | - "- `exp_phase_start`: First time when μ exceeds a fraction of μ_max (default: 15%)\n", |
1019 | | - "- `exp_phase_end`: First time after peak when μ drops below the threshold\n", |
| 908 | + "- Tracks the instantaneous specific growth rate \u03bc(t)\n", |
| 909 | + "- `exp_phase_start`: First time when \u03bc exceeds a fraction of \u03bc_max (default: 15%)\n", |
| 910 | + "- `exp_phase_end`: First time after peak when \u03bc drops below the threshold\n", |
1020 | 911 | "\n", |
1021 | 912 | "### 2. **Tangent Method**\n", |
1022 | 913 | "- Constructs a tangent line in log space at the point of maximum growth rate\n", |
|
0 commit comments