Figure functions #108

ciaramccarthy1 · 2026-01-12T15:00:42Z

PR to create figures showing:

- Visual comparison of fit and forecast for a subset of locations and forecast dates
- Visual comparison of forecast performance across forecast dates for a subset of locations for horizons 1-7

Following on from #107, still need to:

- Create plotting style for the figures
- Write function to create model names for figures
- Do some EDA of the scores

Summary by CodeRabbit

New Features
- Multilocation comparison plots to visualise forecasts and observations across multiple hospital sites and wastewater locations, with options for multi-date and single-date outputs and automated layout/saving.
- A custom LSHTM plot theme for consistent branding across figures.
Documentation
- User documentation added for the new plotting functionality and the custom plot theme.
Chores
- Updated package dependencies to support the new visualisation features.

coderabbitai · 2026-01-12T15:00:58Z

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

The PR introduces multilocation comparison plotting capabilities: adds two exported functions (lshtm_theme, plot_multilocation_comparison), new Imports (patchwork, purrr), global variable declarations for plotting, a new targets pipeline for multilocation plots, and accompanying documentation and renv updates.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Package metadata `DESCRIPTION` | Added `patchwork` and `purrr` to Imports (alongside existing `future.callr`). |
| Namespace & exports `NAMESPACE` | Exported `lshtm_theme` and `plot_multilocation_comparison`; added many `importFrom` entries (dplyr, ggh4x, ggplot2, patchwork, grid, gridText, gridExtra, etc.). |
| Plotting functions and theme `R/plot_figures.R` | Added `lshtm_theme()` and `plot_multilocation_comparison()` (large new plotting routine assembling hospital and wastewater forecasts, uncertainty ribbons, per-location panels, and patchwork-based layouts). |
| Global variables `R/globals.R` | Added globals used by plotting: `quantile_level`, `predicted`, `date_parsed`, `q_0.5`, `q_0.25`, `q_0.75`, `q_0.05`, `q_0.95`, `forecast_date_model_ww`, `model_ww`, `observed`, `forecast_date_site`, `log_genome_copies_per_ml`. |
| Targets pipeline `targets/multilocation_plot_targets.R`, `_targets.R` | New `multilocation_plot_targets` and integration into `_targets.R`: targets to select forecast dates, discover locations, ingest hospital data, and produce/save multilocation plots (including single-date and location-specific variants). |
| Documentation `man/lshtm_theme.Rd`, `man/plot_multilocation_comparison.Rd` | Added Rd documentation for `lshtm_theme` and `plot_multilocation_comparison` describing usage, arguments and return values. |
| Environment tooling `renv/activate.R` | Bumped renv version (1.0.11 → 1.1.5) and refactored internal JSON-reading, bootstrap and platform-prefix helpers; adjusted token handling and signatures. |

Possibly related PRs

Issue 610: add arima #71 — Modifies R/globals.R to add overlapping forecast quantile and plotting variable names (e.g., q_0.5, q_0.25, q_0.75, q_0.05, q_0.95).
Issue 66: Write a fit_wwinference_wrapper function #83 — Strong overlap: adds multilocation/model-comparison plotting functions and the same global variable names used by the new plotting code.
Issue 106: analysis pipeline #107 — Adds analysis/plotting functionality and targets; touches targets/ and R/globals.R, indicating related pipeline and plotting changes.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 inconclusive)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Title check | ❓ Inconclusive | The title 'Figure functions' is vague and generic, failing to communicate the specific purpose or main change of the pull request despite the addition of multiple new plotting functions and comprehensive infrastructure. | Revise the title to be more descriptive and specific, such as 'Add multilocation comparison plotting functions and infrastructure' or 'Add lshtm_theme and plot_multilocation_comparison functions'. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |

coderabbitai

Actionable comments posted: 2

🤖 Fix all issues with AI agents

In @R/plot_figures.R:
- Around line 91-106: When locations is NULL the code uses
sample(available_locations, ...) without a seed which makes selection
non-deterministic; change this to a deterministic selection by sorting
available_locations and taking the first min(3, length(available_locations))
entries (replace the sample(...) call), or alternatively set a fixed seed
immediately before sample() to ensure reproducible picks; update the block that
assigns locations (referencing variables available_locations, locations,
sample(), forecast_dates, and first_forecast_path) accordingly.
- Line 85: The function accepts a hosp_data_long parameter that is never used;
either remove it from the function signature and the roxygen docs (remove the
documented lines 57-58) or wire it into the plotting flow by replacing the
in-function CSV-derived observed hospital extraction (the block that
reads/derives observed hospital data around the forecast CSV handling, currently
lines ~200-218) to use hosp_data_long instead; locate the parameter in
plot_figures.R and update the corresponding roxygen @param entry and any
examples, or adjust the code path that constructs observed hospital data so it
uses the provided hosp_data_long when non-NULL (fall back to CSV only if NULL).

🧹 Nitpick comments (5)

man/plot_multilocation_comparison.Rd (1)

24-26: Consider reproducibility when selecting random locations.

The documentation indicates that when locations = NULL, three random locations are selected. For reproducible outputs, consider either:

- Documenting that users should set a seed before calling the function
- Adding a seed parameter to the function for reproducibility

This is a minor concern for documentation/analysis purposes but worth noting.
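
A minimal sketch of the second option, assuming a hypothetical helper `pick_locations()` with a `seed` argument (neither exists in the PR). With no seed it falls back to the first locations in sorted order; with a seed it samples and then restores the caller's RNG state.

```r
# Hypothetical helper (not part of the PR): choose up to `n` locations
# in a reproducible way.
pick_locations <- function(available_locations, n = 3, seed = NULL) {
  if (length(available_locations) == 0) {
    stop("No locations available to choose from")
  }
  size <- min(n, length(available_locations))

  if (is.null(seed)) {
    # Deterministic default: first `size` locations in sorted order
    return(sort(available_locations)[seq_len(size)])
  }

  # Seeded random choice; put the caller's RNG state back on exit
  if (exists(".Random.seed", envir = globalenv(), inherits = FALSE)) {
    old_seed <- get(".Random.seed", envir = globalenv())
    on.exit(assign(".Random.seed", old_seed, envir = globalenv()), add = TRUE)
  } else {
    on.exit(rm(list = ".Random.seed", envir = globalenv()), add = TRUE)
  }
  set.seed(seed)
  # Sampling indices avoids sample()'s special case for a single number
  available_locations[sample.int(length(available_locations), size)]
}
```

Calling `pick_locations(available_locations, seed = 123)` twice returns the same locations, which is the property the targets pipeline needs.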

R/globals.R (1)

113-125: Duplicate global variable declarations exist in this file.

Several of the newly added variables are already declared earlier in the file (e.g., q_0.5, q_0.25, q_0.75, q_0.05, q_0.95, quantile_level, predicted, forecast_date_model_ww, model_ww, log_genome_copies_per_ml). While this doesn't cause functional issues, it adds maintenance overhead. Consider whether roxyglobals can be configured to deduplicate these entries, or consolidate them manually if this file is edited directly.
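
If the file is consolidated by hand, a small check like the following could list the repeated names first. This is only a sketch: `find_duplicate_globals()` is a hypothetical helper, and the example vector is illustrative rather than the real contents of `R/globals.R`.

```r
# Hypothetical helper: report names that are declared more than once.
find_duplicate_globals <- function(global_names) {
  unique(global_names[duplicated(global_names)])
}

# Illustrative input only; the real list lives in R/globals.R
declared <- c("q_0.5", "predicted", "q_0.5", "observed", "predicted")
find_duplicate_globals(declared)
```
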
R/plot_figures.R (1)
81-89: Consider adding input validation for forecast_dates.

If forecast_dates is empty or NULL, the function will fail at line 95 (forecast_dates[1]) with an unclear error. Early validation would provide a clearer error message.
Suggested validation
 plot_multilocation_comparison <- function(
     output_path,
     forecast_dates,
     locations = NULL,
     hosp_data_long,
     forecast_horizon_to_plot = 28,
     historical_data_to_plot = 90,
     scale_selected = "natural",
     save_path = NULL) {
+  if (is.null(forecast_dates) || length(forecast_dates) == 0) {
+    stop("forecast_dates must be a non-empty character vector")
+  }
+
   # Get locations from first forecast date if not specified
targets/multilocation_plot_targets.R (2)
86-86: Inconsistent namespace usage: read_csv should be readr::read_csv.

For consistency with the explicit namespace usage of dplyr:: functions on lines 88-90, and to avoid potential conflicts, use readr::read_csv.
Suggested fix
-          temp_data <- read_csv(file_path, show_col_types = FALSE)
+          temp_data <- readr::read_csv(file_path, show_col_types = FALSE)
119-136: Hardcoded locations may not exist in all datasets.

The locations_of_interest target hardcodes German cities ("Bremen", "Hamburg", "Berlin"). If running this pipeline on data from a different region or if these locations are missing, the multilocation_specific_plots target may fail or produce empty plots.

Consider intersecting with available_locations_multiloc to ensure only valid locations are used:
Suggested modification
   tar_target(
     name = locations_of_interest,
-    command = c("Bremen", "Hamburg", "Berlin")
+    command = {
+      desired <- c("Bremen", "Hamburg", "Berlin")
+      intersect(desired, available_locations_multiloc)
+    }
   ),

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 05a986c and 1ae7278.

⛔ Files ignored due to path filters (1)

renv.lock is excluded by !**/*.lock

📒 Files selected for processing (9)

DESCRIPTION
NAMESPACE
R/globals.R
R/plot_figures.R
_targets.R
man/lshtm_theme.Rd
man/plot_multilocation_comparison.Rd
renv/activate.R
targets/multilocation_plot_targets.R

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)

GitHub Check: lint-changed-files
GitHub Check: ubuntu-latest (release)
GitHub Check: windows-latest (release)
GitHub Check: macOS-latest (release)
GitHub Check: test-coverage
GitHub Check: render

🔇 Additional comments (10)

DESCRIPTION (1)

47-49: LGTM!

The new dependencies patchwork and purrr are appropriate additions for the multilocation plotting functionality. patchwork is commonly used for composing multiple ggplot objects, and purrr provides functional programming utilities that are already being used in the targets pipeline.

man/lshtm_theme.Rd (1)

1-19: LGTM!

The documentation is clear and appropriately describes the LSHTM custom theme function. The colour code and styling purpose are well documented.

renv/activate.R (1)

1-6: Auto-generated renv bootstrap file updated to version 1.1.5.

This file is auto-generated by renv and should not be manually edited. The update from the previous version includes internal improvements to platform prefix handling, JSON parsing, and token handling. These changes are standard upstream updates from the renv package.

man/plot_multilocation_comparison.Rd (1)

1-52: LGTM!

The documentation is comprehensive and clearly describes the function's purpose, parameters, and return value. The exclusion of ARIMA models and the reference to the original get_plot_model_comparison function from EDA_plots.R provides helpful context.

_targets.R (1)

97-105: Code integration looks good.

The multilocation_plot_targets is correctly added to the plot_targets list with proper syntax, integrating the new multilocation comparison plotting pipeline into the targets workflow.

NAMESPACE (1)

5-6: LGTM!

The new exports (lshtm_theme, plot_multilocation_comparison) and imports align correctly with the roxygen2 documentation in R/plot_figures.R. The additions for patchwork, ggh4x, and additional ggplot2 functions support the new multilocation plotting functionality.

Also applies to: 17-17, 36-36, 38-41, 53-53, 63-66

R/plot_figures.R (3)

13-41: LGTM!

The theme function is well-structured with clear LSHTM branding. The theme settings provide consistent styling for borders, backgrounds, grids, and text elements.

294-455: Well-structured multi-panel plot construction.

The approach of creating per-location plots and combining them with patchwork handles the variable number of wastewater sites per location effectively. The spacer logic ensures consistent alignment across rows.

507-527: LGTM!

The save logic with dynamic width calculation based on facet count is a good approach for handling varying plot sizes.

targets/multilocation_plot_targets.R (1)

15-42: The scores target is correctly defined in targets/scoring_targets.R (lines 55–61) and will be properly resolved as a dependency by the targets framework.

coderabbitai · 2026-01-12T15:04:56Z

R/plot_figures.R

+    output_path,
+    forecast_dates,
+    locations = NULL,
+    hosp_data_long,


⚠️ Potential issue | 🟠 Major

The hosp_data_long parameter is never used.

This parameter is documented and accepted but never referenced in the function body. The observed hospital data is instead extracted from the forecast CSV files (lines 200-218). Either remove this parameter or integrate it into the plotting logic.

If the parameter should be removed

 plot_multilocation_comparison <- function(
     output_path,
     forecast_dates,
     locations = NULL,
-    hosp_data_long,
     forecast_horizon_to_plot = 28,
     historical_data_to_plot = 90,
     scale_selected = "natural",
     save_path = NULL) {

Also update the roxygen documentation to remove lines 57-58.
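
For the other option (integrating the parameter), a rough sketch of the fallback logic is shown here. `get_observed_hosp()` and its `csv_path` argument are hypothetical, and `utils::read.csv()` stands in for whatever reader the function actually uses, so that the example is self-contained.

```r
# Hypothetical helper: prefer the supplied hospital data and only fall
# back to the forecast CSV when nothing was passed in.
get_observed_hosp <- function(hosp_data_long = NULL, csv_path = NULL) {
  if (!is.null(hosp_data_long)) {
    return(hosp_data_long)
  }
  if (is.null(csv_path) || !file.exists(csv_path)) {
    stop("No hosp_data_long supplied and no readable CSV at: ", csv_path)
  }
  # Assumption: base R reader used here purely for illustration
  utils::read.csv(csv_path, stringsAsFactors = FALSE)
}
```

The supplied data would still need the same columns and filtering as the CSV-derived observations before it can replace them in the plot.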

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

-    hosp_data_long,
+plot_multilocation_comparison <- function(
+    output_path,
+    forecast_dates,
+    locations = NULL,
+    forecast_horizon_to_plot = 28,
+    historical_data_to_plot = 90,
+    scale_selected = "natural",
+    save_path = NULL) {

coderabbitai · 2026-01-12T15:04:56Z

R/plot_figures.R

+  if (is.null(locations)) {
+    first_forecast_path <- file.path(
+      output_path,
+      "individual_forecasts_all_runs",
+      forecast_dates[1]
+    )
+    available_locations <- list.dirs(
+      first_forecast_path,
+      full.names = FALSE,
+      recursive = FALSE
+    )
+    locations <- sample(
+      available_locations,
+      size = min(3, length(available_locations))
+    )
+  }


⚠️ Potential issue | 🟡 Minor

Random location selection may affect reproducibility.

When locations = NULL, sample() is called without a seed, producing different results each run. This could make debugging difficult and cause inconsistent outputs in the targets pipeline.

Consider either:

- Setting a seed before sampling, or
- Using a deterministic selection (e.g., first 3 alphabetically).

Suggested fix for deterministic selection

   if (is.null(locations)) {
     first_forecast_path <- file.path(
       output_path,
       "individual_forecasts_all_runs",
       forecast_dates[1]
     )
     available_locations <- list.dirs(
       first_forecast_path,
       full.names = FALSE,
       recursive = FALSE
     )
-    locations <- sample(
-      available_locations,
-      size = min(3, length(available_locations))
-    )
+    locations <- sort(available_locations)[seq_len(min(3, length(available_locations)))]
   }

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

-  if (is.null(locations)) {
-    first_forecast_path <- file.path(
-      output_path,
-      "individual_forecasts_all_runs",
-      forecast_dates[1]
-    )
-    available_locations <- list.dirs(
-      first_forecast_path,
-      full.names = FALSE,
-      recursive = FALSE
-    )
-    locations <- sample(
-      available_locations,
-      size = min(3, length(available_locations))
-    )
-  }
+  if (is.null(locations)) {
+    first_forecast_path <- file.path(
+      output_path,
+      "individual_forecasts_all_runs",
+      forecast_dates[1]
+    )
+    available_locations <- list.dirs(
+      first_forecast_path,
+      full.names = FALSE,
+      recursive = FALSE
+    )
+    locations <- sort(available_locations)[seq_len(min(3, length(available_locations)))]
+  }

ciaramccarthy1 · 2026-01-13T15:29:23Z

@coderabbitai review

coderabbitai · 2026-01-13T15:29:31Z

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

coderabbitai

Actionable comments posted: 3

🤖 Fix all issues with AI agents

In @R/plot_figures.R:
- Around line 96-104: The code assumes forecast_dates is non-empty when
sampling; add an early validation check in the block that handles
!show_multiple_dates to detect empty forecast_dates and return or throw a clear
error (or set a default) before any sample() calls. Specifically, in the
function or scope using the forecast_dates and show_multiple_dates variables,
check length(forecast_dates) == 0 and handle it (e.g., stop("forecast_dates is
empty") or set forecast_dates to a safe default) so that the subsequent uses of
sample(forecast_dates, size = 1) (and the dates_2024 logic) cannot attempt to
sample from an empty vector.
- Around line 85-94: The function plot_multilocation_comparison declares an
unused parameter hosp_data_long but instead reads hospital CSVs inside the
function; either remove the parameter and its @param roxygen entry and update
callers (e.g., multilocation_plot_targets.R / remove computing
hosp_data_all_locations) or replace the internal CSV-read block with usage of
the hosp_data_long argument (match the expected columns/format and any
filtering/grouping currently done by the CSV-read code) so the precomputed
target is actually consumed; update roxygen and any callers accordingly.

In @targets/multilocation_plot_targets.R:
- Around line 15-42: The tar target selected_forecast_dates_multiloc currently
uses available_locations_multiloc[1] without declaring a dependency and also
grows valid_dates with repeated c() calls; fix by referencing
available_locations_multiloc at the top of the command block (e.g., assign
test_loc <- available_locations_multiloc[1] immediately) so targets sees the
dependency, and replace the O(n^2) growth pattern by collecting valid paths with
a vectorized or preallocated approach (use vapply/lapply + Filter or preallocate
a character vector of appropriate length then subset) when checking file.exists
on file.path(...) built from scores$forecast_date and output_path_multiloc to
build valid_dates, then return the subsampled sequence as before.

🧹 Nitpick comments (2)

R/plot_figures.R (1)
106-122: Consider adding a check for empty available_locations.

If no location directories exist at the forecast path, available_locations will be empty. While sample(x, size = 0) returns an empty vector without error, this would likely cause downstream failures. A clearer error message here would aid debugging.
Suggested improvement
     available_locations <- list.dirs(
       first_forecast_path,
       full.names = FALSE,
       recursive = FALSE
     )
+    if (length(available_locations) == 0) {
+      stop("No location directories found at: ", first_forecast_path)
+    }
     locations <- sample(
       available_locations,
       size = min(3, length(available_locations))
     )
targets/multilocation_plot_targets.R (1)
103-117: Random location selection affects reproducibility.

Using locations = NULL causes random selection of 3 locations, which means the plot output will vary between pipeline runs. If reproducibility is important, consider setting a seed or specifying explicit locations.
# Option: Set seed for reproducibility within the target
command = {
  set.seed(42)
  plot_multilocation_comparison(...)
}

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1ae7278 and b5cd810.

⛔ Files ignored due to path filters (1)

renv.lock is excluded by !**/*.lock

📒 Files selected for processing (4)

NAMESPACE
R/plot_figures.R
man/plot_multilocation_comparison.Rd
targets/multilocation_plot_targets.R

🚧 Files skipped from review as they are similar to previous changes (2)

man/plot_multilocation_comparison.Rd
NAMESPACE

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)

GitHub Check: test-coverage
GitHub Check: lint-changed-files
GitHub Check: ubuntu-latest (release)
GitHub Check: macOS-latest (release)
GitHub Check: windows-latest (release)

🔇 Additional comments (13)

R/plot_figures.R (8)

1-41: LGTM - Clean theme implementation.

The lshtm_theme() function is well-documented and correctly implements a custom ggplot2 theme with LSHTM branding. The styling choices are consistent and the function follows ggplot2 conventions.

124-171: Data loading logic is sound.

The nested loop structure handles missing files gracefully with file.exists() checks. The approach of building lists and combining with bind_rows() later is appropriate.

Consider wrapping read_csv() calls in tryCatch() if malformed CSV files are a possibility in production.
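
One way this could look, as a sketch only: `safe_read_csv()` is a hypothetical wrapper, and it uses `utils::read.csv()` instead of `read_csv()` so that it runs without extra packages. Missing or unreadable files yield `NULL`, which can be dropped before the results are combined.

```r
# Hypothetical wrapper: return NULL (with a warning) instead of aborting
# the whole plot when a single CSV is missing or malformed.
safe_read_csv <- function(file_path) {
  if (!file.exists(file_path)) {
    return(NULL)
  }
  tryCatch(
    utils::read.csv(file_path, stringsAsFactors = FALSE),
    error = function(e) {
      warning(
        "Skipping unreadable file ", file_path, ": ", conditionMessage(e),
        call. = FALSE
      )
      NULL
    }
  )
}
```

The trade-off is that a silently skipped file can leave a location without data, so the warning should stay visible in pipeline logs.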

173-211: LGTM - Future observations loading is well-handled.

The logic to find a forecast date approximately 8 weeks later and load wastewater observations gracefully handles the case where no such date exists.
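
The date lookup described above can be sketched roughly as follows. `find_later_forecast_date()` and its `tolerance_days` argument are assumptions for illustration; the PR's actual matching rule is not shown in this review.

```r
# Hypothetical helper: find the available forecast date closest to
# `weeks_ahead` weeks after `forecast_date`, or NULL if none is near enough.
find_later_forecast_date <- function(forecast_date,
                                     available_dates,
                                     weeks_ahead = 8,
                                     tolerance_days = 7) {
  candidates <- as.Date(available_dates)
  if (length(candidates) == 0) {
    return(NULL)
  }
  target <- as.Date(forecast_date) + 7 * weeks_ahead
  # Absolute distance in days between each candidate and the target
  gaps <- abs(as.numeric(candidates - target))
  best <- which.min(gaps)
  if (gaps[best] > tolerance_days) {
    return(NULL)
  }
  format(candidates[best])
}
```

Returning `NULL` lets the caller skip the future observations instead of failing when no later run exists.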

213-298: Data processing and filtering logic is correct.

The approach of:

- Combining forecasts via bind_rows()
- Validating data presence with an informative error
- Filtering to forecast period and specific model
- De-duplicating observations

follows good data wrangling practices. The obs_timing categorisation (historical vs future) provides useful visual distinction in plots.

300-398: Wastewater data processing is thorough.

The approach correctly:

- Filters to forecast period
- Maps later observations to existing site names via merge
- Handles de-duplication to avoid double-counting observations

The obs_timing distinction between historical and future wastewater observations maintains consistency with hospital data handling.

400-599: Plot composition with patchwork is well-designed.

The approach of creating separate plots per location and combining them with patchwork::wrap_plots() effectively avoids empty facet cells that would occur with a standard facet_grid(). The use of plot_spacer() to align columns across locations with differing wastewater site counts is a good solution.

The nested geom_ribbon() calls with alpha = 0.3 for both the 25-75% and 5-95% credible intervals create appropriate visual layering.

600-649: Hospital-only fallback plot is consistent.

The fallback when no wastewater data is available correctly mirrors the styling of the combined plot, maintaining visual consistency.

651-674: Save logic with dynamic sizing is practical.

The dynamic width calculation based on facet count, capped at 40 inches, is a sensible approach for multi-panel plots. The use of dir.create(..., recursive = TRUE, showWarnings = FALSE) appropriately handles existing directories.
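
As an illustration of that sizing rule, here is a sketch with a hypothetical `compute_plot_width()`. Only the 40 inch cap comes from the review; the per-facet width and the minimum width are assumed values.

```r
# Hypothetical helper: scale figure width with the number of facets,
# clamped between a minimum and the 40 inch cap.
compute_plot_width <- function(n_facets,
                               width_per_facet = 4, # assumed value
                               min_width = 8,       # assumed value
                               max_width = 40) {
  max(min_width, min(max_width, n_facets * width_per_facet))
}

compute_plot_width(3)
compute_plot_width(20)
```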

targets/multilocation_plot_targets.R (5)

1-11: Path target definitions are clear.

The output path targets are straightforward and well-named.

43-47: LGTM.

Simple and clear extraction of unique locations from scores.

48-102: This target's output is unused due to upstream bug.

As noted in R/plot_figures.R, the hosp_data_long parameter of plot_multilocation_comparison() is never used. This target computes hosp_data_all_locations which is passed to that parameter but the data is discarded.

Once the upstream issue is resolved (either by using the parameter or removing it), this target may need adjustment accordingly.

118-136: Specific locations target is well-defined.

The locations_of_interest target provides explicit control over which locations to plot, and multilocation_specific_plots correctly uses this for reproducible output.

137-169: Single-date plot variants are appropriately configured.

The show_multiple_dates = FALSE parameter is correctly used to generate single-date variants. The save paths include clear subdirectory names (single_date) for organisation.

coderabbitai · 2026-01-13T15:33:40Z

R/plot_figures.R

+plot_multilocation_comparison <- function(
+    output_path,
+    forecast_dates,
+    locations = NULL,
+    hosp_data_long,
+    forecast_horizon_to_plot = 28,
+    historical_data_to_plot = 90,
+    scale_selected = "natural",
+    save_path = NULL,
+    show_multiple_dates = TRUE) {


⚠️ Potential issue | 🟠 Major

Unused parameter hosp_data_long - dead code.

The hosp_data_long parameter is declared and documented but never used within the function body. The function reads hospital data directly from CSV files (lines 136-160) instead of using this parameter.

This causes the hosp_data_all_locations target in multilocation_plot_targets.R to compute data that is never utilised.

Either:

- Remove the parameter if it's not needed, or
- Use the parameter instead of re-reading from files

Option 1: Remove unused parameter

 plot_multilocation_comparison <- function(
     output_path,
     forecast_dates,
     locations = NULL,
-    hosp_data_long,
     forecast_horizon_to_plot = 28,
     historical_data_to_plot = 90,
     scale_selected = "natural",
     save_path = NULL,
     show_multiple_dates = TRUE) {

Also update the roxygen documentation to remove @param hosp_data_long.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

-plot_multilocation_comparison <- function(
-    output_path,
-    forecast_dates,
-    locations = NULL,
-    hosp_data_long,
-    forecast_horizon_to_plot = 28,
-    historical_data_to_plot = 90,
-    scale_selected = "natural",
-    save_path = NULL,
-    show_multiple_dates = TRUE) {
+plot_multilocation_comparison <- function(
+    output_path,
+    forecast_dates,
+    locations = NULL,
+    forecast_horizon_to_plot = 28,
+    historical_data_to_plot = 90,
+    scale_selected = "natural",
+    save_path = NULL,
+    show_multiple_dates = TRUE) {

coderabbitai · 2026-01-13T15:33:40Z

R/plot_figures.R

+  if (!show_multiple_dates) {
+    # Filter to 2024 dates if available
+    dates_2024 <- forecast_dates[grepl("^2024-", forecast_dates)]
+    if (length(dates_2024) > 0) {
+      forecast_dates <- sample(dates_2024, size = 1)
+    } else {
+      forecast_dates <- sample(forecast_dates, size = 1)
+    }
+  }


⚠️ Potential issue | 🟡 Minor

Add guard for empty forecast_dates input.

If forecast_dates is passed as an empty vector, sample(forecast_dates, size = 1) at line 102 will throw an error. Consider adding an early validation check.

Suggested guard

 plot_multilocation_comparison <- function(
     output_path,
     forecast_dates,
     ...
     show_multiple_dates = TRUE) {
+  # Validate inputs
+  if (length(forecast_dates) == 0) {
+    stop("forecast_dates must not be empty")
+  }
+
   # If show_multiple_dates is FALSE, randomly select one forecast date
   if (!show_multiple_dates) {

coderabbitai · 2026-01-13T15:33:40Z

targets/multilocation_plot_targets.R

+  tar_target(
+    name = selected_forecast_dates_multiloc,
+    command = {
+      dates <- unique(scores$forecast_date)
+      dates <- sort(dates)
+
+      # Filter to only dates with complete data
+      valid_dates <- c()
+      test_loc <- available_locations_multiloc[1]
+      for (test_date in dates) {
+        date_str <- as.character(as.Date(test_date, origin = "1970-01-01"))
+        test_path <- file.path(
+          output_path_multiloc,
+          "individual_forecasts_all_runs",
+          date_str,
+          test_loc,
+          "data",
+          "hosp_quantiles_ww_TRUE.csv"
+        )
+        if (file.exists(test_path)) {
+          valid_dates <- c(valid_dates, date_str)
+        }
+      }
+
+      # Return every other valid date for clarity
+      valid_dates[seq(1, length(valid_dates), by = 2)]
+    }
+  ),


⚠️ Potential issue | 🟠 Major

Target ordering dependency may cause issues.

This target references available_locations_multiloc[1] at line 23, but available_locations_multiloc is defined later in the list (lines 44-47). In targets, the order of target definitions in a list doesn't guarantee execution order — that's determined by the dependency graph. Since this target doesn't explicitly depend on available_locations_multiloc, the reference may fail.

Additionally, the vector-growing pattern valid_dates <- c(valid_dates, date_str) is O(n²) due to reallocation. Consider using vapply or pre-allocating.

Fix dependency and improve efficiency

   tar_target(
     name = selected_forecast_dates_multiloc,
     command = {
+      test_loc <- available_locations_multiloc[1]
       dates <- unique(scores$forecast_date)
       dates <- sort(dates)

       # Filter to only dates with complete data
-      valid_dates <- c()
-      test_loc <- available_locations_multiloc[1]
-      for (test_date in dates) {
-        date_str <- as.character(as.Date(test_date, origin = "1970-01-01"))
-        test_path <- file.path(
-          output_path_multiloc,
-          "individual_forecasts_all_runs",
-          date_str,
-          test_loc,
-          "data",
-          "hosp_quantiles_ww_TRUE.csv"
-        )
-        if (file.exists(test_path)) {
-          valid_dates <- c(valid_dates, date_str)
-        }
-      }
+      date_strs <- as.character(as.Date(dates, origin = "1970-01-01"))
+      valid_dates <- date_strs[vapply(date_strs, function(date_str) {
+        test_path <- file.path(
+          output_path_multiloc,
+          "individual_forecasts_all_runs",
+          date_str,
+          test_loc,
+          "data",
+          "hosp_quantiles_ww_TRUE.csv"
+        )
+        file.exists(test_path)
+      }, logical(1))]

       # Return every other valid date for clarity
       valid_dates[seq(1, length(valid_dates), by = 2)]
     }
   ),

This ensures available_locations_multiloc is referenced at the start of the command block, establishing the dependency.

🤖 Prompt for AI Agents

In @targets/multilocation_plot_targets.R around lines 15 - 42, The tar target selected_forecast_dates_multiloc currently uses available_locations_multiloc[1] without declaring a dependency and also grows valid_dates with repeated c() calls; fix by referencing available_locations_multiloc at the top of the command block (e.g., assign test_loc <- available_locations_multiloc[1] immediately) so targets sees the dependency, and replace the O(n^2) growth pattern by collecting valid paths with a vectorized or preallocated approach (use vapply/lapply + Filter or preallocate a character vector of appropriate length then subset) when checking file.exists on file.path(...) built from scores$forecast_date and output_path_multiloc to build valid_dates, then return the subsampled sequence as before.
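
As a further option, `file.path()` and `file.exists()` are both vectorised in base R, so the per-date check could probably be written without a loop or `vapply` at all. The sketch below is an illustration only: `find_valid_dates()` and `every_other()` are hypothetical helper names that do not exist in this PR, and the directory layout is assumed to match the paths used in the target above. It also sidesteps one edge case: `seq(1, length(valid_dates), by = 2)` raises an error when `valid_dates` is empty, whereas indexing with `seq_along()` returns an empty vector.

```r
# Hypothetical helper (not part of the PR): return the forecast dates, as
# "YYYY-MM-DD" strings, for which the quantile file exists for one location.
find_valid_dates <- function(dates, output_path, test_loc) {
  # `origin` only matters if `dates` arrives as plain numbers
  date_strs <- as.character(as.Date(dates, origin = "1970-01-01"))
  # file.path() recycles the scalar components across all dates
  paths <- file.path(
    output_path,
    "individual_forecasts_all_runs",
    date_strs,
    test_loc,
    "data",
    "hosp_quantiles_ww_TRUE.csv"
  )
  # file.exists() returns one logical per path
  date_strs[file.exists(paths)]
}

# Hypothetical helper: keep elements 1, 3, 5, ... and return an empty
# vector (rather than an error) when the input is empty.
every_other <- function(x) {
  x[seq_along(x) %% 2 == 1]
}
```

Inside the target, the command body would then reduce to something like `every_other(find_valid_dates(sort(unique(scores$forecast_date)), output_path_multiloc, available_locations_multiloc[1]))`, assuming those objects have the shapes used in the original code.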

ciaramccarthy1 added 2 commits January 12, 2026 11:22

add LSHTM theme

077432b

plot figure 3

1ae7278

coderabbitai bot reviewed Jan 12, 2026

ciaramccarthy1 added 2 commits January 13, 2026 11:36

add one forecast date option

d06f070

coderabbit suggestions

b5cd810

coderabbitai bot reviewed Jan 13, 2026

ciaramccarthy1 added 12 commits January 13, 2026 15:38

avoid CI R version issue

4e54cae

coderabbit suggestions

e849605

fix linting

603812a

update wordlist

90e5a85

breaking down large plot function

ea7435b

fixing line lengths

422dbd6

figure 4 first draft

e3fb560

add helper functions

02f7287

updating documentation

83763d5

update spelling

e5009b4

add return

2112491

fix linting

33360f8
