Conversation

@kaitejohnson kaitejohnson commented Dec 29, 2025

This PR addresses #106. It does the following:

  • renames the original _targets.R file to _targets_model_run.R
  • creates a new _targets.R file which relies on the locally saved outputs in output/ to generate figures and analyse forecast performance across the locations and forecast dates for which the model has already been run (a sketch of the resulting pipeline layout follows this list)
  • outlines the analysis targets file _targets.R
  • initially moved the wastewater metadata table creation to this targets file, since there may be additional characteristics/variables we want to extract from the data for the forecast performance analysis; but after realizing this is slow, moved it back so it loads in the _targets.R pipeline. @ciaramccarthy1 Let me know what you think of this. Alternatively, we could make it its own separate pipeline, so that we can edit it to add things if needed but don't need to rerun it if we ever wipe the analysis targets clean.
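
For orientation, a hypothetical sketch of what the new analysis-only _targets.R looks like at a high level (the four target-group names come from the walkthrough below; everything else is illustrative, not the file's exact contents):

library(targets)

# load package functions from R/
tar_source("R")

# source the modular target definitions under targets/
purrr::walk(
  list.files("targets", pattern = "\\.R$", full.names = TRUE),
  source
)

# assemble the pipeline from the sourced target groups
list(
  analysis_config,   # file paths and post-processed data from output/
  get_metadata,      # wastewater metadata table
  secondary_outputs, # score processing / GAM preparation
  plot_targets       # EDA figures
)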

Still to do (could be in a separate PR)

  • create plotting style for the figures
  • write function to create model names for figures
  • do some EDA of the scores

Suggestions for testing:

Once the output data is in the correct folder in output/, run tar_make() and ensure that tar_load(plot_scores_by_date) and tar_load(scatterplot_scores) generate figures.
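
In an interactive R session, the suggested check looks like this (target names as given above):

library(targets)

# build the analysis pipeline from the locally saved outputs
tar_make()

# load the figure targets and confirm they render
tar_load(plot_scores_by_date)
tar_load(scatterplot_scores)
print(plot_scores_by_date)
print(scatterplot_scores)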

Summary by CodeRabbit

Release Notes

  • New Features

    • Added scatterplot visualisation for comparing wastewater plus hospital data against hospital-only data across forecast dates and locations.
    • Introduced restructured analysis pipeline with new configuration-driven workflow stages for improved organisation.
  • Documentation

    • Updated parameter naming in score conversion documentation for clarity.
  • Refactor

    • Reorganised pipeline structure to streamline analysis phases and enhance modularity.
    • Updated score processing and aggregation logic.


coderabbitai bot commented Dec 29, 2025

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

The pull request introduces a restructured analysis pipeline that separates model execution from analysis workflows. It adds new exploratory data analysis visualisation functions, reorganises targets configuration into a config-driven structure, renames parameters in score conversion utilities, and updates global variable declarations to support the new analysis pipeline.

Changes

  • EDA visualisation functions (R/EDA_plots.R, man/get_scatterplot_scores.Rd): Adds a new get_scatterplot_scores() function for scatterplot visualisation and renames an intermediate variable in the existing get_plot_scores_by_date(). A new documentation file is created.
  • Score data handling (R/convert_to_su_object.R, man/convert_to_su_object.Rd): Renames the scores_data parameter to scores_raw, reorders the metrics list (placing wis first), and updates the documentation accordingly.
  • Global variable declarations (R/globals.R): Adds seven new global variable annotations: model, include_ww, wis, forecast_date, location, hosp_only, ww_plus_hosp.
  • Pipeline restructuring (_targets.R, _targets_model_run.R): Replaces the low-level pipeline structure with a config-driven analysis workflow. Introduces analysis_config, get_metadata, secondary_outputs, and plot_targets as public entities; removes set_up, load_data, fit_models, and scoring. Separates model execution into a dedicated _targets_model_run.R.
  • New score processing and analysis targets (targets/analysis_config_targets.R, targets/get_metadata_targets.R, targets/run_gam_targets.R, targets/analysis_EDA_plot_targets.R, R/prep_scores_to_model.R): Introduces modular analysis targets: configuration setup with ww_data_post and scores_fp; metadata computation with updated source references (ww_data_post instead of ww_data); a GAM preparation pipeline (read CSV, convert to SU object, prepare for modelling); and EDA plot generation. Adds a placeholder prep_scores_to_model() function.
  • Exploratory scripts (scratch/explore_results.R): New exploratory script for analysing scores across models, including WIS aggregation and visualisation by various groupings.

Possibly related issues

  • Issue #106: Implements the separation between model-run and analysis pipelines through introduction of dedicated analysis-oriented targets files and a separate _targets_model_run.R orchestration script.

Possibly related PRs

  • PR #95: Depends on the wastewater metadata targets and variables (ww_metadata_table, ww_metadata_table_combined) that are refined and integrated in this PR.
  • PR #83: Both PRs modify R/EDA_plots.R to add plotting functions and update R/globals.R with overlapping global variable declarations for forecast-related fields.
🚥 Pre-merge checks: 2 passed, 1 inconclusive

❌ Inconclusive (1)
  • Title check ❓: The title "Issue 106: analysis pipeline" is vague and generic; it does not convey the main changes without reading the PR description. Resolution: consider a more descriptive title, such as "Add analysis pipeline for forecast performance evaluation" or "Restructure pipeline to separate model runs from analysis".

✅ Passed (2)
  • Docstring Coverage: No functions found in the changed files to evaluate docstring coverage; check skipped.
  • Description Check: Check skipped because CodeRabbit's high-level summary is enabled.



@kaitejohnson kaitejohnson changed the title from "Issue 105: analysis pipeline" to "Issue 106: analysis pipeline" on Dec 29, 2025
@kaitejohnson kaitejohnson marked this pull request as ready for review January 6, 2026 13:48
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 7

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
_targets.R (1)

39-42: Invalid regex pattern in grep (same issue as in _targets_model_run.R).

The pattern *\\.R is incorrect regex syntax. Consider extracting this shared logic into a helper function to avoid duplication and ensure consistency.

🔎 Proposed fix
 # load target modules
 targets <- list.files(here("targets"), full.names = TRUE)
-targets <- grep("*\\.R", targets, value = TRUE)
+targets <- grep("\\.R$", targets, value = TRUE)
 purrr::walk(targets, source)
🤖 Fix all issues with AI Agents
In @_targets_model_run.R:
- Around line 34-37: The grep call uses an invalid regex "*\\.R" which can fail;
update the selection to a correct regex (e.g., use grep("\\.R$", targets, value
= TRUE)) or better yet use list.files(..., pattern = "\\.R$", full.names = TRUE)
to only return .R files; adjust the existing targets variable usage (the grep
call that assigns to targets) so purrr::walk(targets, source) receives the
proper .R file paths.
- Around line 31-32: n_workers can become zero when future::availableCores() <
4; update the n_workers assignment so it never drops below 1 (compute the
original floor(as.integer(availableCores()/4)) and then enforce a minimum of 1)
and keep using plan(multisession, workers = n_workers) so the multisession plan
always has at least one worker.

In @R/EDA_plots.R:
- Around line 290-322: The roxygen import is wrong and the pivot keys are
inconsistent: update the @importFrom line to include the ggplot2 geoms actually
used (geom_point and geom_line) instead of geom_bar, and fix the pivot by
ensuring the id columns in pivot_wider match the grouping from summarise_scores
— either include "hosp_data_real_time" in id_cols or remove it from the
summarise_scores by= vector; locate get_scatterplot_scores, the summarise_scores
call and the pivot_wider call and make the id_cols =
c("forecast_date","location","hosp_data_real_time") (or drop hosp_data_real_time
from by=) and change the roxygen to @importFrom ggplot2 geom_point geom_line so
the imports and pivot behavior align.

In @R/prep_scores_to_model.R:
- Around line 1-9: prep_scores_to_model currently returns NULL and breaks
downstream targets; implement it to pivot scores_long from long to wide (e.g.,
each sample/feature combination becomes a row with score columns) and then join
ww_metadata by the appropriate key (sample id or site id used in scores_long)
before returning a cleaned data.frame/tibble; update the function
prep_scores_to_model to accept scores_long and ww_metadata, use pivot_wider (or
spread) on the score name column and values column, perform a left_join with
ww_metadata using the matching id column present in both, ensure column names
are consistent with the caller scores_to_model in targets/run_gam_targets.R, and
return the resulting joined table instead of NULL.

In @scratch/explore_results.R:
- Line 5: The summarise call computes wis with mean(wis) which will return NA if
any wis values are missing; update the aggregation in the summarise step (the
summarise function that defines wis) to call mean(wis, na.rm = TRUE) so missing
values are ignored and downstream plotting won't fail.
- Line 15: The division uses a hardcoded index scores_overall$wis[2], which is
brittle; update the calculation of rwis to use a named baseline lookup instead
(e.g., find the row in scores_overall whose model or baseline column equals the
intended baseline identifier and use its wis value) so that rwis <-
scores_overall$wis / <baseline_wis> uses a stable reference (referencing
scores_overall and the column name used for model/baseline to locate the
baseline row instead of index 2).

In @targets/run_gam_targets.R:
- Around line 10-16: The tar target scores_to_model is broken because it calls
prep_scores_to_model which is currently a stub returning NULL and it references
a misspelled dependency ww_metadata_table_combines; change the target call to
use the correct ww_metadata_table_combined and implement prep_scores_to_model
(in R/prep_scores_to_model.R) to accept scores_long and ww_metadata, validate
inputs, join/transform the data into a non-NULL tidy data.frame or list suitable
for downstream modeling (e.g., merged scores with metadata and any necessary
cleaning), and return that object; ensure the function name prep_scores_to_model
and the target scores_to_model remain unchanged so the pipeline wiring stays
consistent.
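
For concreteness, minimal sketches of two of the prompted fixes (names follow the prompts; the baseline label is hypothetical):

# _targets_model_run.R: enforce a minimum of one worker even when
# future::availableCores() < 4
n_workers <- max(1L, as.integer(future::availableCores()) %/% 4L)
future::plan(future::multisession, workers = n_workers)

# scratch/explore_results.R: look up the baseline by name, not by position
baseline_wis <- scores_overall$wis[scores_overall$model_ww == "baseline-FALSE"]
rwis <- scores_overall$wis / baseline_wis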
🧹 Nitpick comments (4)
scratch/explore_results.R (2)

10-13: Remove position = "stack" for single-value bars.

The position = "stack" parameter is redundant when each bar represents a single value (one wis per model_ww). Use position = "dodge" if multiple aesthetics map to the same x-value, or remove the parameter entirely.

🔎 Proposed fix
 ggplot(scores_overall) +
   geom_bar(aes(x = model_ww, y = wis, fill = model_ww),
-    stat = "identity", position = "stack"
+    stat = "identity"
   )

3-30: Consider extracting the repeated aggregation pattern into a helper function.

The same aggregation pattern (group by model/include_ww/additional dimensions, summarise wis, add model_ww label) is repeated three times. Extracting this into a helper function would reduce duplication and improve maintainability.

🔎 Example refactor
aggregate_scores <- function(scores, ...) {
  scores |>
    group_by(model, include_ww, ...) |>
    summarise(wis = mean(wis, na.rm = TRUE), .groups = "drop") |>
    mutate(model_ww = glue::glue("{model}-{include_ww}"))
}

scores_overall <- aggregate_scores(scores)
scores_by_date <- aggregate_scores(scores, forecast_date)
scores_by_loc <- aggregate_scores(scores, location)
targets/analysis_config_targets.R (1)

13-16: Consider validating that the scores file exists.

The scores_fp target defines a hardcoded path to the scores CSV file. According to the PR description, this pipeline relies on locally saved outputs in output/. If the file does not exist, downstream targets that depend on scores_fp may fail. Consider adding validation to check file existence or documenting the prerequisite that model runs must be completed first.

🔎 Proposed validation approach
 tar_target(
   name = scores_fp,
-  command = file.path("output", "overall_data_all_runs", "scores.csv")
+  command = {
+    fp <- file.path("output", "overall_data_all_runs", "scores.csv")
+    if (!file.exists(fp)) {
+      stop("Scores file not found. Please run the model pipeline first.")
+    }
+    fp
+  }
 )
_targets.R (1)

82-97: Placeholder comments for incomplete features.

The comments at lines 84, 86, and 92-97 contain placeholders (empty parentheses, question marks, and TODOs for future figures). This aligns with the PR's stated remaining tasks. Consider converting these to proper TODO comments (e.g., # TODO: compute coverage metrics) for easier tracking.

📜 Review details


📥 Commits

Reviewing files that changed from the base of the PR and between 672a9b1 and 8a90410.

📒 Files selected for processing (13)
  • R/EDA_plots.R
  • R/convert_to_su_object.R
  • R/globals.R
  • R/prep_scores_to_model.R
  • _targets.R
  • _targets_model_run.R
  • man/convert_to_su_object.Rd
  • man/get_scatterplot_scores.Rd
  • scratch/explore_results.R
  • targets/analysis_EDA_plot_targets.R
  • targets/analysis_config_targets.R
  • targets/get_metadata_targets.R
  • targets/run_gam_targets.R
🔇 Additional comments (16)
man/get_scatterplot_scores.Rd (1)

1-17: LGTM!

The documentation structure is standard and correctly describes the function interface. As this is a Roxygen2-generated file, ensure the source documentation in R/EDA_plots.R is kept up to date.

targets/run_gam_targets.R (1)

1-9: LGTM!

The pipeline structure correctly chains reading raw scores and converting them to a scoring utilities object. The use of tar_target with explicit names and commands follows targets best practices. The scores_fp variable is properly defined in analysis_config_targets.R as a tar_target that generates the file path to scores.csv.

scratch/explore_results.R (1)

1-1: No action required—the file path in explore_results.R is correct.

The path output/overall_data/scores.csv matches what the pipeline's scoring step actually produces (as defined in targets/scoring_targets.R line 48), so the script will read from the correct location. Although the analysis configuration in targets/analysis_config_targets.R references output/overall_data_all_runs/scores.csv, that represents a separate expectation; explore_results.R is an exploratory script correctly pointing to the scoring pipeline's actual output location.

Likely an incorrect or invalid review comment.

targets/analysis_EDA_plot_targets.R (1)

1-10: LGTM!

The EDA plotting targets are well-structured with clear dependencies on the scores target from the pipeline. Both plotting functions (get_plot_scores_by_date and get_scatterplot_scores) are properly defined in R/EDA_plots.R, and the target is correctly configured in the secondary outputs. The naming convention is consistent and the pipeline flow is logical.

R/EDA_plots.R (1)

269-275: Excellent refactoring of variable name.

The variable rename from scores_by_loc to scores_by_date correctly reflects that the data is aggregated by forecast_date rather than by location, improving code clarity.

man/convert_to_su_object.Rd (1)

1-17: LGTM!

The documentation correctly reflects the parameter rename from scores_data to scores_raw, maintaining consistency with the function implementation.

targets/get_metadata_targets.R (2)

8-17: LGTM!

The updates to use ww_data_post instead of ww_data correctly align with the new analysis pipeline structure where ww_data_post is defined in analysis_config_targets.R.


22-23: The use of distinct() is correct as-is.

The duplicate rows are expected and legitimate: multiple scenarios with different include_ww, model, and hosp_data_real_time values map to the same (location, forecast_date) pair, resulting in identical metadata calculations. Since calculate_ww_metadata_table() returns one row per (forecast_date, location_abbr, location_name) combination and all columns derive from this grouping, using distinct() without specifying key columns is appropriate—any duplicates will be identical across all columns.

targets/analysis_config_targets.R (1)

2-12: No action required.

The scenarios object is properly defined in targets/create_permutations_targets.R (lines 122–137) as a tar_group_by() target that creates a grouped dataset with all required fields: location_name, location_abbr, and forecast_date. The pipeline execution order in _targets.R ensures that create_permutations_targets is evaluated before analysis_config_targets, so the ww_data_post target's dependency on scenarios is correctly satisfied.
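
As a point of reference, a tar_group_by() target of that shape might look like the following sketch (the values are hypothetical; the real definition lives in targets/create_permutations_targets.R):

library(targets)
library(tarchetypes)

# grouped scenarios target: downstream targets can branch over the groups
# with pattern = map(scenarios)
tar_group_by(
  scenarios,
  data.frame(
    location_name = c("California", "California", "New York", "New York"),
    location_abbr = c("CA", "CA", "NY", "NY"),
    forecast_date = as.Date(c("2025-06-23", "2025-06-30",
                              "2025-06-23", "2025-06-30"))
  ),
  location_abbr,
  forecast_date
)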

R/convert_to_su_object.R (1)

3-21: Parameter and variable naming improvements look good.

The parameter rename from scores_data to scores_raw improves clarity by indicating raw input data. The approach of setting the "metrics" attribute on the scores object is appropriate; the metrics attribute order does not affect functionality in the scoringutils package.
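
As a rough illustration of that convention (not the package's actual constructor; the path and metric names are assumptions based on this PR), tagging a raw scores table might look like:

# read the raw scores and attach the metrics attribute scoringutils expects
scores_fp <- file.path("output", "overall_data_all_runs", "scores.csv")
scores_raw <- read.csv(scores_fp)
attr(scores_raw, "metrics") <- c("wis", "overprediction", "underprediction",
                                 "dispersion", "bias")
class(scores_raw) <- c("scores", class(scores_raw))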

R/globals.R (1)

34-40: LGTM!

The new global variable annotations follow the established pattern in this file and correctly document the variables used by get_scatterplot_scores. The duplicates with other sections are acceptable as they serve per-function documentation purposes.

_targets_model_run.R (2)

39-60: LGTM!

The tar_option_set configuration is well-structured with sensible defaults for worker-based storage, transient memory, and parquet format.


62-100: LGTM!

The pipeline structure is well-organised with clear separation between setup, data loading, model fitting, and scoring stages. The modular approach using sourced target definitions promotes maintainability.

_targets.R (3)

1-10: LGTM!

The header documentation clearly explains the pipeline's purpose and its expected input structure from the output/ folder. This is helpful for users running the analysis pipeline.


67-75: LGTM!

The analysis_config section clearly combines permutation and configuration targets with helpful inline comments explaining each component's purpose.


99-104: LGTM!

The final pipeline structure is well-organised and clearly separates configuration, metadata extraction, secondary computations, and plotting into distinct logical groups.

Comment on lines +1 to +9
prep_scores_to_model <- function(scores_long,
ww_metadata) {
# Pivot scores from long to wide

# Join wastewater metadata

# Placeholder
return(NULL)
}
@coderabbitai coderabbitai bot Jan 6, 2026

⚠️ Potential issue | 🔴 Critical

Implement the placeholder function before merging.

This function is called by scores_to_model in targets/run_gam_targets.R but currently returns NULL, which will break the downstream pipeline. The PR description mentions remaining tasks, but this function should be implemented or the calling target should be removed before merging.

Would you like me to generate a basic implementation template for pivoting scores and joining metadata, or should this target be temporarily removed from the pipeline?

🤖 Prompt for AI Agents
In @R/prep_scores_to_model.R around lines 1 - 9, prep_scores_to_model currently
returns NULL and breaks downstream targets; implement it to pivot scores_long
from long to wide (e.g., each sample/feature combination becomes a row with
score columns) and then join ww_metadata by the appropriate key (sample id or
site id used in scores_long) before returning a cleaned data.frame/tibble;
update the function prep_scores_to_model to accept scores_long and ww_metadata,
use pivot_wider (or spread) on the score name column and values column, perform
a left_join with ww_metadata using the matching id column present in both,
ensure column names are consistent with the caller scores_to_model in
targets/run_gam_targets.R, and return the resulting joined table instead of
NULL.
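
Under the prompt's assumptions (a metric/score pair of columns in the long data, and location plus forecast_date as the shared join keys; both are hypothetical until the real data shapes are pinned down), a minimal sketch might be:

library(dplyr)
library(tidyr)

prep_scores_to_model <- function(scores_long, ww_metadata) {
  scores_long |>
    # pivot scores from long to wide: one column per metric
    pivot_wider(names_from = metric, values_from = score) |>
    # join the wastewater metadata on the shared keys
    left_join(ww_metadata, by = c("location", "forecast_date"))
}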

@kaitejohnson (Collaborator, Author)

I am going to leave this for a subsequent PR.

coderabbitai bot

@kaitejohnson, understood! Would you like me to open a follow-up issue to track the implementation of prep_scores_to_model?


scores_overall <- scores |>
group_by(model, include_ww) |>
summarise(wis = mean(wis)) |>
@coderabbitai coderabbitai bot Jan 6, 2026

🛠️ Refactor suggestion | 🟠 Major

Add na.rm = TRUE to the mean() call.

If the wis column contains missing values, the aggregation will return NA for all groups, causing downstream plotting to fail.

🔎 Proposed fix
-  summarise(wis = mean(wis)) |>
+  summarise(wis = mean(wis, na.rm = TRUE)) |>
🤖 Prompt for AI Agents
In @scratch/explore_results.R at line 5, The summarise call computes wis with
mean(wis) which will return NA if any wis values are missing; update the
aggregation in the summarise step (the summarise function that defines wis) to
call mean(wis, na.rm = TRUE) so missing values are ignored and downstream
plotting won't fail.

@kaitejohnson (Collaborator, Author)

Also ignoring stuff in scratch/.

coderabbitai bot

@kaitejohnson, understood! Since scratch/ contains exploratory code, it makes sense to skip detailed reviews there.

If you'd like to prevent automated reviews on the scratch/ directory in future PRs, you can configure CodeRabbit to exclude it by adding a path_filters configuration in your .coderabbit.yaml file. For example:

reviews:
  path_filters:
    - "!scratch/**"

This will exclude all files under scratch/ from automated reviews.

Comment on lines 10 to 16
tar_target(
name = scores_to_model,
command = prep_scores_to_model(
scores_long = scores,
ww_metadata = ww_metadata_table_combines
)
)
@coderabbitai coderabbitai bot Jan 6, 2026

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

# Check for prep_scores_to_model implementation and its return value
find . -name "prep_scores_to_model.R" -o -name "*prep_scores*" | head -20

Repository: epiforecasts/evalwwforecasts

Length of output: 98


🏁 Script executed:

# Verify ww_metadata_table_combines is defined as a target
rg -n "ww_metadata_table_combines" targets/

Repository: epiforecasts/evalwwforecasts

Length of output: 147


🏁 Script executed:

# Look for the function definition in R directory
rg -n "prep_scores_to_model" R/ --max-count 5

Repository: epiforecasts/evalwwforecasts

Length of output: 144


🏁 Script executed:

# Read the prep_scores_to_model function to check return value
cat -n R/prep_scores_to_model.R

Repository: epiforecasts/evalwwforecasts

Length of output: 327


🏁 Script executed:

# Search more broadly for ww_metadata_table_combines definition
rg -n "ww_metadata_table_combines\s*=" targets/ -A 5

Repository: epiforecasts/evalwwforecasts

Length of output: 54


🏁 Script executed:

# Check if it's defined in other files
rg -n "ww_metadata_table_combines\s*=" . --include="*.R" -A 3

Repository: epiforecasts/evalwwforecasts

Length of output: 513


🏁 Script executed:

# Search entire repo for ww_metadata_table_combines with corrected syntax
rg "ww_metadata_table_combines" . -A 2

Repository: epiforecasts/evalwwforecasts

Length of output: 212


🏁 Script executed:

# Search for similar variable names that might be the intended target
rg "ww_metadata" targets/ -n

Repository: epiforecasts/evalwwforecasts

Length of output: 543


🏁 Script executed:

# List all tar_target definitions to see what targets are actually defined
rg "tar_target\(" targets/ -A 2 | head -60

Repository: epiforecasts/evalwwforecasts

Length of output: 2691


🏁 Script executed:

# Final check: search entire repo for any other references to ww_metadata_table_combines
rg "ww_metadata_table_combines" . --count

Repository: epiforecasts/evalwwforecasts

Length of output: 101


🏁 Script executed:

# Verify the exact name of the defined target
rg "name = ww_metadata_table_combined" targets/ -A 3

Repository: epiforecasts/evalwwforecasts

Length of output: 319


Fix target name typo and implement prep_scores_to_model function.

Two critical issues prevent this target from functioning:

  1. prep_scores_to_model returns NULL (R/prep_scores_to_model.R) — the function is a placeholder with no actual implementation.
  2. The target parameter references ww_metadata_table_combines, but the actual defined target is ww_metadata_table_combined — this typo will cause a missing dependency error at runtime.

Update the target to use the correct parameter name and implement the function to return usable output.

🤖 Prompt for AI Agents
In @targets/run_gam_targets.R around lines 10 - 16, The tar target
scores_to_model is broken because it calls prep_scores_to_model which is
currently a stub returning NULL and it references a misspelled dependency
ww_metadata_table_combines; change the target call to use the correct
ww_metadata_table_combined and implement prep_scores_to_model (in
R/prep_scores_to_model.R) to accept scores_long and ww_metadata, validate
inputs, join/transform the data into a non-NULL tidy data.frame or list suitable
for downstream modeling (e.g., merged scores with metadata and any necessary
cleaning), and return that object; ensure the function name prep_scores_to_model
and the target scores_to_model remain unchanged so the pipeline wiring stays
consistent.

@kaitejohnson (Collaborator, Author)

This is a placeholder function to be filled in!

coderabbitai bot

@kaitejohnson, understood! Thanks for clarifying that the placeholder function is intentional.

Just a heads up: there's still the typo issue on Line 14 — ww_metadata = ww_metadata_table_combines should be ww_metadata = ww_metadata_table_combined (missing the d at the end). That typo will cause a dependency error when targets tries to run this pipeline, so you'll want to fix that before executing the workflow.

@ciaramccarthy1 ciaramccarthy1 left a comment

This worked for me after adding the brackets around TRUE and FALSE in the get_scatterplot_scores function. I noticed that the forecast dates in create_permutations_targets don't go up to 2025-07-07 (as the comment on lines 26-27 says they should) - not sure if this is intentional? Also, what are you thinking re CodeRabbit's suggestions?

R/EDA_plots.R Outdated
Comment on lines 313 to 314
ww_plus_hosp = TRUE,
hosp_only = FALSE
@ciaramccarthy1 (Contributor)

I added brackets here to avoid this error: Error in rename(): ! Can't rename columns with TRUE. ✖ TRUE must be numeric or character, not TRUE.
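
For reference, quoting the logical-valued column names (with backticks) is the usual fix here, as in this sketch (scores_wide is hypothetical; it stands in for the pivoted data inside get_scatterplot_scores):

library(dplyr)

# after pivoting on include_ww (a logical), the new columns are literally
# named "TRUE" and "FALSE"; backticks make rename() treat them as column
# names rather than logical constants
scores_wide |>
  rename(ww_plus_hosp = `TRUE`, hosp_only = `FALSE`)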

kaitejohnson commented Jan 8, 2026

This worked for me after adding the brackets around TRUE and FALSE in the get_scatterplot_scores function. I noticed that the forecast dates in create_permutations_targets don't go up to 2025-07-07 (as the comment on lines 26-27 says they should) - not sure if this is intentional? Also, what are you thinking re CodeRabbit's suggestions?

Ooh ok, I can't remember why, but I think this should only go to June 30th, 2025 -- that's all that I have outputs for!

Have gone through the CodeRabbit suggestions, fixed the rename, and fixed some bugs in the file paths!

@ciaramccarthy1 (Contributor)

Re the metadata table @kaitejohnson, the analysis pipeline is running much more quickly now, so I think it's okay to leave it as is (versus having it in a separate pipeline).

@kaitejohnson kaitejohnson merged commit 05a986c into main Jan 9, 2026
6 of 7 checks passed
@kaitejohnson kaitejohnson deleted the 105-analysis-pipeline branch January 9, 2026 13:12