Add prediction interval coverage panel to zoom_25A figure #52
kaitejohnson merged 11 commits into main
This PR addresses issue #51 by enhancing the zoom_25A figure with the following changes:
- Add new Panel D with prediction interval coverage (1 row, 3 location columns)
- Compute 50% and 95% interval coverage using scoringutils
- Plot coverage trends by model and location with nominal reference lines
- New functions: compute_coverage_25A() and get_plot_coverage_by_date()
- Restore 25A model legend in main predictions panel
- Convert 25A proportions from weekly to daily observations
- Fix x-axis alignment across all panels by standardizing facet layouts
- Update figure layout from 5 to 6 rows to accommodate new coverage panel
- Fix .lintr configuration issue with return_linter

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Important: Review skipped
Auto incremental reviews are disabled on this repository. Please check the settings in the CodeRabbit UI.

Walkthrough
This pull request adds prediction interval coverage computation and visualization functionality to the 25A emergence figure. It introduces a new compute_coverage_25A function, a new coverage plotting function, updates plotting logic to use daily observations instead of weekly, and integrates these components into the targets pipeline with supporting documentation.
Pre-merge checks and finishing touches
✅ Passed checks (5 passed)
…tes for each location
Changes:
- Modified get_plot_coverage_by_date() to create bar chart instead of line plot
- Shows mean coverage across all nowcast dates for each model
- Uses dodged bars to display 50% and 95% intervals side-by-side
- Adds horizontal reference lines for nominal coverage (0.5 and 0.95)
- One row with 3 columns (one per location)

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Actionable comments posted: 6
♻️ Duplicate comments (1)
man/compute_coverage_25A.Rd (1)
9-15: Missing intervals parameter documentation.
The intervals parameter from the function signature is not documented in the arguments section. This is a duplicate of the issue flagged in R/compute_bias.R and will be resolved when that is fixed.
🧹 Nitpick comments (3)
R/compute_bias.R (1)
91-92: Update return value documentation to reflect flexible intervals.
The documentation states the function returns "coverage for 50% and 95% intervals", but the intervals parameter allows specifying different values. Consider updating the documentation to be more general. Apply this diff:
-#' @returns Data frame with coverage for 50% and 95% intervals by model,
-#' location, and nowcast_date
+#' @returns Data frame with coverage for specified intervals (default 50% and 95%)
+#' by model, location, and nowcast_date
R/fig_zoom_clade_mult_nowcasts.R (2)
19-26: Add explicit by argument to left_join to avoid ambiguity warnings.
The left_join at line 25 does not specify the join columns explicitly. This can produce warnings and may lead to unexpected behaviour if column names change.
- daily_obs <- left_join(daily_obs_data, total_seq) |>
+ daily_obs <- left_join(daily_obs_data, total_seq, by = c("date", "location")) |>
256-258: Consider using mean() instead of sum() / n().
The expression sum(interval_coverage) / n() computes the mean. Using mean(interval_coverage) directly is clearer and handles edge cases (e.g., NA values with na.rm = TRUE if needed).
- summarise(empirical_coverage = sum(interval_coverage) / n()) |>
+ summarise(empirical_coverage = mean(interval_coverage), .groups = "drop") |>
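To make the mean() nitpick concrete, here is a minimal base R sketch. The vectors are hypothetical stand-ins for the interval_coverage column (they are not data from this PR), and length() plays the role that n() plays inside summarise().

```r
# Hypothetical logical vector: TRUE when the observation fell inside the
# prediction interval. Not taken from the repository's data.
covered <- c(TRUE, FALSE, TRUE, TRUE)

# With no missing values, the two expressions agree.
manual_mean <- sum(covered) / length(covered)
direct_mean <- mean(covered)

# With a missing value, sum() returns NA, so the manual form is NA too.
covered_with_na <- c(covered, NA)
manual_with_na <- sum(covered_with_na) / length(covered_with_na)

# mean() with na.rm = TRUE drops the NA from numerator and denominator.
direct_with_na <- mean(covered_with_na, na.rm = TRUE)
```

Whether dropping missing values is the right choice for this figure is a judgment call for the authors; the sketch only shows how the two forms differ.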
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (12)
- .lintr (1 hunks)
- NAMESPACE (1 hunks)
- R/compute_bias.R (1 hunks)
- R/fig_zoom_clade_mult_nowcasts.R (9 hunks)
- R/globals.R (2 hunks)
- R/plotting_style.R (2 hunks)
- docs/supplement.qmd (2 hunks)
- man/compute_coverage_25A.Rd (1 hunks)
- man/get_fig_zoom_25A.Rd (2 hunks)
- man/get_plot_bias_by_date.Rd (1 hunks)
- man/get_plot_coverage_overall.Rd (1 hunks)
- targets/fig_zoom_25A_targets.R (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: lint-changed-files
- GitHub Check: test-coverage
🔇 Additional comments (12)
.lintr (1)
9-9: LGTM! Disabling the return_linter is a reasonable style preference for this codebase.
R/plotting_style.R (1)
117-131: LGTM! The alpha mapping for prediction intervals follows good visual design principles, with narrower intervals (50%) being more opaque than wider intervals (95%). This helps distinguish interval coverage levels effectively in the plots.
man/get_fig_zoom_25A.Rd (1)
10-20: LGTM! The documentation correctly reflects the transition from bias to coverage plotting, aligning with the PR objectives.
R/compute_bias.R (1)
99-128: LGTM with verification needed!
The function logic is well-structured and follows the same pattern as compute_bias_25A(). The filtering, conversion to forecast object, and coverage computation are implemented correctly. However, please verify the get_coverage() function usage in relation to the NAMESPACE import issue flagged separately.
docs/supplement.qmd (2)
30-30: LGTM! The caption simplification improves clarity.
40-44: LGTM! The new "Additional results" section appropriately organises the supplementary figures, including the bias plot that was moved from the main figure to the supplement as part of this PR's restructuring.
R/globals.R (1)
58-60: LGTM! The global variable declarations for compute_coverage_25A and get_plot_coverage_overall are correctly added to support the new coverage functionality. The use of "95" and "50" as column names is consistent with the pivot_wider/pivot_longer operations in the plotting function.
Also applies to: 141-147
targets/fig_zoom_25A_targets.R (2)
50-64: LGTM! The new targets for coverage computation and plotting are correctly wired into the pipeline. The coverage_25A_scores target appropriately reuses the prepared data from bias_25A_prepared, and plot_coverage_overall correctly depends on the computed coverage scores.
38-49: Verify: Unused plot_bias_by_date target retained intentionally?
The plot_bias_by_date target (lines 38-49) is still computed but no longer passed to fig_zoom_25A (which now uses coverage instead of bias). Based on the updated get_plot_bias_by_date function which saves to a "supp" directory, this appears intentional for supplementary output. Please confirm this is the desired behaviour to avoid unnecessary computation if the bias plot is no longer needed.
Also applies to: 65-73
R/fig_zoom_clade_mult_nowcasts.R (3)
30-94: LGTM! The switch to daily observations, updated aesthetics with show.legend = FALSE on ribbons, and the refined legend handling for the "25A" clade are well implemented. The coord_cartesian(ylim = c(0, 1)) correctly constrains the y-axis for proportion data.
330-373: LGTM! The updated get_fig_zoom_25A function correctly integrates the coverage plot in place of the bias plot, with appropriate layout configuration and output path handling.
338-351: No action required. The axes = "collect_x" parameter is a valid option in patchwork's plot_layout() function, which removes duplicated x-axes across columns. The code is correct as written.
  plot_name = "bias_over_time_25A",
  output_fp = file.path("output", "figs", "zoom_25A", "supp")
Add documentation for new parameters.
The plot_name and output_fp parameters are added to the function signature but lack corresponding documentation in the \arguments section (lines 16-24).
The documentation should include:
\item{plot_name}{Name of the plot file (default: "bias_over_time_25A")}
\item{output_fp}{Output file path directory (default: file.path("output", "figs", "zoom_25A", "supp"))}
🤖 Prompt for AI Agents
In man/get_plot_bias_by_date.Rd around lines 12 to 13, the new function
parameters plot_name and output_fp are present in the signature but missing from
the \arguments section; add two \item entries to the \arguments block
documenting them: one for plot_name describing it as the name of the plot file
with default "bias_over_time_25A", and one for output_fp describing it as the
output file path directory with default file.path("output", "figs", "zoom_25A",
"supp").
\arguments{
\item{locs}{Vector of character strings of locations}

\item{coverage_data}{Data.frame of coverage scores with interval_range}
}
Parameter name mismatch in documentation.
The function signature uses coverage (line 7), but the \arguments section documents coverage_data (line 12) instead. The actual coverage parameter is not documented. This will cause confusion for users and indicates the roxygen comments in the source file need correction.
The roxygen comment in R/fig_zoom_clade_mult_nowcasts.R should be updated:
-#' @param coverage_data Data.frame of coverage scores with interval_range
+#' @param coverage Data.frame of coverage scores with interval_range
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In man/get_plot_coverage_overall.Rd around lines 9–13 the documented parameter
name `coverage_data` does not match the function signature which uses
`coverage`; update the documentation to document `coverage` (e.g., replace the
`\item{coverage_data}{...}` entry with `\item{coverage}{Vector/data.frame
description...}`) and remove the incorrect `coverage_data` entry; also update
the roxygen block in R/fig_zoom_clade_mult_nowcasts.R to use the same `coverage`
parameter name and matching description/type so the generated Rd and the
function signature stay consistent.
NAMESPACE
Outdated
importFrom(scoringutils,add_relative_skill)
importFrom(scoringutils,as_forecast_quantile)
importFrom(scoringutils,bias_quantile)
importFrom(scoringutils,interval_coverage)
🧩 Analysis chain
🌐 Web query:
scoringutils interval_coverage get_coverage prediction interval coverage
💡 Result:
Short answer — what these do:
- interval_coverage(): for quantile-format forecasts, returns TRUE/FALSE per forecast indicating whether the observed value lies inside a central prediction interval of given range (e.g., the 50% interval formed by the 0.25 and 0.75 predictive quantiles). Usage: interval_coverage(observed, predicted, quantile_level, interval_range = 50). [1]
- get_coverage(): works on a validated forecast object (as produced by as_forecast_quantile()/score()/summarise_scores()). It computes grouped summary coverage statistics: interval coverage (proportion of observations inside each central prediction interval), quantile coverage (proportion of observations below each predictive quantile), and coverage deviation (observed minus nominal coverage). Usage: get_coverage(forecast, by = "model"). [2]
Notes/links for examples and plotting: scoringutils also documents plotting helpers (plot_interval_coverage/plot_quantile_coverage) and gives worked examples showing how to compute scores → summarise_scores (include "range" or "quantile" in by) → get_coverage/plot. [3]
Sources:
[1] scoringutils interval_coverage reference.
[2] scoringutils get_coverage reference.
[3] scoringutils vignette (scoring rules / plotting interval coverage).
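The quantities described in the query result can be illustrated with a minimal base R sketch. It deliberately avoids calling scoringutils, so inside_interval() is a hypothetical helper and the observed values and quantiles are made up for illustration; it is not how compute_coverage_25A() is implemented.

```r
# Hypothetical helper (not a scoringutils function): TRUE where the observed
# value lies inside the central prediction interval [lower, upper].
inside_interval <- function(observed, lower, upper) {
  observed >= lower & observed <= upper
}

# Made-up observed proportions and 0.25 / 0.75 predictive quantiles,
# which together bound the central 50% prediction interval.
observed <- c(0.10, 0.22, 0.35, 0.50)
q25 <- c(0.08, 0.15, 0.36, 0.40)
q75 <- c(0.12, 0.25, 0.45, 0.55)

# Per-forecast indicator, analogous to what interval_coverage() reports.
covered <- inside_interval(observed, q25, q75)

# Empirical interval coverage: share of observations inside the interval.
empirical_coverage <- mean(covered)

# Coverage deviation: empirical coverage minus the nominal level (0.50 here).
coverage_deviation <- empirical_coverage - 0.50
```

In the figure, the nominal levels 0.5 and 0.95 are the horizontal reference lines against which the empirical coverage bars are compared.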
🏁 Script executed:
# Search for usage of get_coverage in the codebase
rg "get_coverage" --type R
Repository: epiforecasts/evalvariantnowcasthub
Length of output: 103
🏁 Script executed:
# Search for usage of interval_coverage in the codebase
rg "interval_coverage" --type R
Repository: epiforecasts/evalvariantnowcasthub
Length of output: 103
🏁 Script executed:
# Check the NAMESPACE file for imports from scoringutils
cat -n NAMESPACE | grep -A 2 -B 2 "scoringutils"
Repository: epiforecasts/evalvariantnowcasthub
Length of output: 466
🏁 Script executed:
# Check compute_bias.R to see actual usage around line 116
fd "compute_bias.R" --type f -x cat -n {} | head -150 | tail -50
Repository: epiforecasts/evalvariantnowcasthub
Length of output: 1798
🏁 Script executed:
# Check compute_bias.R more thoroughly
fd "compute_bias.R" --type f -x cat -n {}
Repository: epiforecasts/evalvariantnowcasthub
Length of output: 5015
Correct the scoringutils function import in NAMESPACE.
The NAMESPACE file imports interval_coverage from scoringutils, but compute_coverage_25A() actually uses scoringutils::get_coverage() (line 116 in R/compute_bias.R). Additionally, the function's @importFrom directive (line 94) incorrectly lists score and interval_coverage, neither of which are used in this function.
Update NAMESPACE line 44 to import get_coverage instead of interval_coverage, and update the @importFrom directive at line 94 to reflect the actual functions used: as_forecast_quantile and get_coverage.
🤖 Prompt for AI Agents
In NAMESPACE around line 44 and the @importFrom directive around line 94, the
file incorrectly imports scoringutils::interval_coverage (line 44) and lists
score and interval_coverage in the @importFrom comment (line 94) even though
compute_coverage_25A() uses scoringutils::get_coverage() and the function needs
as_forecast_quantile; change the import on line 44 to
importFrom(scoringutils,get_coverage) and update the @importFrom directive at
line 94 to list as_forecast_quantile and get_coverage (removing score and
interval_coverage).
R/compute_bias.R
Outdated
#' Compute prediction interval coverage using scoringutils
#'
#' @param df_prepared Prepared data in long format
#' @param locs Vector of location codes to include
#' @param nowcast_dates Vector of nowcast dates to include
#'
#' @returns Data frame with coverage for 50% and 95% intervals by model,
#' location, and nowcast_date
#' @importFrom dplyr filter group_by summarise mutate
#' @importFrom scoringutils as_forecast_quantile score interval_coverage
#' @autoglobal
compute_coverage_25A <- function(df_prepared, locs, nowcast_dates,
                                 intervals = c(50, 95)) {
Add documentation for the intervals parameter.
The intervals parameter is missing from the roxygen documentation. The function signature includes intervals = c(50, 95) but there's no corresponding @param intervals tag.
Apply this diff to add the missing parameter documentation:
#' @param df_prepared Prepared data in long format
#' @param locs Vector of location codes to include
#' @param nowcast_dates Vector of nowcast dates to include
+#' @param intervals Numeric vector of interval ranges to include (default: c(50, 95))
#'
#' @returns Data frame with coverage for 50% and 95% intervals by model,
#' location, and nowcast_date
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
#' Compute prediction interval coverage using scoringutils
#'
#' @param df_prepared Prepared data in long format
#' @param locs Vector of location codes to include
#' @param nowcast_dates Vector of nowcast dates to include
#' @param intervals Numeric vector of interval ranges to include (default: c(50, 95))
#'
#' @returns Data frame with coverage for 50% and 95% intervals by model,
#' location, and nowcast_date
#' @importFrom dplyr filter group_by summarise mutate
#' @importFrom scoringutils as_forecast_quantile score interval_coverage
#' @autoglobal
compute_coverage_25A <- function(df_prepared, locs, nowcast_dates,
                                 intervals = c(50, 95)) {
🤖 Prompt for AI Agents
In R/compute_bias.R around lines 85 to 97, the roxygen block is missing
documentation for the function's intervals parameter; add a @param intervals tag
describing that it is a numeric vector of interval levels expressed as
percentages (e.g., c(50, 95)) and state the default value intervals = c(50, 95),
so users and generated help show the parameter purpose and default.
#' Get a plot of prediction interval coverage summarized across nowcast dates
#'
#' @param coverage_data Data.frame of coverage scores with interval_range
#' @param locs Vector of character strings of locations
#'
#' @returns ggplot
#' @autoglobal
Roxygen parameter name mismatch.
The @param coverage_data (line 248) does not match the actual function parameter coverage (line 253). This causes incorrect documentation generation.
-#' @param coverage_data Data.frame of coverage scores with interval_range
+#' @param coverage Data.frame of coverage scores with interval_range
🤖 Prompt for AI Agents
In R/fig_zoom_clade_mult_nowcasts.R around lines 246 to 252, the roxygen block
documents a parameter named coverage_data but the actual function parameter is
named coverage; update the roxygen @param entry to use the exact parameter name
"coverage" (and adjust its description if needed) so the documentation matches
the function signature and generates correctly.
    scale_fill_manual(
      name = "Model",
      values = plot_comps$model_colors
    ) +
    scale_alpha_manual(
      name = "Interval coverage",
      values = plot_comps$pred_int_alpha
    ) +
    guides(
      fill = guide_legend(
        title.position = "top",
        title.hjust = 0.5,
        nrow = 3
      )
    ) +
    scale_fill_manual(
      name = "Model",
      values = plot_comps$model_colors
    ) +
    xlab("Model") +
    ylab("Empirical\ncoverage") +
    scale_y_continuous(limits = c(0, 1), breaks = seq(0, 1, 0.2))
Duplicate scale_fill_manual call — second one overrides the first.
scale_fill_manual is called twice (lines 294-297 and 309-312). The second call completely overrides the first, meaning the guide_legend configuration from lines 302-308 is lost because it's associated with the fill aesthetic that gets replaced.
Remove the duplicate and keep only one scale_fill_manual with the desired configuration:
scale_alpha_manual(
name = "Interval coverage",
values = plot_comps$pred_int_alpha
) +
guides(
fill = guide_legend(
title.position = "top",
title.hjust = 0.5,
nrow = 3
)
) +
- scale_fill_manual(
- name = "Model",
- values = plot_comps$model_colors
- ) +
xlab("Model") +
🤖 Prompt for AI Agents
In R/fig_zoom_clade_mult_nowcasts.R around lines 294-315 there is a duplicate
scale_fill_manual call that causes the second to override the first (losing the
guide_legend configuration); remove the redundant scale_fill_manual (keep a
single call) and ensure that the remaining scale_fill_manual includes the
desired name and values (plot_comps$model_colors) so the fill legend and the
guide_legend settings are preserved.
…com/epiforecasts/evalvariantnowcasthub into feature/add-coverage-panel-issue-51
This PR addresses #51 by enhancing the zoom_25A figure with the following changes:

Summary
✅ 1. Added Prediction Interval Coverage Panel (New Panel D)
- scoringutils::interval_coverage()
- compute_coverage_25A() - computes interval coverage from prepared data
- get_plot_coverage_by_date() - creates the coverage visualization
✅ 2. Restored 25A Model Legend
- get_plot_model_preds_mult() to show model legend in Panel A
✅ 3. Made 25A Proportions Daily
- get_plot_model_preds_mult() to use daily observations
- daily_to_weekly() aggregation
✅ 4. Fixed X-Axis Alignment
- ncol = 3 to all faceted panels for consistent layout
✅ 5. Updated Figure Layout
Files Modified
- R/compute_bias.R - Added compute_coverage_25A() function
- R/fig_zoom_clade_mult_nowcasts.R - Added coverage plot and updated figure layout
- targets/fig_zoom_25A_targets.R - Added coverage computation targets
- R/globals.R - Added global variables for new functions
- NAMESPACE - Added interval_coverage import from scoringutils
- .lintr - Fixed configuration issue with return_linter
Testing
- devtools::load_all()
- targets::tar_manifest()
Closes #51
🤖 Generated with Claude Code
Summary by CodeRabbit
Release Notes
New Features
Documentation