Issue 14: Use percent ED visits due to COVID data so we have data for whole timerange #61

kaitejohnson · 2026-01-05T12:06:36Z

Per Dylan's suggestion, this swaps out the NHSN signal for the percent of ED visits due to COVID from NSSP data. I obtained the data by downloading it here

Note:

I added the data to the repo so it is fully reproducible, as I couldn't figure out how to get an API link to the csv file from data.cdc.gov

Summary by CodeRabbit

Release Notes

New Features
- Redesigned hospital visualisation to display the percentage of emergency department visits due to COVID-19 over time, replacing the previous hospital admissions metric.
- Updated chart format from bar to line graph for improved temporal trend visualisation.
Documentation
- Updated function documentation to reflect revised data source parameters and visualisation outputs.


coderabbitai · 2026-01-05T12:06:47Z

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

Updates hospital admissions plotting function to use local CSV file instead of URL, changes metric from hospital admissions to percent of ED visits due to COVID, modifies associated global variables, updates coverage plotting function documentation, and adjusts target data subset from US minus California to full US.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Core plotting function update `R/fig_data.R` | Added `data_fp` parameter with default path to local CSV file; updated data loading to read from CSV and join with `location_data`; changed plot metric from `value` (hospital admissions) to `percent_visits_covid`; changed plot type from bar to line; updated axis label and added location title to plot |
| Global variables `R/globals.R` | Added `week_end` and `percent_visits_covid` to `get_plot_hosp_admissions` group; removed `value` from same group; reassigned coverage-related variables (`model_id`, `location`, `interval_range`, `interval_coverage`, `empirical_coverage`, `interval_label`) from `get_plot_coverage_overall` to `get_plot_coverage_by_loc` group with updated interval thresholds |
| Hospital admissions documentation `man/get_plot_hosp_admissions.Rd` | Updated title and description to reflect "Percent of hospital admissions due to COVID over time"; replaced `url` parameter with `data_fp` parameter; added data source provenance information |
| Coverage plotting documentation `man/get_plot_coverage_overall.Rd` | Removed `locs` parameter from function signature; updated title and description to include "and locations"; updated file reference to `R/fig_overall_scores.R` |
| New coverage function documentation `man/get_plot_coverage_by_loc.Rd` | New roxygen2 documentation file for `get_plot_coverage_by_loc` function with parameter and description details |
| Target configuration `targets/fig_data_targets.R` | Changed `plot_hosp_admissions_usminca` target to use `"US"` location instead of `"US minus CA"` |
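
The data-loading and plotting changes summarised in this walkthrough can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in `R/fig_data.R`: the helper names `prep_ed_visits()` and `plot_ed_visits()` and the toy rows are hypothetical, and the join key simply mirrors the one shown in the diff under review.

```r
library(dplyr)
library(ggplot2)
library(tibble)

# Toy stand-ins for the NSSP CSV and the location table (values invented).
nssp_toy <- tibble(
  geography = c("Alabama", "Alabama", "Alaska"),
  week_end = as.Date(c("2024-10-05", "2024-10-12", "2024-10-05")),
  percent_visits_covid = c(1.1, 0.9, 0.7)
)
location_toy <- tibble(
  abbreviation = c("AL", "AK"),
  location_name = c("Alabama", "Alaska")
)

# Hypothetical helper: join to the location table, then keep one location
# within the requested date range.
prep_ed_visits <- function(nssp, location_data, location_to_plot, date_range) {
  nssp |>
    left_join(location_data, by = c("geography" = "location_name")) |>
    rename(date = week_end) |>
    filter(
      abbreviation == !!location_to_plot,
      date >= min(date_range),
      date <= max(date_range)
    )
}

# Hypothetical helper: line plot of the percentage, titled with the location.
plot_ed_visits <- function(plot_data, location_to_plot) {
  ggplot(plot_data, aes(x = date, y = percent_visits_covid)) +
    geom_line() +
    xlab("") +
    ylab("Percent of ED visits\ndue to COVID") +
    ggtitle(location_to_plot)
}

al_data <- prep_ed_visits(
  nssp_toy, location_toy, "AL",
  as.Date(c("2024-10-01", "2024-10-31"))
)
al_plot <- plot_ed_visits(al_data, "AL")
```

Splitting the data preparation from the plotting is a choice made for this sketch only; it makes the filtered rows easy to inspect before anything is drawn.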

Possibly related PRs

Changes to first data figure #30: Modifies get_plot_hosp_admissions function in R/fig_data.R affecting data handling and plot composition.
Issue 27: Nowcasts with data figure #28: Modifies R/globals.R updating global variable assignments affecting the same function groups.

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main change: replacing hospital admissions data with percent ED visits due to COVID data to extend the timerange. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |


coderabbitai

Actionable comments posted: 3

Fix all issues with AI Agents 🤖

In @man/get_plot_hosp_admissions.Rd:
- Line 5: The Rd documentation incorrectly uses "hospital admissions" instead of
"ED visits"; update the \title and the description field (the text around line
38) in get_plot_hosp_admissions.Rd to say "ED visits" or "emergency department
(ED) visits" consistently, and adjust any phrasing that implies hospital
admissions (e.g., change "Percent of hospital admissions due to covid over time"
to "Percent of ED visits due to COVID-19 over time") so the wording matches the
data source and implementation.

In @R/fig_data.R:
- Line 201: Update the roxygen title string that currently reads "Percent of
hospital admissions due to covid over time" to use correct terminology "Percent
of ED visits due to COVID-19 over time" (match capitalization if desired) and
ensure any other metadata/title variables in the same block are consistent with
the y-axis label and NSSP source; specifically, edit the roxygen comment line
starting with "#' Percent of hospital admissions due to covid over time" and
verify there are no other occurrences of "hospital admissions" in this help/plot
block.
- Around line 226-236: The left_join between nssp_cdcgov and location_data is
using the wrong key; change the join in the pipeline that produces raw_data (the
left_join call involving nssp_cdcgov and location_data) to match geography to
the FluSight `location` column instead of `location_name` so rows with FIPS
codes align correctly (i.e., use by = c("geography" = "location") in the
left_join definition).

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

targets/fig_data_targets.R (1)
77-85: Critical: Inconsistent location selection for US minus CA figure.

The target plot_hosp_admissions_usminca has been changed to plot location_to_plot = "US", but:

- The target name suffix _usminca implies "US minus CA"
- It's grouped with other US minus CA targets (lines 50-97)
- It's used in first_data_fig_alt (line 91) as plot_hosp1, which should represent US minus CA data to maintain consistency with plot_freq1 and plot_seq1

Unlike sequence data (which can be aggregated excluding CA via obs_data_us_min_ca), the NSSP data is loaded by state-level location_to_plot. This means the hospital admissions plot will show full US data whilst the other two panels show US minus CA data, creating an inconsistent comparison.
Possible solutions

Option 1: If NSSP data includes state-level data, filter to exclude CA:
# Would require changes to get_plot_hosp_admissions to accept excluded states
# or pre-filter the data in the target
Option 2: If the intention is to show full US for all panels, rename the target and update the figure assembly logic to reflect that this is no longer a US minus CA comparison.

Option 3: If NSSP data doesn't support state exclusion, document this limitation clearly in comments and consider whether this comparison figure should still be generated.

🧹 Nitpick comments (1)

R/fig_data.R (1)
246-246: Remove commented-out code.

The commented line # scale_y_continuous(transform = "log10")+ should be removed rather than left in the codebase.
🔎 Proposed fix
     ylab("Percent of ED visits\ndue to COVID") +
-    # scale_y_continuous(transform = "log10")+
     scale_x_date(

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ac058fa and 706ec14.

⛔ Files ignored due to path filters (1)

input/nssp_states_and_national.csv is excluded by !**/*.csv

📒 Files selected for processing (6)

R/fig_data.R
R/globals.R
man/get_plot_coverage_by_loc.Rd
man/get_plot_coverage_overall.Rd
man/get_plot_hosp_admissions.Rd
targets/fig_data_targets.R

🔇 Additional comments (5)

man/get_plot_coverage_by_loc.Rd (1)

1-19: LGTM!

The documentation for get_plot_coverage_by_loc is well-formed and complete.

R/globals.R (2)

72-74: LGTM!

The new global variables week_end and percent_visits_covid correctly correspond to the NSSP data columns used in the updated get_plot_hosp_admissions function.

109-116: LGTM!

The reorganisation of coverage-related global variables between get_plot_coverage_overall and get_plot_coverage_by_loc appears correct and aligns with the updated plotting function scopes.

Also applies to: 149-156

man/get_plot_coverage_overall.Rd (2)

2-2: Documentation source updated appropriately.

The function documentation now references R/fig_overall_scores.R, which appears to be a more semantically appropriate location for this overall coverage plotting function.

5-6: Documentation updated consistently for parameter removal.

The documentation correctly reflects the removal of the locs parameter from the function signature. The title and description now clarify that the plot summarises across both nowcast dates and locations, which aligns well with the "overall" nature of the function.

Verification confirms that all call sites throughout the codebase have been updated to use the new single-parameter signature:

targets/fig_overall_targets.R:61 – get_plot_coverage_overall(coverage_summarised_ep)

targets/fig_overall_targets.R:136 – get_plot_coverage_overall(coverage_summarised_ca)

The breaking change has been properly propagated.

coderabbitai · 2026-01-05T13:08:06Z

man/get_plot_hosp_admissions.Rd

 \name{get_plot_hosp_admissions}
 \alias{get_plot_hosp_admissions}
-\title{Hospital admissions over time}
+\title{Percent of hospital admissions due to covid over time}


⚠️ Potential issue | 🟡 Minor

Correct terminology: ED visits, not hospital admissions.

Both the title (line 5) and description (line 38) incorrectly state "hospital admissions" when the data source and implementation use "ED visits" (emergency department visits).

🔎 Proposed fix

-\title{Percent of hospital admissions due to covid over time}
+\title{Percent of ED visits due to COVID over time}

-Percent of hospital admissions due to covid over time
+Percent of ED visits due to COVID over time

Also applies to: 38-38

🤖 Prompt for AI Agents

In @man/get_plot_hosp_admissions.Rd around line 5, The Rd documentation incorrectly uses "hospital admissions" instead of "ED visits"; update the \title and the description field (the text around line 38) in get_plot_hosp_admissions.Rd to say "ED visits" or "emergency department (ED) visits" consistently, and adjust any phrasing that implies hospital admissions (e.g., change "Percent of hospital admissions due to covid over time" to "Percent of ED visits due to COVID-19 over time") so the wording matches the data source and implementation.

coderabbitai · 2026-01-05T13:08:06Z

R/fig_data.R

 }

-#' Hospital admissions over time
+#' Percent of hospital admissions due to covid over time


⚠️ Potential issue | 🟡 Minor

Correct terminology: ED visits, not hospital admissions.

The title states "Percent of hospital admissions due to covid" but the data source (NSSP) and y-axis label (line 245) correctly reference "ED visits" (emergency department visits). Hospital admissions and ED visits are distinct metrics.

🔎 Proposed fix

-#' Percent of hospital admissions due to covid over time
+#' Percent of ED visits due to COVID over time

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

-#' Percent of hospital admissions due to covid over time
+#' Percent of ED visits due to COVID over time

🤖 Prompt for AI Agents

In @R/fig_data.R around line 201, Update the roxygen title string that currently reads "Percent of hospital admissions due to covid over time" to use correct terminology "Percent of ED visits due to COVID-19 over time" (match capitalization if desired) and ensure any other metadata/title variables in the same block are consistent with the y-axis label and NSSP source; specifically, edit the roxygen comment line starting with "#' Percent of hospital admissions due to covid over time" and verify there are no other occurrences of "hospital admissions" in this help/plot block.

coderabbitai · 2026-01-05T13:08:06Z

R/fig_data.R

+  nssp_cdcgov <- read_csv(data_fp)
+  raw_data <- nssp_cdcgov |>
+    left_join(location_data,
+      by = c("geography" = "location_name") # nolint
+    ) |>
+    rename(date = week_end) |>
+    filter(
+      abbreviation == !!location_to_plot,
+      date >= min(date_range),
+      date <= max(date_range)
+    )


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify the NSSP CSV schema and check for a sample of join key values
if [ -f "input/nssp_states_and_national.csv" ]; then
  echo "=== CSV column names ==="
  head -1 input/nssp_states_and_national.csv
  echo ""
  echo "=== Sample geography values (first 10 unique) ==="
  tail -n +2 input/nssp_states_and_national.csv | cut -d',' -f1 | sort -u | head -10
else
  echo "NSSP CSV file not found at input/nssp_states_and_national.csv"
  echo ""
  echo "Searching for NSSP-related CSV files in the repository..."
  find . -name "*nssp*.csv" -o -name "*states*.csv" | head -20
fi

Repository: epiforecasts/evalvariantnowcasthub

Length of output: 608

🏁 Script executed:

#!/bin/bash
# Find where location_data is created/loaded in the R file
grep -n "location_data" R/fig_data.R | head -20

Repository: epiforecasts/evalvariantnowcasthub

Length of output: 248

🏁 Script executed:

#!/bin/bash
# View the function signature and parameter documentation
sed -n '200,240p' R/fig_data.R

Repository: epiforecasts/evalvariantnowcasthub

Length of output: 1772

🏁 Script executed:

#!/bin/bash
# Find where get_plot_hosp_admissions is called
echo "=== Function calls to get_plot_hosp_admissions ==="
rg "get_plot_hosp_admissions" --type r -B 2 -A 5
echo ""
echo "=== Search for location_data creation/loading ==="
rg "location_data\s*<-|location_data\s*=" --type r -B 2 -A 2 | head -50

Repository: epiforecasts/evalvariantnowcasthub

Length of output: 5289

🏁 Script executed:

#!/bin/bash
# Find the tar_target definition for location_data
rg "tar_target.*location_data|name\s*=\s*location_data" --type r -B 5 -A 10

Repository: epiforecasts/evalvariantnowcasthub

Length of output: 928

🏁 Script executed:

#!/bin/bash
# Find location_fp definition
echo "=== Finding location_fp ==="
rg "location_fp\s*=" --type r -B 2 -A 2
echo ""
echo "=== Search for location CSV files ==="
find . -name "*location*" -type f | grep -E "\.(csv|tsv)$" | head -20

Repository: epiforecasts/evalvariantnowcasthub

Length of output: 454

🏁 Script executed:

#!/bin/bash
# Search for location_fp assignment
echo "=== Finding location_fp assignment ==="
rg "location_fp.*<-|location_fp\s*=" --type r -B 1 -A 1
echo ""
echo "=== Check targets config file for location_fp ==="
grep -n "location_fp" targets/config_targets.R | head -20

Repository: epiforecasts/evalvariantnowcasthub

Length of output: 430

🏁 Script executed:

#!/bin/bash
# View the tar_target definition for location_fp
sed -n '30,45p' targets/config_targets.R

Repository: epiforecasts/evalvariantnowcasthub

Length of output: 849

🏁 Script executed:

#!/bin/bash
# Fetch and inspect the FluSight locations CSV
echo "=== Fetching FluSight locations CSV ==="
curl -s "https://raw.githubusercontent.com/cdcepi/FluSight-forecast-hub/refs/heads/main/auxiliary-data/locations.csv" | head -5 > /tmp/locations.csv
if [ -s /tmp/locations.csv ]; then
  echo "=== Column names ==="
  head -1 /tmp/locations.csv
  echo ""
  echo "=== First few rows ==="
  head -6 /tmp/locations.csv
else
  echo "Failed to fetch locations CSV"
fi

Repository: epiforecasts/evalvariantnowcasthub

Length of output: 742

Correct the join key: use location instead of location_name.

The NSSP CSV's geography column contains numeric FIPS codes (e.g., "1", "10", "100", "1000"), which correspond to the FluSight locations CSV's location column (e.g., "01", "02", "04"), not location_name (which contains place names like "Alabama", "Alaska"). The current join will fail to match any rows.

Change:

by = c("geography" = "location_name")

To:

by = c("geography" = "location")

🤖 Prompt for AI Agents

In @R/fig_data.R around lines 226-236, The left_join between nssp_cdcgov and location_data is using the wrong key; change the join in the pipeline that produces raw_data (the left_join call involving nssp_cdcgov and location_data) to match geography to the FluSight `location` column instead of `location_name` so rows with FIPS codes align correctly (i.e., use by = c("geography" = "location") in the left_join definition).
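
One way to settle which column is the right join key is to count the rows each candidate key leaves unmatched before committing to one. The sketch below uses invented toy tables and a hypothetical helper, `unmatched_rows()`; the actual values of `geography` in `input/nssp_states_and_national.csv` are not shown in this thread, so the toy values are an assumption rather than a description of the file.

```r
library(dplyr)
library(tibble)

# Toy tables (values invented for illustration only).
nssp_toy <- tibble(
  geography = c("Alabama", "Alaska", "United States"),
  percent_visits_covid = c(1.1, 0.7, 1.0)
)
location_toy <- tibble(
  abbreviation = c("AL", "AK", "US"),
  location = c("01", "02", "US"),
  location_name = c("Alabama", "Alaska", "US")
)

# Hypothetical helper: rows of `x` whose key has no partner in `y`.
# setNames() builds the named vector that `by` expects, e.g.
# c(geography = "location_name").
unmatched_rows <- function(x, y, x_key, y_key) {
  anti_join(x, y, by = stats::setNames(y_key, x_key))
}

by_name <- unmatched_rows(nssp_toy, location_toy, "geography", "location_name")
by_code <- unmatched_rows(nssp_toy, location_toy, "geography", "location")
```

A candidate key that leaves most rows unmatched is probably the wrong one. A key that leaves only a few rows unmatched (in the toy data, the national row) suggests a naming mismatch that needs handling explicitly.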

…s/evalvariantnowcasthub into 14-use-pct-ed-visits

Kaitlyn Johnson added 4 commits January 5, 2026 12:01

add revised function + data

75c662e

add link to original nssp data

2aec003

add link to original nssp data

3de8f80

wrong fp

a7080f7

fix linting

491dd00

kaitejohnson marked this pull request as ready for review January 5, 2026 13:02

Merge branch 'main' into 14-use-pct-ed-visits

706ec14

coderabbitai bot reviewed Jan 5, 2026


Kaitlyn Johnson added 2 commits January 5, 2026 13:24

fix title

463e688

Merge branch '14-use-pct-ed-visits' of https://github.com/epiforecast…

45101c3

…s/evalvariantnowcasthub into 14-use-pct-ed-visits

kaitejohnson merged commit 615c782 into main Jan 5, 2026
2 checks passed

kaitejohnson deleted the 14-use-pct-ed-visits branch January 5, 2026 13:24
