Draft model description table #62

kaitejohnson · 2026-01-05T16:36:36Z

First pass from the Variant Nowcast Hub's metadata files. Citations are missing for the models that don't have one -- thinking of just putting a repo URL?

Co-authors:

Feel free to review and edit any columns in your team's row. You could either make a suggestion on this PR or you can send me an edited markdown and I will merge everything.

Summary by CodeRabbit

Documentation
- Added reference documentation table for seven COVID-19 variant nowcasting predictive models, including descriptions, citations, data sources, geographic coverage, output types (point and probabilistic forecasts), and ensemble classification to support model evaluation and selection.


coderabbitai · 2026-01-05T16:36:42Z

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

Introduces a new documentation file containing a Markdown table that documents seven predictive models for COVID-19 variant nowcasting. The table includes details such as model names, descriptions, citations, data sources, locations, output types, and ensemble status for each model.

Changes

| Cohort / File(s) | Summary of changes |
|---|---|
| Documentation `docs/model_description_table.md` | Added Markdown table documenting seven COVID-19 variant nowcasting models with columns for model name, description, citation, data sources, locations, output type, and ensemble status |

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'Draft model description table' clearly and concisely describes the main change: introducing a new documentation file with a table describing predictive models, which aligns with the changeset content. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |

coderabbitai

Actionable comments posted: 3

Fix all issues with AI Agents 🤖

In @docs/model_description_table.md:
- Around line 3-4: Replace the placeholder "[citation]" entries for Hub-baseline
and UMass-HMLR with their full citation metadata: obtain the author list, title,
publication or preprint server, year, DOI/URL, and any version info from the
respective model owners/co-authors, then update the table rows for the
Hub-baseline and UMass-HMLR entries to include the formatted citation
(consistent with the project's citation style) and validate the links/DOIs;
notify the contributors if any details are missing and do not merge until both
citation fields are populated and link-checked.
- Line 8: The CADPH-CATaLog table row has a malformed Description cell mixing
citation/version notes and an unclosed quote; replace the Description content
for the CADPH-CATaLog entry so it contains only a concise model description
(e.g., "Fits a logistic growth function to estimate variant frequency over
time") and move any citation/version details back to the Citation or Notes
column if needed; update the CADPH-CATaLog row to match the format/style of the
CADPH-CATaMaran row so the Description column is clean and markdown is valid.
- Line 6: The table entry for LANL-CovTransformer has two issues: use "14-day
prediction" (hyphenate the compound adjective) and resolve the ensemble mismatch
between the description and the "Ensemble?" column; update either the
description for "LANL-CovTransformer" to clarify that the model is an internal
multi-stage ensemble (e.g., "an integrated multi-stage ensemble of 5 Stage 1
models used to produce a 14-day prediction") or change the "Ensemble?" column to
"Yes" if it should indicate internal ensemble status so the description and the
"Ensemble?" column are consistent.

🧹 Nitpick comments (1)

docs/model_description_table.md (1)
7-7: Minor wordiness improvement.

The phrase "prior to the fitting period" is slightly wordy. Consider shortening to "before the fitting period" for improved readability.
🔎 Proposed fix
- We use similar parameters as our PANGO lineage Multinomial Logistic Regression on CalCAT, except that at any time 31 days of data are obscured prior to the fitting period to exclude noisy data being backfilled.
+ We use similar parameters as our PANGO lineage Multinomial Logistic Regression on CalCAT, except that at any time 31 days of data are obscured before the fitting period to exclude noisy data being backfilled.

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 615c782 and 3fa98e0.

📒 Files selected for processing (1)

docs/model_description_table.md

🧰 Additional context used

🪛 LanguageTool

docs/model_description_table.md

[uncategorized] ~6-~6: When a number forms part of an adjectival compound, use a hyphen.
Context: ...ensemble of models, which first makes a 14 day prediction using 5 models (Stage 1 mode...

(MISSING_HYPHEN)

[style] ~7-~7: ‘prior to’ might be wordy. Consider a shorter alternative.
Context: ...t any time 31 days of data are obscured prior to the fitting period to exclude noisy dat...

(EN_WORDINESS_PREMIUM_PRIOR_TO)

🔇 Additional comments (2)

docs/model_description_table.md (2)

1-2: Table structure looks good.

The markdown table is well-formatted with clear column headers and appropriate separators.
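
A structural check like this one can also be scripted. The following is a minimal sketch, not part of this PR: `split_row`, `check_table_shape`, and the `SAMPLE` table are hypothetical names and data made up for illustration, and the splitting logic assumes cells never contain escaped `\|` characters.

```python
def split_row(line: str) -> list[str]:
    """Split one Markdown table row into stripped cell strings."""
    text = line.strip()
    # Drop exactly one leading and one trailing pipe, if present.
    if text.startswith("|"):
        text = text[1:]
    if text.endswith("|"):
        text = text[:-1]
    return [cell.strip() for cell in text.split("|")]


def check_table_shape(markdown: str) -> list[int]:
    """Return 1-based line numbers of rows whose cell count differs from the header."""
    rows = [
        (number, split_row(line))
        for number, line in enumerate(markdown.splitlines(), start=1)
        if line.strip().startswith("|")
    ]
    if not rows:
        return []
    expected = len(rows[0][1])  # the first table line is treated as the header
    return [number for number, cells in rows if len(cells) != expected]


# Hypothetical three-column table; the last row is deliberately one cell short.
SAMPLE = """\
| Model | Description | Ensemble? |
|---|---|---|
| Hub-baseline | Bayesian multinomial logistic regression | No |
| Broken-row | Missing the last cell |
"""

if __name__ == "__main__":
    print(check_table_shape(SAMPLE))
```

Running it against the real file would only need `check_table_shape(open("docs/model_description_table.md").read())`; an empty list means every row matches the header width.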

5-5: UGA-multicast entry is well-documented.

Clear description, complete citation, and appropriate metadata. No concerns.

coderabbitai · 2026-01-05T16:41:46Z

docs/model_description_table.md

+| Hub-baseline | A Bayesian multinomial logistic regression model that makes predictions at the national level. This model uses a linear in logit space model for the growth of the variants and makes the same predictions for each state. | [citation] | NextStrain | All | Point and probabilistic | No |
+| UMass-HMLR | A Bayesian hierarchical multinomial logistic regression (HMLR) model for nowcasting COVID variants. Regression coefficients are modeled hierarchically across variants and locations. | [citation] | NextStrain | All | Point and probabilistic | No |


⚠️ Potential issue | 🟠 Major

Citation placeholders require community contribution.

Lines 3 and 4 contain [citation] placeholders for Hub-baseline and UMass-HMLR. As noted in the PR description, these are awaiting contributions from co-authors. Please ensure citation details are collected and merged before this documentation is published to users.

I can help format and validate citations once the details are provided. Would you like me to generate a template or checklist to help collect these from the respective teams?
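
Until the citations arrive, a small script could flag which rows still carry the placeholder. This is a rough sketch under stated assumptions: `find_placeholder_citations` and `SAMPLE` are hypothetical, the descriptions are abbreviated from the rows above, the third row is invented purely to show a non-flagged case, and the literal string `[citation]` is assumed to be the only placeholder in use.

```python
def find_placeholder_citations(markdown: str, placeholder: str = "[citation]") -> list[str]:
    """Return the first cell (model name) of every table row that has a placeholder cell."""
    missing = []
    for line in markdown.splitlines():
        text = line.strip()
        if not text.startswith("|"):
            continue  # not a table row
        cells = [cell.strip() for cell in text.strip("|").split("|")]
        if placeholder in cells:
            missing.append(cells[0])
    return missing


# Abbreviated rows for illustration; "Other-model" is a made-up row with a filled cell.
SAMPLE = """\
| Hub-baseline | A Bayesian multinomial logistic regression model | [citation] | NextStrain |
| UMass-HMLR | A Bayesian hierarchical multinomial logistic regression model | [citation] | NextStrain |
| Other-model | Some description | Some citation text | NextStrain |
"""

if __name__ == "__main__":
    for name in find_placeholder_citations(SAMPLE):
        print(f"{name}: citation still missing")
```

A check like this could run before merge so the PR cannot land while any `[citation]` cell remains.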


coderabbitai · 2026-01-05T16:41:46Z

docs/model_description_table.md

+| Hub-baseline | A Bayesian multinomial logistic regression model that makes predictions at the national level. This model uses a linear in logit space model for the growth of the variants and makes the same predictions for each state. | [citation] | NextStrain | All | Point and probabilistic | No |
+| UMass-HMLR | A Bayesian hierarchical multinomial logistic regression (HMLR) model for nowcasting COVID variants. Regression coefficients are modeled hierarchically across variants and locations. | [citation] | NextStrain | All | Point and probabilistic | No |
+| UGA-multicast | A multinomial logistic regression model with no fancy features for nowcasting COVID-19 variants. | Feng, Y., Goldberg, E. E., Kupperman, M., Zhang, X., Lin, Y., and Ke, R. (2024). CovTransformer: A transformer model for SARS-CoV-2 lineage frequency forecasting. Virus Evolution, to appear. | NextStrain | All | Probabilistic | No |
+| LANL-CovTransformer | CovTransformer is a streamlined single-layer transformer architecture augmented with linear input and output layers, using embedding dimensions of 8 and dual attention heads. CovTransformer is an integrated ensemble of models, which first makes a 14 day prediction using 5 models (Stage 1 models). | Feng, Y., Goldberg, E. E., Kupperman, M., Zhang, X., Lin, Y., and Ke, R. (2024). CovTransformer: A transformer model for SARS-CoV-2 lineage frequency forecasting. Virus Evolution, to appear. | NextStrain for inference, GISAID for training | All | Point | No |


⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

cat -n docs/model_description_table.md

Repository: epiforecasts/evalvariantnowcasthub

Length of output: 2932

Fix hyphenation and clarify ensemble terminology.

Change "14 day prediction" to "14-day prediction" (compound adjectives should be hyphenated).

The description states "an integrated ensemble of models" whilst the Ensemble? column shows "No." Clarify whether this refers to the model's internal structure (ensemble of stages) or its participation in the Nowcast Hub ensemble, and update the description or column accordingly for consistency.
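
Both points could be checked mechanically. The sketch below is only a heuristic, not something this repository provides: `hyphenate_compound` and `ensemble_mismatch` are hypothetical helpers, the regex assumes a digit followed by a singular time unit is always adjectival, and the mismatch test simply looks for the word "ensemble" in a description whose Ensemble? cell reads "No".

```python
import re

# A number followed by a singular unit ("14 day") is assumed to be a compound adjective.
# Plurals such as "31 days" do not match because of the trailing word boundary.
NUM_UNIT = re.compile(r"\b(\d+) (day|week|month|year)\b")


def hyphenate_compound(text: str) -> str:
    """Rewrite e.g. '14 day prediction' as '14-day prediction'."""
    return NUM_UNIT.sub(r"\1-\2", text)


def ensemble_mismatch(description: str, ensemble_flag: str) -> bool:
    """True when the description mentions an ensemble but the column says 'No'."""
    mentions_ensemble = "ensemble" in description.lower()
    return mentions_ensemble and ensemble_flag.strip().lower() == "no"


if __name__ == "__main__":
    description = (
        "CovTransformer is an integrated ensemble of models, "
        "which first makes a 14 day prediction using 5 models (Stage 1 models)."
    )
    print(hyphenate_compound(description))
    print(ensemble_mismatch(description, "No"))
```

A flagged mismatch is a prompt for the authors to decide what the Ensemble? column means, not proof that either the description or the column is wrong.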

🔎 Proposed hyphenation fix

- CovTransformer is a streamlined single-layer transformer architecture augmented with linear input and output layers, using embedding dimensions of 8 and dual attention heads. CovTransformer is an integrated ensemble of models, which first makes a 14 day prediction using 5 models (Stage 1 models).
+ CovTransformer is a streamlined single-layer transformer architecture augmented with linear input and output layers, using embedding dimensions of 8 and dual attention heads. CovTransformer is an integrated ensemble of models, which first makes a 14-day prediction using 5 models (Stage 1 models).

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

| LANL-CovTransformer | CovTransformer is a streamlined single-layer transformer architecture augmented with linear input and output layers, using embedding dimensions of 8 and dual attention heads. CovTransformer is an integrated ensemble of models, which first makes a 14 day prediction using 5 models (Stage 1 models). | Feng, Y., Goldberg, E. E., Kupperman, M., Zhang, X., Lin, Y., and Ke, R. (2024). CovTransformer: A transformer model for SARS-CoV-2 lineage frequency forecasting. Virus Evolution, to appear. | NextStrain for inference, GISAID for training | All | Point | No |

| LANL-CovTransformer | CovTransformer is a streamlined single-layer transformer architecture augmented with linear input and output layers, using embedding dimensions of 8 and dual attention heads. CovTransformer is an integrated ensemble of models, which first makes a 14-day prediction using 5 models (Stage 1 models). | Feng, Y., Goldberg, E. E., Kupperman, M., Zhang, X., Lin, Y., and Ke, R. (2024). CovTransformer: A transformer model for SARS-CoV-2 lineage frequency forecasting. Virus Evolution, to appear. | NextStrain for inference, GISAID for training | All | Point | No |

🧰 Tools

🪛 LanguageTool

[uncategorized] ~6-~6: When a number forms part of an adjectival compound, use a hyphen.
Context: ...ensemble of models, which first makes a 14 day prediction using 5 models (Stage 1 mode...

(MISSING_HYPHEN)


coderabbitai · 2026-01-05T16:41:46Z

docs/model_description_table.md

+| UGA-multicast | A multinomial logistic regression model with no fancy features for nowcasting COVID-19 variants. | Feng, Y., Goldberg, E. E., Kupperman, M., Zhang, X., Lin, Y., and Ke, R. (2024). CovTransformer: A transformer model for SARS-CoV-2 lineage frequency forecasting. Virus Evolution, to appear. | NextStrain | All | Probabilistic | No |
+| LANL-CovTransformer | CovTransformer is a streamlined single-layer transformer architecture augmented with linear input and output layers, using embedding dimensions of 8 and dual attention heads. CovTransformer is an integrated ensemble of models, which first makes a 14 day prediction using 5 models (Stage 1 models). | Feng, Y., Goldberg, E. E., Kupperman, M., Zhang, X., Lin, Y., and Ke, R. (2024). CovTransformer: A transformer model for SARS-CoV-2 lineage frequency forecasting. Virus Evolution, to appear. | NextStrain for inference, GISAID for training | All | Point | No |
+| CADPH-CATaMaran | We use similar parameters as our PANGO lineage Multinomial Logistic Regression on CalCAT, except that at any time 31 days of data are obscured prior to the fitting period to exclude noisy data being backfilled. This model can accommodate resampling eventually. | Wadford et al. Implementation of California COVIDNet - a multi-sector collaboration for statewide SARS-CoV-2 genomic surveillance. Front Public Health. 2023 Oct 23;11:1249614. doi: 10.3389/fpubh.2023.1249614. PMID: 37937074; PMCID: PMC10627185. | Theaigen Genomics | All | Point | No |
+| CADPH-CATaLog | Fits a logistic growth function to data. Based on Althaus, Christian L., et al. "A Tale of Two Variants: Spread of SARS-CoV-2 Variants Alpha in Geneva, Switzerland, and Beta in South Africa. 1.1 Adds filtering criteria/ GISAID data addition | Wadford et al. Implementation of California COVIDNet - a multi-sector collaboration for statewide SARS-CoV-2 genomic surveillance. Front Public Health. 2023 Oct 23;11:1249614. doi: 10.3389/fpubh.2023.1249614. PMID: 37937074; PMCID: PMC10627185. | Theaigen Genomics | All | Point | No |


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

# First, locate and examine the file
find . -name "model_description_table.md" -type f

Repository: epiforecasts/evalvariantnowcasthub

Length of output: 111

🏁 Script executed:

# Read the file with line numbers to examine the content around line 8
cat -n docs/model_description_table.md | head -20

Repository: epiforecasts/evalvariantnowcasthub

Length of output: 2932

Fix the malformed description in the CADPH-CATaLog entry.

The description field contains an unclosed quotation mark and mixes model description with citation text and version notes, resulting in corrupted markdown. The text should provide only a coherent description of the model, separate from the Citation column which already contains the reference.

Correct the entry to follow the format of other rows (e.g. CADPH-CATaMaran on the preceding line), with a clear description in the Description column only.
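
An unclosed quote of this kind can be caught with a simple parity check on each cell. The snippet is a minimal sketch: `cells_with_unbalanced_quotes` is a hypothetical helper, the sample row is abbreviated from the CADPH-CATaLog entry, and only straight double quotes (`"`) are counted, so curly quotes would pass unnoticed.

```python
def cells_with_unbalanced_quotes(row: str) -> list[int]:
    """Return 0-based indexes of cells containing an odd number of double quotes."""
    cells = [cell.strip() for cell in row.strip().strip("|").split("|")]
    return [index for index, cell in enumerate(cells) if cell.count('"') % 2 == 1]


# Abbreviated version of the flagged row: the quoted title is opened but never closed.
ROW = (
    '| CADPH-CATaLog | Fits a logistic growth function to data. '
    'Based on Althaus, Christian L., et al. "A Tale of Two Variants | Point | No |'
)

if __name__ == "__main__":
    print(cells_with_unbalanced_quotes(ROW))
```

Index 1 corresponds to the Description column here, which matches where the stray quotation mark sits in the row under review.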


draft model description table

3fa98e0

kaitejohnson marked this pull request as ready for review January 5, 2026 16:39

coderabbitai bot reviewed Jan 5, 2026


edit and add model description table to supplement

35e8532
