Draft model description table #62
base: main
Important: Review skipped
Auto incremental reviews are disabled on this repository. Please check the settings in the CodeRabbit UI.
Walkthrough
Introduces a new documentation file containing a Markdown table that documents seven predictive models for COVID-19 variant nowcasting. The table includes details such as model names, descriptions, citations, data sources, locations, output types, and ensemble status for each model.
Actionable comments posted: 3
Fix all issues with AI Agents 🤖
In @docs/model_description_table.md:
- Around line 3-4: Replace the placeholder "[citation]" entries for Hub-baseline
and UMass-HMLR with their full citation metadata: obtain the author list, title,
publication or preprint server, year, DOI/URL, and any version info from the
respective model owners/co-authors, then update the table rows for the
Hub-baseline and UMass-HMLR entries to include the formatted citation
(consistent with the project's citation style) and validate the links/DOIs;
notify the contributors if any details are missing and do not merge until both
citation fields are populated and link-checked.
- Line 8: The CADPH-CATaLog table row has a malformed Description cell mixing
citation/version notes and an unclosed quote; replace the Description content
for the CADPH-CATaLog entry so it contains only a concise model description
(e.g., "Fits a logistic growth function to estimate variant frequency over
time") and move any citation/version details back to the Citation or Notes
column if needed; update the CADPH-CATaLog row to match the format/style of the
CADPH-CATaMaran row so the Description column is clean and markdown is valid.
- Line 6: The table entry for LANL-CovTransformer has two issues: use "14-day
prediction" (hyphenate the compound adjective) and resolve the ensemble mismatch
between the description and the "Ensemble?" column; update either the
description for "LANL-CovTransformer" to clarify that the model is an internal
multi-stage ensemble (e.g., "an integrated multi-stage ensemble of 5 Stage 1
models used to produce a 14-day prediction") or change the "Ensemble?" column to
"Yes" if it should indicate internal ensemble status so the description and the
"Ensemble?" column are consistent.
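Two of the three issues above (placeholder citations and the unclosed quote) can be caught mechanically before review. The following is a minimal sketch, not part of the repository: `lint_table` and `split_row` are hypothetical helper names, and the check assumes a simple pipe-delimited table with no escaped `|` characters inside cells.

```python
PLACEHOLDER = "[citation]"


def split_row(line):
    """Split a Markdown table row into stripped cell strings."""
    return [cell.strip() for cell in line.strip().strip("|").split("|")]


def lint_table(markdown_text):
    """Return (line_number, message) pairs for problems found in table rows."""
    problems = []
    expected_cols = None
    for lineno, line in enumerate(markdown_text.splitlines(), start=1):
        if not line.strip().startswith("|"):
            continue  # not a table row
        cells = split_row(line)
        if expected_cols is None:
            expected_cols = len(cells)  # the header row sets the column count
        elif len(cells) != expected_cols:
            problems.append(
                (lineno, f"expected {expected_cols} cells, found {len(cells)}")
            )
        for cell in cells:
            if PLACEHOLDER in cell:
                problems.append((lineno, "placeholder citation"))
            if cell.count('"') % 2 == 1:
                problems.append((lineno, "unbalanced double quote"))
    return problems


# Illustrative input only; these rows are made up, not copied from the PR.
sample = """| Model | Description | Citation |
|---|---|---|
| A | Fits a model. | [citation] |
| B | Based on "A Tale of Two Variants | Smith 2023 |
"""
problems = lint_table(sample)
for lineno, message in problems:
    print(f"line {lineno}: {message}")
```

A check like this does not validate DOIs or links; those would still need a separate link checker or manual review.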
🧹 Nitpick comments (1)
docs/model_description_table.md (1)
7-7: Minor wordiness improvement.
The phrase "prior to the fitting period" is slightly wordy. Consider shortening to "before the fitting period" for improved readability.
🔎 Proposed fix
- We use similar parameters as our PANGO lineage Multinomial Logistic Regression on CalCAT, except that at any time 31 days of data are obscured prior to the fitting period to exclude noisy data being backfilled.
+ We use similar parameters as our PANGO lineage Multinomial Logistic Regression on CalCAT, except that at any time 31 days of data are obscured before the fitting period to exclude noisy data being backfilled.
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
docs/model_description_table.md
🧰 Additional context used
🪛 LanguageTool
docs/model_description_table.md
[uncategorized] ~6-~6: When a number forms part of an adjectival compound, use a hyphen.
Context: ...ensemble of models, which first makes a 14 day prediction using 5 models (Stage 1 mode...
(MISSING_HYPHEN)
[style] ~7-~7: ‘prior to’ might be wordy. Consider a shorter alternative.
Context: ...t any time 31 days of data are obscured prior to the fitting period to exclude noisy dat...
(EN_WORDINESS_PREMIUM_PRIOR_TO)
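The MISSING_HYPHEN hint above can be approximated with a small regular expression. This is a rough heuristic sketch, not LanguageTool's actual rule: the `UNHYPHENATED` pattern and `hyphenate_compounds` helper are hypothetical, cover only a few time units, and may produce false positives on unusual phrasing.

```python
import re

# A number, a space, a singular time unit, then a space and another word,
# e.g. "14 day prediction". Plural units ("31 days of data") do not match.
UNHYPHENATED = re.compile(r"\b(\d+) (day|week|month|year) (?=[a-z])")


def hyphenate_compounds(text):
    """Join number + singular unit with a hyphen when used before a noun."""
    return UNHYPHENATED.sub(r"\1-\2 ", text)


print(hyphenate_compounds("which first makes a 14 day prediction using 5 models"))
print(hyphenate_compounds("at any time 31 days of data are obscured"))
```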
🔇 Additional comments (2)
docs/model_description_table.md (2)
1-2: Table structure looks good.
The markdown table is well-formatted with clear column headers and appropriate separators.
5-5: UGA-multicast entry is well-documented.
Clear description, complete citation, and appropriate metadata. No concerns.
docs/model_description_table.md
Outdated
| Hub-baseline | A Bayesian multinomial logistic regression model that makes predictions at the national level. This model uses a linear in logit space model for the growth of the variants and makes the same predictions for each state. | [citation] | NextStrain | All | Point and probabilistic | No |
| UMass-HMLR | A Bayesian hierarchical multinomial logistic regression (HMLR) model for nowcasting COVID variants. Regression coefficients are modeled hierarchically across variants and locations. | [citation] | NextStrain | All | Point and probabilistic | No |
Citation placeholders require community contribution.
Lines 3 and 4 contain [citation] placeholders for Hub-baseline and UMass-HMLR. As noted in the PR description, these are awaiting contributions from co-authors. Please ensure citation details are collected and merged before this documentation is published to users.
I can help format and validate citations once the details are provided. Would you like me to generate a template or checklist to help collect these from the respective teams?
docs/model_description_table.md
Outdated
| Hub-baseline | A Bayesian multinomial logistic regression model that makes predictions at the national level. This model uses a linear in logit space model for the growth of the variants and makes the same predictions for each state. | [citation] | NextStrain | All | Point and probabilistic | No |
| UMass-HMLR | A Bayesian hierarchical multinomial logistic regression (HMLR) model for nowcasting COVID variants. Regression coefficients are modeled hierarchically across variants and locations. | [citation] | NextStrain | All | Point and probabilistic | No |
| UGA-multicast | A multinomial logistic regression model with no fancy features for nowcasting COVID-19 variants. | Feng, Y., Goldberg, E. E., Kupperman, M., Zhang, X., Lin, Y., and Ke, R. (2024). CovTransformer: A transformer model for SARS-CoV-2 lineage frequency forecasting. Virus Evolution, to appear. | NextStrain | All | Probabilistic | No |
| LANL-CovTransformer | CovTransformer is a streamlined single-layer transformer architecture augmented with linear input and output layers, using embedding dimensions of 8 and dual attention heads. CovTransformer is an integrated ensemble of models, which first makes a 14 day prediction using 5 models (Stage 1 models). | Feng, Y., Goldberg, E. E., Kupperman, M., Zhang, X., Lin, Y., and Ke, R. (2024). CovTransformer: A transformer model for SARS-CoV-2 lineage frequency forecasting. Virus Evolution, to appear. | NextStrain for inference, GISAID for training | All | Point | No |
🧩 Analysis chain
🏁 Script executed:
cat -n docs/model_description_table.md
Repository: epiforecasts/evalvariantnowcasthub
Length of output: 2932
Fix hyphenation and clarify ensemble terminology.
Change "14 day prediction" to "14-day prediction" (compound adjectives should be hyphenated).
The description states "an integrated ensemble of models" whilst the Ensemble? column shows "No." Clarify whether this refers to the model's internal structure (ensemble of stages) or its participation in the Nowcast Hub ensemble, and update the description or column accordingly for consistency.
🔎 Proposed hyphenation fix
- CovTransformer is a streamlined single-layer transformer architecture augmented with linear input and output layers, using embedding dimensions of 8 and dual attention heads. CovTransformer is an integrated ensemble of models, which first makes a 14 day prediction using 5 models (Stage 1 models).
+ CovTransformer is a streamlined single-layer transformer architecture augmented with linear input and output layers, using embedding dimensions of 8 and dual attention heads. CovTransformer is an integrated ensemble of models, which first makes a 14-day prediction using 5 models (Stage 1 models).
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| LANL-CovTransformer | CovTransformer is a streamlined single-layer transformer architecture augmented with linear input and output layers, using embedding dimensions of 8 and dual attention heads. CovTransformer is an integrated ensemble of models, which first makes a 14 day prediction using 5 models (Stage 1 models). | Feng, Y., Goldberg, E. E., Kupperman, M., Zhang, X., Lin, Y., and Ke, R. (2024). CovTransformer: A transformer model for SARS-CoV-2 lineage frequency forecasting. Virus Evolution, to appear. | NextStrain for inference, GISAID for training | All | Point | No |
| LANL-CovTransformer | CovTransformer is a streamlined single-layer transformer architecture augmented with linear input and output layers, using embedding dimensions of 8 and dual attention heads. CovTransformer is an integrated ensemble of models, which first makes a 14-day prediction using 5 models (Stage 1 models). | Feng, Y., Goldberg, E. E., Kupperman, M., Zhang, X., Lin, Y., and Ke, R. (2024). CovTransformer: A transformer model for SARS-CoV-2 lineage frequency forecasting. Virus Evolution, to appear. | NextStrain for inference, GISAID for training | All | Point | No |
docs/model_description_table.md
Outdated
| UGA-multicast | A multinomial logistic regression model with no fancy features for nowcasting COVID-19 variants. | Feng, Y., Goldberg, E. E., Kupperman, M., Zhang, X., Lin, Y., and Ke, R. (2024). CovTransformer: A transformer model for SARS-CoV-2 lineage frequency forecasting. Virus Evolution, to appear. | NextStrain | All | Probabilistic | No |
| LANL-CovTransformer | CovTransformer is a streamlined single-layer transformer architecture augmented with linear input and output layers, using embedding dimensions of 8 and dual attention heads. CovTransformer is an integrated ensemble of models, which first makes a 14 day prediction using 5 models (Stage 1 models). | Feng, Y., Goldberg, E. E., Kupperman, M., Zhang, X., Lin, Y., and Ke, R. (2024). CovTransformer: A transformer model for SARS-CoV-2 lineage frequency forecasting. Virus Evolution, to appear. | NextStrain for inference, GISAID for training | All | Point | No |
| CADPH-CATaMaran | We use similar parameters as our PANGO lineage Multinomial Logistic Regression on CalCAT, except that at any time 31 days of data are obscured prior to the fitting period to exclude noisy data being backfilled. This model can accommodate resampling eventually. | Wadford et al. Implementation of California COVIDNet - a multi-sector collaboration for statewide SARS-CoV-2 genomic surveillance. Front Public Health. 2023 Oct 23;11:1249614. doi: 10.3389/fpubh.2023.1249614. PMID: 37937074; PMCID: PMC10627185. | Theaigen Genomics | All | Point | No |
| CADPH-CATaLog | Fits a logistic growth function to data. Based on Althaus, Christian L., et al. "A Tale of Two Variants: Spread of SARS-CoV-2 Variants Alpha in Geneva, Switzerland, and Beta in South Africa. 1.1 Adds filtering criteria/ GISAID data addition | Wadford et al. Implementation of California COVIDNet - a multi-sector collaboration for statewide SARS-CoV-2 genomic surveillance. Front Public Health. 2023 Oct 23;11:1249614. doi: 10.3389/fpubh.2023.1249614. PMID: 37937074; PMCID: PMC10627185. | Theaigen Genomics | All | Point | No |
No newline at end of file
🧩 Analysis chain
🏁 Script executed:
# First, locate and examine the file
find . -name "model_description_table.md" -type f
Repository: epiforecasts/evalvariantnowcasthub
Length of output: 111
🏁 Script executed:
# Read the file with line numbers to examine the content around line 8
cat -n docs/model_description_table.md | head -20
Repository: epiforecasts/evalvariantnowcasthub
Length of output: 2932
Fix the malformed description in the CADPH-CATaLog entry.
The description field contains an unclosed quotation mark and mixes model description with citation text and version notes, resulting in corrupted markdown. The text should provide only a coherent description of the model, separate from the Citation column which already contains the reference.
Correct the entry to follow the format of other rows (e.g. CADPH-CATaMaran on the preceding line), with a clear description in the Description column only.
First pass from the Variant Nowcast Hub's metadata files. Citations are missing for the models that don't have one -- thinking of just putting a repo URL?
Co-authors: