17 changes: 17 additions & 0 deletions docs/model_description_table.md
@@ -0,0 +1,17 @@
+---------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------------------------------------+-----------+-------------------------+-----------+
| Model name | Description | Citation | Data Sources | Locations | Output Type | Ensemble? |
+=====================+===========================================================================================================================================================================================================================================================================================================+======================================================================================================================================================================================================================================================+===============================================+===========+=========================+===========+
| Hub-baseline | A Bayesian multinomial logistic regression model that makes predictions at the national level. The model assumes variant growth is linear in logit space and applies the same predictions to every state. | [citation] | NextStrain | All | Point and probabilistic | No |
+---------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------------------------------------+-----------+-------------------------+-----------+
| UMass-HMLR | A Bayesian hierarchical multinomial logistic regression (HMLR) model for nowcasting COVID variants. Regression coefficients are modeled hierarchically across variants and locations. | [citation] | NextStrain | All | Point and probabilistic | No |
+---------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------------------------------------+-----------+-------------------------+-----------+
| UGA-multicast | A plain multinomial logistic regression model, with no additional features, for nowcasting COVID-19 variants. | Feng, Y., Goldberg, E. E., Kupperman, M., Zhang, X., Lin, Y., and Ke, R. (2024). CovTransformer: A transformer model for SARS-CoV-2 lineage frequency forecasting. Virus Evolution, to appear. | NextStrain | All | Probabilistic | No |
+---------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------------------------------------+-----------+-------------------------+-----------+
| LANL-CovTransformer | CovTransformer is a streamlined single-layer transformer architecture augmented with linear input and output layers, using embedding dimensions of 8 and dual attention heads. CovTransformer is an integrated ensemble of models, which first makes a 14-day prediction using 5 models (Stage 1 models). | Feng, Y., Goldberg, E. E., Kupperman, M., Zhang, X., Lin, Y., and Ke, R. (2024). CovTransformer: A transformer model for SARS-CoV-2 lineage frequency forecasting. Virus Evolution, to appear. | NextStrain for inference, GISAID for training | All | Point | No |
+---------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------------------------------------+-----------+-------------------------+-----------+
| CADPH-CATaMaran | Uses parameters similar to those of our PANGO lineage multinomial logistic regression on CalCAT, except that 31 days of data are always masked before the fitting period to exclude noisy data that are still being backfilled. The model could eventually accommodate resampling. | Wadford et al. Implementation of California COVIDNet - a multi-sector collaboration for statewide SARS-CoV-2 genomic surveillance. Front Public Health. 2023 Oct 23;11:1249614. doi: 10.3389/fpubh.2023.1249614. PMID: 37937074; PMCID: PMC10627185. | Theiagen Genomics | All | Point | No |
+---------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------------------------------------+-----------+-------------------------+-----------+
| CADPH-CATaLog | Fits a logistic growth function to the data, with added filtering criteria and supplemental GISAID data. | Wadford et al. Implementation of California COVIDNet - a multi-sector collaboration for statewide SARS-CoV-2 genomic surveillance. Front Public Health. 2023 Oct 23;11:1249614. doi: 10.3389/fpubh.2023.1249614. PMID: 37937074; PMCID: PMC10627185. | Theiagen Genomics | All | Point | No |
| | | | | | | |
| | | Model based on Althaus, Christian L., et al. "A Tale of Two Variants: Spread of SARS-CoV-2 Variants Alpha in Geneva, Switzerland, and Beta in South Africa. 1.1 | | | | |
+---------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------------------------------------+-----------+-------------------------+-----------+
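
Several models in this table describe variant growth as linear in logit space. As a minimal sketch of what that means (not any team's actual implementation), the snippet below turns per-variant intercepts and slopes into frequencies with a softmax. The function name `variant_frequencies` and all parameter values are hypothetical and chosen only for illustration.

```python
import math


def variant_frequencies(intercepts, slopes, t):
    """Return variant frequencies at time t, with p_v(t) proportional to exp(a_v + b_v * t)."""
    scores = [a + b * t for a, b in zip(intercepts, slopes)]
    peak = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical parameters for three variants; the first acts as the reference.
intercepts = [0.0, -2.0, -4.0]
slopes = [0.0, 0.05, 0.12]

today = variant_frequencies(intercepts, slopes, t=0)
later = variant_frequencies(intercepts, slopes, t=60)
```

Under these assumed values the third variant starts rare but has the largest slope, so its share grows between `today` and `later`. A hierarchical model would additionally share information about the intercepts and slopes across locations, which this sketch does not attempt.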
13 changes: 12 additions & 1 deletion docs/supplement.qmd
@@ -37,8 +37,19 @@ output:

![Fig. S9 Evaluation sequence counts by location and nowcast date. Color indicates the total number of sequences available in the evaluation data (log scale) for each location and nowcast date combination. Gray tiles indicate nowcast date-location pairs with no sequences across all horizons.](../output/figs/metadata/supp/eval_sequence_counts_heatmap.png)

-## Additional results
+# Additional results

![Fig. S10 Absolute Brier (top) and energy (bottom) scores in the U.S. excluding California (left) and in California (right).](../output/figs/overall_scores/supp/absolute_scores_by_horizon.png)

![Fig. S11 Bias over time for three example states in the U.S. during the 25A emergence.](../output/figs/zoom_25A/supp/bias_over_time_25A.png)

# Additional models not evaluated in this work

The following models began submitting nowcasts after the initial assessment period (October 9, 2024 to June 4, 2025) and are therefore not evaluated in this work.

| Model name | Description | Citation | Data Sources | Locations | Output Type | Ensemble? |
|----|----|----|----|----|----|----|
| open_hier_mlr | A Bayesian hierarchical multinomial logistic regression (MLR) model for nowcasting COVID variants using variant counts based on INSDC sequences. Regression coefficients are modeled hierarchically across locations. | Abousamra E, Figgins M, Bedford T (2024) Fitness models provide accurate short-term forecasts of SARS-CoV-2 variant frequency. PLOS Computational Biology 20(9): e1012443. https://doi.org/10.1371/journal.pcbi.1012443 | INSDC | All | Point and probabilistic | No |
| gisaid_hier_mlr | A Bayesian hierarchical multinomial logistic regression (MLR) model for nowcasting COVID variants using variant counts based on GISAID sequences. Regression coefficients are modeled hierarchically across locations. | Abousamra E, Figgins M, Bedford T (2024) Fitness models provide accurate short-term forecasts of SARS-CoV-2 variant frequency. PLOS Computational Biology 20(9): e1012443. https://doi.org/10.1371/journal.pcbi.1012443 | GISAID | All | Point and probabilistic | No |
| ensemble | An ensemble of the hub forecasts, created by taking an equally weighted sample from all models that submit samples for a given week, using the function linear_pool from the hubEnsembles package. | https://github.com/hubverse-org/hubEnsembles/tree/main | Other model submission files | All | Point and probabilistic | Yes |
| PyHMLR | A Bayesian hierarchical multinomial logistic regression model with a Dirichlet-multinomial observation process. The model uses hierarchical hyperpriors to enable partial pooling across locations and clades. Location-specific concentration parameters allow the model to adaptively learn appropriate uncertainty levels for each location. Linear trends are modeled in logit space with standardized time variables. | https://github.com/trobacker/pymc_modeling | COVID Variant Nowcast Hub S3 target data | All | Point and probabilistic | No |
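
The ensemble row describes an equally weighted sample across the models that submit samples. The sketch below illustrates that idea in plain Python under simplifying assumptions; it is not the hubEnsembles `linear_pool` implementation, and the function `pool_samples`, the model names, and the sample values are all hypothetical.

```python
import random


def pool_samples(samples_by_model, n_draws, seed=0):
    """Draw pooled samples, choosing each contributing model with equal probability."""
    rng = random.Random(seed)
    models = sorted(samples_by_model)  # fixed order keeps the draw reproducible
    pooled = []
    for _ in range(n_draws):
        model = rng.choice(models)  # equal weight per model
        pooled.append(rng.choice(samples_by_model[model]))  # one of its samples
    return pooled


# Hypothetical samples of one variant's frequency from two models.
samples_by_model = {
    "model_a": [0.10, 0.12, 0.11, 0.13],
    "model_b": [0.20, 0.22, 0.19, 0.21],
}

pooled = pool_samples(samples_by_model, n_draws=200)
pooled_mean = sum(pooled) / len(pooled)
```

Because each model is picked with equal probability regardless of how many samples it submitted, the pooled draws approximate an equal-weight mixture of the individual predictive distributions.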