Code Documentation

Abhinav Tushar edited this page Jan 21, 2018 · 5 revisions

This page documents the pipeline and directory structure of the scripts/processes involved in the collaborative ensemble repository.

At a high level, the process looks like this:

  1. Component model files (up to the last season) are collected in ./model-forecasts/component-models.
  2. Weights are generated for different ensemble types and are kept in ./weights.
  3. During the prediction season (the current season, with real-time data), the real-time submission files (from the same set of components used in weight estimation) are used to generate ensemble submissions each week.

Note that the first two steps are run at the beginning of the season, which fixes the component weights for the whole prediction season. Read on for implementation details.
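
As a rough illustration of step 3, the sketch below combines component forecasts using fixed weights. It is a minimal example under assumptions, not the repository's actual implementation: the function name `weightedEnsemble`, the representation of a forecast as an array of bin probabilities, and the model names and numbers are all hypothetical.

```javascript
// Hypothetical sketch: combine component forecasts with fixed weights.
// Assumes each forecast is an array of bin probabilities of equal length
// and that the weights were estimated beforehand and sum to 1.
function weightedEnsemble(componentForecasts, weights) {
  const names = Object.keys(weights);
  const nBins = componentForecasts[names[0]].length;
  const ensemble = new Array(nBins).fill(0);
  for (const name of names) {
    const forecast = componentForecasts[name];
    if (!forecast || forecast.length !== nBins) {
      throw new Error(`Missing or malformed forecast for component: ${name}`);
    }
    // Add this component's weighted contribution to every bin.
    for (let i = 0; i < nBins; i++) {
      ensemble[i] += weights[name] * forecast[i];
    }
  }
  return ensemble;
}

// Made-up weights and forecasts, for illustration only.
const weights = { modelA: 0.5, modelB: 0.5 };
const forecasts = { modelA: [0.2, 0.8], modelB: [0.6, 0.4] };
const result = weightedEnsemble(forecasts, weights); // roughly [0.4, 0.6]
```

Because the weights are fixed at the start of the season, the same `weights` object would be reused every week; only the real-time component forecasts change.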

Implementation

The repository on GitHub contains submission files in CSV format and utility scripts written in R and JavaScript. Submission CSV files are kept in ./model-forecasts/ and are organized into the following subdirectories:

  • ./model-forecasts/component-models: Component model submissions for past years. Used for weight estimation.
  • ./model-forecasts/cv-ensemble-models: Ensemble files generated by performing leave-one-season-out cross-validation on the component files from past years.
  • ./model-forecasts/real-time-component-models: Real-time component submissions for the current season.
  • ./model-forecasts/real-time-ensemble-models: Real-time ensemble files created from the real-time component files.
  • ./model-forecasts/submissions: Files to be submitted to CDC from the best ensemble model (fixed at the beginning of the season).
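
The leave-one-season-out cross-validation mentioned for cv-ensemble-models can be sketched as follows. This shows only the generic splitting technique: each season is held out once while the remaining seasons serve as training data. The function name `leaveOneSeasonOut` and the season labels are hypothetical, and how the repository's scripts actually organize the splits is not specified here.

```javascript
// Hypothetical sketch of leave-one-season-out splits.
// For every season, hold it out as the test season and
// use all remaining seasons for training (e.g. weight estimation).
function leaveOneSeasonOut(seasons) {
  return seasons.map((heldOut) => ({
    test: heldOut,
    train: seasons.filter((season) => season !== heldOut),
  }));
}

// Made-up season labels, for illustration only.
const splits = leaveOneSeasonOut(["2014-2015", "2015-2016", "2016-2017"]);
for (const split of splits) {
  console.log(`train on ${split.train.join(", ")}; test on ${split.test}`);
}
```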

There are three sets of scripts, grouped according to the level they work on:

  1. Data file test

    This is a single script, ./test-data.js, which performs superficial checks (without actually reading the content of the files) on the CSVs inside ./model-forecasts.

    It is written in JavaScript (requires Node.js) and can be run with the following commands from the project root directory:

    npm install # Install dependencies for tests
    npm run test-data
  2. TODO

  3. Scripts for visualization deployment.
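
To give a sense of what a superficial check like the data file test (item 1 above) could involve, here is a minimal sketch that inspects only file names and metadata and never reads file content. It is not the actual ./test-data.js; the function names and the specific rule (flagging zero-byte CSV files) are assumptions chosen for illustration.

```javascript
const fs = require("fs");
const path = require("path");

// Recursively collect all file paths under a directory.
function listFiles(dir) {
  return fs.readdirSync(dir, { withFileTypes: true }).flatMap((entry) => {
    const fullPath = path.join(dir, entry.name);
    return entry.isDirectory() ? listFiles(fullPath) : [fullPath];
  });
}

// Hypothetical check: return the paths of CSV files that are empty.
// Only the extension and the size reported by the file system are used,
// so no file content is read.
function findEmptyCsvs(rootDir) {
  return listFiles(rootDir)
    .filter((file) => path.extname(file).toLowerCase() === ".csv")
    .filter((file) => fs.statSync(file).size === 0);
}
```

A call such as `findEmptyCsvs("./model-forecasts")` would return an array of offending paths, which a test runner could then assert to be empty.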
