Last updated: 2020-04-04
You'll need the following:
Install them using your favorite method (homebrew, etc).
First, fork the repository so you're ready to contribute back.
Replace yourusername below with your GitHub username:
git clone --recursive git@github.com:yourusername/coronadatascraper.git
cd coronadatascraper
git remote add upstream git@github.com:lazd/coronadatascraper.git
If you've already cloned without --recursive, run:
git submodule init
git submodule update
yarn install
If you get an error message saying you have an incompatible version of node, you may need to switch versions. You can use n to change node versions: install it and run:
n lts
yarn start
Pull from upstream often. This gets you the latest scrapers, as well as the cache, so we're not hammering servers:
git pull upstream master --recurse-submodules
Note: If you run into issues updating a submodule, such as "Could not access submodule", you may need to update your fork using:
git submodule update --init --recursive
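The two update steps above can be wrapped in one small shell function. This is only a sketch: sync_upstream is a hypothetical name, not something the project ships, and it assumes a remote called upstream and a master branch, as in the commands above.

```shell
# Hypothetical helper (not part of the repository).
# Pulls the latest scrapers and cache from upstream, then makes sure
# every submodule is initialized and checked out. The second step only
# runs if the pull succeeds.
sync_upstream() {
  git pull upstream master --recurse-submodules &&
    git submodule update --init --recursive
}
```

Running sync_upstream from the repository root before each work session keeps the local cache current.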
To run the scrapers for today:
yarn start
To run a subset of scrapers, use --location/-l:
yarn start --location "US/PA"
The location value should match a path under src/shared/scrapers/.
Examples:
yarn start --location "US": run all scrapers in src/shared/scrapers/US and child folders
yarn start --location "US/CA": run all scrapers in src/shared/scrapers/US/CA and child folders
yarn start --location "US/CA/alameda-county.js": run this single scraper
yarn start --location "AU": run all scrapers in src/shared/scrapers/AU and child folders
yarn start --location "AU/index.js": run this single scraper
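Because the location value has to match a path under src/shared/scrapers/, it can help to check that the path exists before starting a run. The helpers below are a minimal sketch with hypothetical names (scraper_path, location_exists); they are not part of the project and assume you are in the repository root.

```shell
# Hypothetical helpers (not part of the project).
# Map a --location value to the path it is expected to match.
scraper_path() {
  printf 'src/shared/scrapers/%s\n' "$1"
}

# Succeeds only if that file or folder exists.
# Run this from the repository root.
location_exists() {
  [ -e "$(scraper_path "$1")" ]
}
```

For example, location_exists "US/CA" && yarn start --location "US/CA" starts the scrapers only when the folder is present.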
To skip a scraper, use --skip/-s:
yarn start --skip "US/CA/alameda-county.js"
To re-generate old data from cache (or timeseries), use --date/-d:
yarn start -d 2020-3-12
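The date in this example has no zero padding (2020-3-12). Whether zero-padded dates are also accepted is not stated here, so a cautious approach is to strip the padding from an ISO-style date first. The function below is a sketch under that assumption; to_unpadded_date is a hypothetical name, not a project command.

```shell
# Hypothetical helper (not part of the project).
# Converts YYYY-MM-DD to YYYY-M-D by dropping one leading zero
# from the month and from the day.
to_unpadded_date() (
  IFS=- read -r year month day <<EOF
$1
EOF
  printf '%s-%s-%s\n' "$year" "${month#0}" "${day#0}"
)
```

It could then be used as yarn start -d "$(to_unpadded_date 2020-03-12)".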
To output files without the date suffix, use --outputSuffix/-o:
yarn start -d 2020-3-12 -o
To generate a timeseries for the entire history of the pandemic using cached data:
yarn timeseries
To generate it for a date range, use -d/-e:
yarn timeseries -d 2020-3-15 -e 2020-3-18
This can be combined with -l to test a single scraper:
yarn timeseries -d 2020-3-15 -e 2020-3-18 -l 'WA, USA'
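When scripting date ranges like the ones above, it may be worth checking that the start date does not fall after the end date before kicking off a long run. The sketch below compares two YYYY-M-D dates by turning each into a sortable integer; date_key and valid_range are hypothetical names, and the project itself may already validate its arguments.

```shell
# Hypothetical helpers (not part of the project).
# Turn YYYY-M-D (or YYYY-MM-DD) into an integer such as 20200315.
# Leading zeros are stripped so the shell does not read 08 or 09 as octal.
date_key() (
  IFS=- read -r year month day <<EOF
$1
EOF
  echo $(( year * 10000 + ${month#0} * 100 + ${day#0} ))
)

# Succeeds when the first date is on or before the second.
valid_range() {
  [ "$(date_key "$1")" -le "$(date_key "$2")" ]
}
```

For example: valid_range 2020-3-15 2020-3-18 && yarn timeseries -d 2020-3-15 -e 2020-3-18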
Run yarn options to see the command-line options, for example:
Options:
--version Show version number [boolean]
--date, -d Generate data for or start the timeseries at the provided
date in YYYY-M-D format [string]
...
We use Tape for tests.
# Run all tests
yarn test
# Run a single test file
node path/to/file.js
To build the website and start a development server at http://localhost:3000/:
yarn dev
To build the latest data, a full timeseries, and the website:
yarn build
To build only the website for production:
yarn buildSite