ECCO Dataset Production is a toolset that supports NASA's Open Science initiative by making ECCO's multidecadal, physically- and statistically-consistent ocean state estimates available in NetCDF format.
In so doing, it transforms raw MITgcm-generated results into ordered collections of date- and time-stamped files, in native and lon/lat grid formats, for use by the broader scientific research community.
ECCO Dataset Production can run either locally or in the cloud, the latter mode in regular use by the ECCO group to generate the multi-terabyte datasets available through the Physical Oceanography Distributed Active Archive Center (PO.DAAC) and NASA's Earthdata ESDIS Project.
Much of the core computation in ECCO Dataset Production is provided by xmitgcm, ECCOv4-py, and the cloud utilities package from ECCO-ACCESS.
To this, ECCO Dataset Production adds workflow automation, packaging, and utilities suitable for both local (i.e., custom dataset) and cloud-based (i.e., multi-terabyte) production and distribution.
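As a rough sketch of the kind of operation these underlying libraries provide (and not the package's actual pipeline code), the following Python example uses xmitgcm to read raw MITgcm binary output into an xarray Dataset and write it out as NetCDF; the paths, variable prefix, and output filename are placeholders.

import xmitgcm

# Read raw MITgcm .data/.meta output into an xarray Dataset. The directory
# paths, variable prefix, and iteration selection below are placeholders.
ds = xmitgcm.open_mdsdataset(
    data_dir='/path/to/mitgcm/diags',   # raw MITgcm diagnostic output
    grid_dir='/path/to/mitgcm/grid',    # model grid files
    prefix=['THETA'],                   # variable(s) to read
    iters='all',                        # all available model iterations
    geometry='llc',                     # ECCO's native lat-lon-cap grid
)

# xmitgcm returns an xarray Dataset, which writes directly to NetCDF.
ds.to_netcdf('THETA_native.nc')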
ECCO Dataset Production can be pip-installed like any other Python
package. Just clone the repo, cd to the top-level directory, and
install:
$ git clone https://github.com/ECCO-GROUP/ECCO-Dataset-Production.git
$ cd ECCO-Dataset-Production
$ pip install .
Dockerfiles, Docker Compose files, and automation scripts have also
been included to support local and AWS-targeted, container-based
solutions. See ./docker/README.md for details.
ECCO Dataset Production exposes several command-line scripts. Two of
the most important are edp_create_job_task_list, which creates an
explicit, JSON-formatted list of the NetCDF files to be produced, and
edp_generate_dataproducts, which reads this task list and generates
the resulting files. Command-line help is available via:
$ edp_create_job_task_list --help
$ edp_generate_dataproducts --help
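Because the task list written by edp_create_job_task_list is plain JSON, it can be inspected with standard tools before being handed to edp_generate_dataproducts. The sketch below assumes a previously generated task list named tasks.json whose top-level structure is a JSON array; the contents of each task entry are defined by the package and are not shown here.

import json

# Inspect a previously generated task list (the filename is a placeholder,
# and the top-level JSON array structure is an assumption for illustration).
with open('tasks.json') as f:
    tasks = json.load(f)

print(f'{len(tasks)} dataset production tasks queued')
print(json.dumps(tasks[0], indent=2))   # examine one task entry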
Test/demonstration examples illustrating dataset production in local
and cloud-based modes are in ./demos. To run them, you'll need to
initialize the ECCO-v4-Configurations submodule (unless
ECCO-Dataset-Production was originally cloned with the
--recurse-submodules option):
$ git submodule init
$ git submodule update
./demos/native_latlon_local is a useful "getting started" example
illustrating generation of local NetCDF files from local input files,
with a discussion of problem setup, input formats, and job submission.
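After the demo completes, the resulting NetCDF granules can be spot-checked with xarray; the filename below is a placeholder for whatever the demo writes out.

import xarray as xr

# Open one of the demo's output granules (placeholder filename) and print
# its dimensions, coordinates, data variables, and global attributes.
ds = xr.open_dataset('path/to/demo/output/granule.nc')
print(ds)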
Initial dataset production iterations were the work of Ian Fenty, with subsequent prototype AWS Lambda cloud deployment by Ian Fenty and Duncan Bark. The current package is a significant update that includes production tools and scaling for AWS Batch-based cloud deployment, and has been implemented by Ian Fenty and Greg Moore ([email protected]). Release documentation generation tools are the work of Jose Gonzales and Odilon Houndegnonto.
Contributions and use case examples are always welcome! Please feel free to fork this repo and issue a pull request or contact the ECCO Group.