TERRA REF: Open access reference data and computing infrastructure for high throughput plant phenotyping
David S. LeBauer 1,2 Full list of authors Todd C Mockler n
- Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL, USA
- National Center for Supercomputing Applications, University of Illinois at Urbana-Champaign, Urbana, IL, USA
- n.
*Corresponding author(s): David LeBauer (dlebauer@illinois.edu)
TERRA REF is generating an open access reference dataset with unprecedented spatial, temporal, and genomic resolution. The data are generated by a field scanner and by aerial and ground-based sensing platforms, alongside traditional field methods used for calibration and validation. Data are collected at field sites in Maricopa, AZ and Ashland, KS and from a controlled-environment platform in St. Louis, MO. The Arizona field site hosts a large field scanner with fifteen sensors, many of which are capable of capturing millimeter-scale images and point clouds at daily to weekly intervals.
Phenotyping is focused on traits that drive crop yield gain, yield stability, and tolerance to environmental and biological stress. The study has evaluated a sorghum diversity panel, biparental cross populations, and elite lines and hybrids from structured breeding populations, as well as a durum wheat diversity panel. This reference dataset can be used to characterize phenotype-to-genotype associations on a genomic scale, which will enable knowledge-driven breeding and the development of higher-yielding cultivars of sorghum and wheat. The data are also being used to develop new algorithms for machine learning, image analysis, genomics, and zzzz.
These data are intended to be reused, and are accessible as a combination of files and databases linked by spatial, temporal, and genomic information. In addition to providing open access data, the entire computational pipeline is open source, and we enable users to access high performance computing environments.
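As a rough illustration of what "linked by spatial, temporal, and genomic information" could mean in practice, the sketch below joins trait records to sensor files on a shared plot and date, and to genotype information on a shared cultivar name. Every record, field name, file path, and the helper `link_records` is hypothetical and chosen only for illustration; none of it reflects the actual TERRA REF schema or APIs.

```python
from collections import defaultdict
from datetime import date

# Hypothetical example records; field names and values are illustrative only.
trait_records = [
    {"plot": "P1", "date": date(2017, 6, 1), "cultivar": "C1",
     "trait": "canopy_height_cm", "value": 42.0},
    {"plot": "P1", "date": date(2017, 6, 8), "cultivar": "C1",
     "trait": "canopy_height_cm", "value": 61.5},
    {"plot": "P2", "date": date(2017, 6, 1), "cultivar": "C2",
     "trait": "canopy_height_cm", "value": 38.0},
]
sensor_files = [
    {"plot": "P1", "date": date(2017, 6, 1), "path": "example/P1_2017-06-01.tif"},
    {"plot": "P2", "date": date(2017, 6, 1), "path": "example/P2_2017-06-01.tif"},
]
genotypes = {"C1": {"marker_1": "A/A"}, "C2": {"marker_1": "A/G"}}


def link_records(traits, files, genotype_table):
    """Attach sensor files (by plot and date) and genotype (by cultivar) to each trait record."""
    # Index files by their spatial (plot) and temporal (date) key.
    files_by_key = defaultdict(list)
    for f in files:
        files_by_key[(f["plot"], f["date"])].append(f["path"])

    linked = []
    for rec in traits:
        key = (rec["plot"], rec["date"])
        linked.append({
            **rec,
            "files": files_by_key.get(key, []),  # empty list if no file matches
            "genotype": genotype_table.get(rec["cultivar"]),  # None if unknown
        })
    return linked


linked = link_records(trait_records, sensor_files, genotypes)
```

In this sketch, a trait record without a matching file on the same plot and date simply receives an empty file list, so no trait observation is dropped by the join.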
Will draw from and reference documentation + algorithm README files
cite Andrade-Sanchez + TERRA REF documentation
All code is available on GitHub and archived on Zenodo
Table columns: Data Product, Input, Output, GitHub repository, version, Zenodo DOI
workbench.terraref.org github.com/terraref/tutorials
Spreadsheet w/ data descriptor tables
This work was funded by the Advanced Research Projects Agency-Energy (ARPA-E), U.S. Department of Energy, under Award Number DE-AR0000594; the Blue Waters sustained-petascale computing project, which is supported by the National Science Foundation (awards OCI-0725070 and ACI-1238993) and the state of Illinois and is a joint effort of the University of Illinois at Urbana-Champaign and its National Center for Supercomputing Applications; and the ROGER supercomputer, which was supported by NSF grant number 1429699. XSEDE Comet, Bridges, ECSS
DSL did this and that. TCM did this and that and the other.