NASA Climate Projections

Description

This repository houses the scripts used to process the climate projection data on a 25 sq km resolution. The daily time series data for 3 different climate variables (precipitation, minimum temperature, maximum temperature) exist in the form of NetCDF files. A single file contains the daily data for one climate variable over one year for one model/emissions scenario combination. Each time series is associated with the lat/lon coordinates of the centroid of the spatial grid cell. The 1950 - 2005 time period represents the "historical" period of the climate model while the 2006 - 2100 time period represents the projected period.

Climate NetCDF files are aggregated by model/scenario, creating a single CSV containing all climate variables over the entire time series. Currently there are 21 total climate/scenario combinations processed. Each CSV is then loaded into the "nasa" PostgreSQL database on osensei as a new table named for the model. Each table has 5 columns:

cellid - The unique climate grid cell ID
date - The date associated with climate record
pr - Daily precipition (mm)
tasmax - Maximum daily temperature (degrees C)
tasmin - Minimum daily temperature (degrees C)

cellid	date	pr	tasmax	tasmin
733	1/1/1950	0.327715	7.72772	3.26541
732	1/1/1950	0.268319	7.80816	3.23666
739	1/1/1950	0.234809	8.15731	3.26116
738	1/1/1950	0.202288	8.52435	3.10851
744	1/1/1950	0.171578	8.95794	3.20104

Table 1 - Climate table example

A separate spatial relationship table is created to match the catchment FEATUREIDs from the NHDHRDV2. Catchments are paired with the "cellid" of the grid cell that their centroid falls within. The "cross_grid" table has 4 columns:

featureid - The unique catchment ID
cellid - The unique climate grid cell ID
lat - The latitude of the climate grid cell centroid
lon - The longitude of the climate grid cell centroid

featureid	lon	lat	cellid
201407661	-68.375	48.125	3428
201407662	-68.375	48.125	3428
201407707	-68.625	48.125	3430
201407728	-68.375	47.875	3429
201407770	-68.625	47.875	118

Table 2 - Cross grid table example

The cross_grid table may be used with the individual climate time series tables to query the records assigned to specific catchments (see the "Sample Query" section at the bottom of this page). Users of the data should be aware of the distinction between "historical" and projected data described above.

Data Source

The climate data originates from the NASA Earth Exchange Global Daily Downscaled Climate Projections dataset. The raw projections have been downscaled and clipped to the NHDHRDV2 study area by Ambarish Karmalkar in the Northeast Climate Science Center at UMass Amherst. The subset of the orignal climate data serves as input to the processing steps herein.

Create Climate Tables

The processing scripts are structured to handle a number of models with the same spatial coverage. The NetCDF files for each model/scenario are stored in a single directory. All model/scenario folders are stored in the same parent directory. This storage format allows the script to loop through each folder separately and access all of its NetCDF files. The output CSV is given the same name as the model/scenario folder, which is consistent with the table name in the database.

The scripts in this section are stored in the create_records sub-directory.

1) Create spatial grid

Description

The spatial grid associated with the climate models is extracted from a sample NetCDF file using the extractGrid.R script. This script exports the "gridCentroids.csv" file to be used in determining spatial relationships with the catchments. The spatial grid is consistent across all NetCDF files for all model/scenario combinations. For this reason, it is only necessary to use a single NetCDF file to define the universally applied climate grid cells.

Steps to execute

To run, open the extractGrid.R script and set the following variables in the "Specify inputs" section:

netcdfPath - the file path to the sample NetCDF file defining the climate grid
outputDirectory - the directory where the spatial grid table will be written

The script is then executed in R.

Output

The output table is named gridCentroids.csv and has two columns for "lat" and "lon" coordinates. Unique IDs are assigned later.

2) Establish spatial relationships

Description

A table relating the catchments to climate grid is developed. The spatial grid created in step one is used along with the catchments spatial layer to generate the output table. The relationships are determined by where in the climate grid the catchment centroids fall. In this process, a unique "cellID" is assigned to each of the climate grid cells. Each unique catchment "FEATUREID" is assigned a climate grid "cellID".

Steps to execute

To run, open the spatialReleationships.py script and set the following variables in the "Specify inputs" section.

catchments - the filepath to the spatial catchments layer (shapefile or geodatabase feature)
gridPointsTable - the filepath to the spatial grid table created in the previous step
outputDirectory - the path to the directory where the relationship table will be written

The script is then executed in Arc Python.

Output

The catchmentsGridTable.dbf is ouput to the specified "outputDirectory". This table has columns for the unique IDs for the catchments ("FEATUREID") and grid cells ("cellID") along with the "lat" and "lon" of the grid cell centroid.

3) Convert Table Format

Description

When directly outputting a CSV from ArcGIS, column formatting is problematic due to the presence of commas in large integers. This short script is intended to reformat the table from DBF to CSV for upload to the database.

Steps to execute

To run, open the spatialGridFormat.R script and set the following variables in the "Specify inputs" section.

baseDirectory - the path to the directory containing the catchmentsGridTable.dbf output by the previous script

The script is then executed in R.

Output

The catchmentsGridTable.csv is ouput to the same directory as the DBF version of the same table. The columns are the same as the DBF table.

4) Aggregate Records

Description

Each model/emissions scenario has around 450 raw NetCDF files for the historic and prediction periods. There is one file per year for each of 3 climate variables. This script loops through each of these files aggregating the records into one table for output. The table format is a daily time series of each of the 3 climate variables for each grid cell as shown in Table 1. The records are only kept in the aggregation if the climate grid cell overlaps with a part of the catchments layer polygon. The script is designed to loop through and aggregate climate records for all of the model scenario folders in the parent directory.

Steps to execute

To run, open the aggregateRecords.R script and set the following variables in the "Specify inputs" section.

sourceDirectory - The file path to the parent directory containing all of the model scenario folders.
variables - A list of the climate variables to process.
gridTable - The file path to the DBF table, created in step 2, identifying which climate grid cell each catchment is falls into.
outputDirectory - The directory where the CSV files of aggregated climate records for each model will be written.

The script is then executed in R.

Output

One CSV file for each model/scenario ID (taken from the folder names) will be created. Table 1 describes the format of each of the files.

Create Climate Projections Database

Output

The cross_grid table is created in the "public" schema.

6) Sample Query

Description

A sample query is written in SQL to test the time to return records and provide users with a script to work with for pulling records. Estimates of current times to return climate records for the entire time period are listed in Table 3.

Number of FeatureIDs	Time To Run
10	2-5 seconds
100	10 - 15 seconds
1000	1.5 - 4 minutes

Table 3 - Query Run Times

Steps to execute

Execute the sample_query.sql script in postgres.

Output

The "select_records" table is a time series of climate records for the specified featureids. This table is the same as Table 1 with one new row indicating featureid.

Assumptions & Notes

The spatial grid related to the climate date is identical across all variable/year combinations for all model/scenario combinations.
In models that do not account for leap years, February 29th is the date ommited. This is accomplished by identifying the number of days in known leap years and then altering the calendar if it is a day short.
Some models only go up to 2099.

Contact Info

Kyle O'Neil
koneil@usgs.gov

Name		Name	Last commit message	Last commit date
Latest commit History 9 Commits
scripts		scripts
.gitignore		.gitignore
Climate Projections Workflow.pdf		Climate Projections Workflow.pdf
README.md		README.md
README_for_source_data.txt		README_for_source_data.txt

Folders and files

Latest commit

History

Repository files navigation

NASA Climate Projections

Description

Data Source

Create Climate Tables

1) Create spatial grid

Description

Steps to execute

Output

2) Establish spatial relationships

Description

Steps to execute

Output

3) Convert Table Format

Description

Steps to execute

Output

4) Aggregate Records

Description

Steps to execute

Output

Create Climate Projections Database

1) Create Database

Description

Steps to execute

Output

2) Create Group Roles

Description

Steps to execute

Output

3) Grant Roles to Users

Description

Steps to execute

Output

4) Upload Climate Tables

Description

Steps to execute

Output

5) Create Cross Grid Table

Description

Steps to execute

Output

6) Sample Query

Description

Steps to execute

Output

Assumptions & Notes

Contact Info

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages