Skip to content

Commit 2c7ffd9

Browse files
seamanticsciencesimone-silvestriglwagner
authored
Read ECCO-Darwin fields from NASA Earthdata for initial conditions and restoring (#328)
* Use ECCODarwin (binary) data to initialize ClimaOcean runs * add `MeshArrays` dependency * add `scale_factor` to allow conversion of `ECCODarwin` units to `ClimaOceanBiogeochemistry` units. * avoid `Method overwriting` warning by removing duplicate `metadata_filename` function * Fix another `method overwrite` warning * review suggestions * Refactor ECCO Darwin code * refine inpainting * apply mask on the native grid in `ECCO_darwin_model_data`. * Add facility for ECCO-Darwin BGC restoring * Add ECCO restoring bgc fields to `oceananigans_fieldnames` * Revisit `MeshArrays` usage working with native LLC grid model output * Updates due to data wrangling refactoring * `ECCO_restoring` included twice during merge * version => dataset in `ECCOMetadata` * Generalize ECCO-Darwin native grid to lat-lon grid interpolation using MeshArrays * Reminder to check ECCO2-Darwin origin * OWSP update * Fix some merge issues * Fix issues with `location` and `variable_is_three_dimensional` for `ECCO4Monthly` and `ECCO4DarwinMonthly` * Testing! * add back "retrieve_data" function that differs for ECCO Darwin compared to EN4 and ECCO4 * fix `retrieve_data` and tests * fix newer conflicts * Add `ECCO4DarwinMonthly` to `test_downloading` * actually add `ECCO4DarwinMonthly` to `test_downloading` * improve performance by reusing ECCO-Darwin native grid interpolation factors * add ECCO4DarwinMonthly phosphate download * Add `Glob` dependency * Update ECCO_darwin.jl * change `SomeECCODataset` to abstract type and recast `ECCO2Monthly`, `ECCO2Daily` and `ECCO4Monthly` * downloading ECCO2 and ECCO4 Darwin monthly data from NASA Earthdata * move "NetCDF shenanigans" to a seperate `retrieve_data` function * modify testing routines * `nan_convert_missing` appears necessary in `_set_2d_metadata_field!` to pass `EN4` tests * `ECCO4DarwinMonthly` testing * add ECCO2Darwin testing * bug fix in `retrieve_data` * add `default_masked_value` * need to mask ECCODarwin output _after_ scaling/offset applied to avoid inpainting failure * rearrange a bit with dicts nearer the top * Apply suggestions from @simone-silvestri code review Co-authored-by: Simone Silvestri <[email protected]> * Apply suggestions from code review Co-authored-by: Simone Silvestri <[email protected]> * tweak some presentation * oops, forgot to flip the mask in the vertical. * reinstate `partially_inpainted_interior` test * include `ECCO2DarwinMonthly` and `ECCO4DarwinMonthly` in `ClimaOcean` module * add `metadata_time_step` and `metadata_epoch` functions * comment on `ECCO_darwin_scale_factor` and `ECCO_darwin_offset_factor` * Update `retrieve_data` docstring Co-authored-by: Gregory L. Wagner <[email protected]> * update `retrieve_data` docstring again * generalize `ECCO_darwin_native_grid` -> `binary_data_grid` and `ECCO_darwin_native_size` -> `binary_data_size`. * change bgc names, e.g. `DIC` -> `:dissolved_inoganic_carbon` and `DissolvedInorganicCarbon()` * rename `SomeECCODataset` -> `ECCODataset` * update `test_names` for `ECCO4DarwinMonthly` and `ECCO2DarwinMonthly` . * Better unit conversion handled by `concentration_units` and applied in `metadata_field` similar to `:temperature` * tweak ECCODarwin `concentration_units` * `concentration_units` docstring * `concentration_units` docstring * update subscripts in silicate and oxygen fieldnames * update `DatasetRestoring` docstring with bgc tracers --------- Co-authored-by: Simone Silvestri <[email protected]> Co-authored-by: Gregory L. Wagner <[email protected]>
1 parent 714ea95 commit 2c7ffd9

File tree

13 files changed

+521
-125
lines changed

13 files changed

+521
-125
lines changed

Project.toml

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -14,10 +14,12 @@ DataDeps = "124859b0-ceae-595e-8997-d05f6a7a8dfe"
1414
Dates = "ade2ca70-3891-5945-98fb-dc099432e06a"
1515
DocStringExtensions = "ffbed154-4ef7-542d-bbb7-c09d3a79fcae"
1616
Downloads = "f43a241f-c20a-4ad4-852c-f6b1247861c6"
17+
Glob = "c27321d9-0574-5035-807b-f59d2c89b15c"
1718
ImageMorphology = "787d08f9-d448-5407-9aad-5290dd7ab264"
1819
JLD2 = "033835bb-8acc-5ee8-8aae-3f567f8a3819"
1920
KernelAbstractions = "63c18a36-062a-441e-b654-da1e3ab1ce7c"
2021
MPI = "da04e1cc-30fd-572f-bb4f-1f8673147195"
22+
MeshArrays = "cb8c808f-1acf-59a3-9d2b-6e38d009f683"
2123
NCDatasets = "85f8d34a-cbdd-5861-8df4-14fed0d494ab"
2224
Oceananigans = "9e8cae18-63c1-5223-a75c-80ca9d6e9a09"
2325
OffsetArrays = "6fe1bfb0-de20-5000-8ca7-80f57d26f881"

src/ClimaOcean.jl

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -30,6 +30,7 @@ export
3030
EN4Metadatum,
3131
ETOPO2022,
3232
ECCO2Daily, ECCO2Monthly, ECCO4Monthly,
33+
ECCO2DarwinMonthly, ECCO4DarwinMonthly,
3334
EN4Monthly,
3435
GLORYSDaily, GLORYSMonthly, GLORYSStatic,
3536
RepeatYearJRA55, MultiYearJRA55,

src/DataWrangling/DataWrangling.jl

Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,7 @@
11
module DataWrangling
22

33
export Metadata, Metadatum, ECCOMetadatum, EN4Metadatum, all_dates, first_date, last_date
4+
export metadata_time_step, metadata_epoch
45
export LinearlyTaperedPolarMask
56
export DatasetRestoring
67

@@ -183,6 +184,8 @@ function longitude_interfaces end
183184
function latitude_interfaces end
184185
function reversed_vertical_axis end
185186
function native_grid end
187+
function binary_data_grid end
188+
function binary_data_size end
186189

187190
default_mask_value(dataset) = NaN
188191

@@ -193,6 +196,9 @@ include("metadata_field_time_series.jl")
193196
include("inpainting.jl")
194197
include("restoring.jl")
195198

199+
function metadata_time_step end
200+
function metadata_epoch end
201+
196202
# Only temperature and salinity need a thorough inpainting because of stability,
197203
# other variables can do with only a couple of passes. Sea ice variables
198204
# cannot be inpainted because zeros in the data are physical, not missing values.

src/DataWrangling/ECCO/ECCO.jl

Lines changed: 29 additions & 16 deletions
Original file line numberDiff line numberDiff line change
@@ -3,6 +3,9 @@ module ECCO
33
export ECCOMetadatum, ECCO_immersed_grid, adjusted_ECCO_tracers, initialize!
44
export ECCO2Monthly, ECCO4Monthly, ECCO2Daily
55

6+
export ECCO2DarwinMonthly, ECCO4DarwinMonthly
7+
export retrieve_data
8+
69
using Oceananigans
710
using ClimaOcean
811
using NCDatasets
@@ -18,6 +21,8 @@ using ClimaOcean.DataWrangling:
1821
BoundingBox,
1922
metadata_path,
2023
Celsius,
24+
GramPerKilogramMinus35,
25+
MicromolePerLiter,
2126
Metadata,
2227
Metadatum,
2328
download_progress
@@ -34,6 +39,7 @@ import ClimaOcean.DataWrangling:
3439
metadata_filename,
3540
download_dataset,
3641
temperature_units,
42+
concentration_units,
3743
dataset_variable_name,
3844
metaprefix,
3945
longitude_interfaces,
@@ -43,18 +49,23 @@ import ClimaOcean.DataWrangling:
4349
inpainted_metadata_path,
4450
reversed_vertical_axis,
4551
default_mask_value,
46-
available_variables
52+
available_variables,
53+
retrieve_data,
54+
binary_data_grid,
55+
binary_data_size
4756

4857
download_ECCO_cache::String = ""
4958
function __init__()
5059
global download_ECCO_cache = @get_scratch!("ECCO")
5160
end
5261

5362
# Datasets
54-
struct ECCO2Monthly end
55-
struct ECCO2Daily end
56-
struct ECCO4Monthly end
57-
const SomeECCODataset = Union{ECCO2Monthly, ECCO4Monthly, ECCO2Daily}
63+
abstract type ECCODataset end
64+
struct ECCO2Monthly <:ECCODataset end
65+
struct ECCO2Daily <:ECCODataset end
66+
struct ECCO4Monthly <:ECCODataset end
67+
68+
include("ECCO_darwin.jl")
5869

5970
function default_download_directory(::ECCO2Monthly)
6071
path = joinpath(download_ECCO_cache, "v2", "monthly")
@@ -75,24 +86,26 @@ Base.size(::ECCO2Daily, variable) = (1440, 720, 50)
7586
Base.size(::ECCO2Monthly, variable) = (1440, 720, 50)
7687
Base.size(::ECCO4Monthly, variable) = (720, 360, 50)
7788

78-
temperature_units(::SomeECCODataset) = Celsius()
89+
temperature_units(::ECCODataset) = Celsius()
7990
default_mask_value(::ECCO4Monthly) = 0
80-
reversed_vertical_axis(::SomeECCODataset) = true
91+
reversed_vertical_axis(::ECCODataset) = true
8192

8293
const ECCO2_url = "https://ecco.jpl.nasa.gov/drive/files/ECCO2/cube92_latlon_quart_90S90N/"
8394
const ECCO4_url = "https://ecco.jpl.nasa.gov/drive/files/Version4/Release4/interp_monthly/"
8495

8596
# The whole range of dates in the different dataset datasets
86-
all_dates(dataset::SomeECCODataset) = all_dates(dataset, nothing)
87-
all_dates(::ECCO4Monthly, variable) = DateTime(1992, 1, 1) : Month(1) : DateTime(2017, 12, 1)
88-
all_dates(::ECCO2Monthly, variable) = DateTime(1992, 1, 1) : Month(1) : DateTime(2024, 12, 1)
89-
all_dates(::ECCO2Daily, variable) = DateTime(1992, 1, 1) : Day(1) : DateTime(2024, 12, 31)
97+
metadata_epoch(::ECCODataset) = DateTime(1992, 1, 1)
98+
99+
all_dates(dataset::ECCODataset) = all_dates(dataset, nothing)
100+
all_dates(dataset::ECCO4Monthly, variable) = metadata_epoch(dataset) : Month(1) : DateTime(2017, 12, 1)
101+
all_dates(dataset::ECCO2Monthly, variable) = metadata_epoch(dataset) : Month(1) : DateTime(2024, 12, 1)
102+
all_dates(dataset::ECCO2Daily, variable) = metadata_epoch(dataset) : Day(1) : DateTime(2024, 12, 31)
90103

91-
longitude_interfaces(::SomeECCODataset) = (0, 360)
104+
longitude_interfaces(::ECCODataset) = (0, 360)
92105
longitude_interfaces(::ECCO4Monthly) = (-180, 180)
93-
latitude_interfaces(::SomeECCODataset) = (-90, 90)
106+
latitude_interfaces(::ECCODataset) = (-90, 90)
94107

95-
z_interfaces(::SomeECCODataset) = [
108+
z_interfaces(::ECCODataset) = [
96109
-6128.75,
97110
-5683.75,
98111
-5250.25,
@@ -193,8 +206,8 @@ ECCO_location = Dict(
193206
:downwelling_longwave => (Center, Center, Nothing),
194207
)
195208

196-
const ECCOMetadata{D} = Metadata{<:SomeECCODataset, D}
197-
const ECCOMetadatum = Metadatum{<:SomeECCODataset}
209+
const ECCOMetadata{D} = Metadata{<:ECCODataset, D}
210+
const ECCOMetadatum = Metadatum{<:ECCODataset}
198211

199212
"""
200213
ECCOMetadatum(name;
Lines changed: 179 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,179 @@
1+
using MeshArrays
2+
using JLD2
3+
using Glob
4+
5+
struct ECCO2DarwinMonthly <:ECCODataset end
6+
struct ECCO4DarwinMonthly <:ECCODataset end
7+
8+
# URLs for the ECCO datasets specific to each version
9+
const ECCO4Darwin_url = "https://ecco.jpl.nasa.gov/drive/files/ECCO2/LLC90/ECCO-Darwin/"
10+
const ECCO2Darwin_url = "https://ecco.jpl.nasa.gov/drive/files/ECCO2/LLC270/ECCO-Darwin_extension/"
11+
12+
Base.size(data::Metadata{<:ECCO4DarwinMonthly}) = (720, 360, 50, length(data.dates))
13+
Base.size(::Metadatum{<:ECCO4DarwinMonthly}) = (720, 360, 50, 1)
14+
Base.size(data::Metadata{<:ECCO2DarwinMonthly}) = (1440, 720, 50, length(data.dates))
15+
Base.size(::Metadatum{<:ECCO2DarwinMonthly}) = (1440, 720, 50, 1)
16+
17+
metadata_time_step(::Metadatum{<:ECCO4DarwinMonthly}) = 3600
18+
metadata_epoch(::Metadatum{<:ECCO4DarwinMonthly}) = DateTime(1992, 1, 1, 12, 0, 0)
19+
20+
metadata_time_step(::Metadatum{<:ECCO2DarwinMonthly}) = 1200
21+
metadata_epoch(::Metadatum{<:ECCO2DarwinMonthly}) = DateTime(1992, 1, 1, 0, 0, 0)
22+
23+
# The whole range of dates in the different dataset datasets
24+
all_dates(dataset::ECCO4DarwinMonthly, name) = metadata_epoch(dataset) : Month(1) : DateTime(2023, 3, 1)
25+
all_dates(dataset::ECCO2DarwinMonthly, name) = metadata_epoch(dataset) : Month(1) : DateTime(2025, 5, 1)
26+
27+
# File name generation specific to each Dataset dataset
28+
"""
29+
metadata_filename(metadata::Metadatum{<:Union{ECCO2DarwinMonthly, ECCO4DarwinMonthly}})
30+
31+
Generate the filename for a given ECCO Darwin dataset and date.
32+
33+
The filename is constructed using the dataset variable name, and the iteration number is calculated
34+
from the date and epoch.
35+
"""
36+
function metadata_filename(metadata::Metadatum{<:Union{ECCO2DarwinMonthly, ECCO4DarwinMonthly}})
37+
shortname = dataset_variable_name(metadata)
38+
39+
reference_date = metadata_epoch(metadata)
40+
timestep_size = metadata_time_step(metadata)
41+
42+
# Explicitly convert to Int to avoid return of a float
43+
iternum = Int(Dates.value((metadata.dates - reference_date) / (timestep_size * 1e3)))
44+
iterstr = string(iternum, pad=10)
45+
46+
return shortname * "." * iterstr * ".data"
47+
end
48+
49+
# Convenience functions
50+
default_mask_value(::ECCO4DarwinMonthly) = 0
51+
default_mask_value(::ECCO2DarwinMonthly) = 0
52+
53+
dataset_variable_name(data::Metadata{<:Union{ECCO2DarwinMonthly,ECCO4DarwinMonthly}}) = ECCO_darwin_dataset_variable_names[data.name]
54+
55+
location(::Metadata{<:Union{ECCO2DarwinMonthly, ECCO4DarwinMonthly}}) = (Center, Center, Center)
56+
57+
variable_is_three_dimensional(::Metadata{<:Union{ECCO2DarwinMonthly, ECCO4DarwinMonthly}}) = true
58+
59+
ECCO_darwin_dataset_variable_names = Dict(
60+
:temperature => "THETA",
61+
:salinity => "SALTanom",
62+
:dissolved_inorganic_carbon => "DIC",
63+
:alkalinity => "ALK",
64+
:phosphate => "PO4",
65+
:nitrate => "NO3",
66+
:dissolved_organic_phosphorus => "DOP",
67+
:particulate_organic_phosphorus => "POP",
68+
:dissolved_iron => "FeT",
69+
:dissolved_silicate => "SiO2",
70+
:dissolved_oxygen => "O2",
71+
)
72+
73+
"""
74+
concentration_units(metadatum::Metadatum{<:Union{ECCO2DarwinMonthly, ECCO4DarwinMonthly}})
75+
76+
Set up conversion from the ECCODarwin output data to standard units
77+
- salinity = SALTanom + 35
78+
- biogeochemical tracer concentrations are in uL => umol/L in the output files from Darwin
79+
"""
80+
function concentration_units(metadatum::Metadatum{<:Union{ECCO2DarwinMonthly, ECCO4DarwinMonthly}})
81+
if dataset_variable_name(metadatum) == "SALTanom"
82+
return GramPerKilogramMinus35()
83+
elseif dataset_variable_name(metadatum) != "THETA"
84+
return MicromolePerLiter()
85+
else
86+
return nothing
87+
end
88+
end
89+
90+
function default_download_directory(::ECCO4DarwinMonthly)
91+
path = joinpath(download_ECCO_cache, "v4_darwin", "monthly")
92+
return mkpath(path)
93+
end
94+
95+
function default_download_directory(::ECCO2DarwinMonthly)
96+
path = joinpath(download_ECCO_cache, "v2_darwin", "monthly")
97+
return mkpath(path)
98+
end
99+
100+
metadata_url(m::Metadata{<:ECCO4DarwinMonthly}) = ECCO4Darwin_url * "monthly/" * dataset_variable_name(m) * "/" * metadata_filename(m)
101+
metadata_url(m::Metadata{<:ECCO2DarwinMonthly}) = ECCO2Darwin_url * "monthly/" * dataset_variable_name(m) * "/" * metadata_filename(m)
102+
103+
# Functions for reading the ECCO binary files using MeshArrays
104+
binary_data_grid(::ECCO4DarwinMonthly) = GridSpec(ID=:LLC90)
105+
binary_data_size(::ECCO4DarwinMonthly) = (90, 1170, 50)
106+
binary_data_grid(::ECCO2DarwinMonthly) = GridSpec(ID=:LLC270)
107+
binary_data_size(::ECCO2DarwinMonthly) = (270, 3510, 50)
108+
109+
longitude_interfaces(::ECCO4DarwinMonthly) = (-180, 180)
110+
111+
"""
112+
retrieve_data(metadata::Metadatum{<:ECCO4DarwinMonthly})
113+
114+
Read a ECCO4DarwinMonthly data file and regrid using MeshArrays on to regular lat-lon grid
115+
"""
116+
function retrieve_data(metadata::Metadatum{<:Union{ECCO4DarwinMonthly, ECCO2DarwinMonthly}})
117+
native_size = binary_data_size(metadata.dataset)
118+
native_grid = binary_data_grid(metadata.dataset)
119+
native_data = zeros(Float32, prod(native_size)) # Native LLC grid at precision of the input binary file
120+
121+
read!(metadata_path(metadata), native_data)
122+
native_data = bswap.(native_data)
123+
124+
meshed_data = read(reshape(native_data, native_size...), native_grid)
125+
Nx, Ny, Nz, _ = size(metadata)
126+
data = zeros(Float32, Nx, Ny, Nz) # Native LLC grid at precision of the input binary file
127+
mask = zeros(Float32, Nx, Ny, Nz)
128+
129+
# Download the native grid data from MeshArrays repo (only if not in already in datadeps)
130+
native_grid_coords = GridLoad(native_grid; option="full")
131+
132+
# Check if the interpolation coefficients are already calculated
133+
interp_file = joinpath(dirname(metadata_path(metadata)),"native_interp_coeffs.jld2")
134+
if !isfile(interp_file)
135+
# Calculate coefficients to interpolate from native grid to regular lat-lon grid (as in the ECCO netcdf files)
136+
resolution_X = 360/Nx
137+
resolution_Y = 180/Ny
138+
139+
# Regular lat-lon grid
140+
longitudes = longitude_interfaces(metadata.dataset)
141+
latitudes = latitude_interfaces(metadata.dataset)
142+
lon = [i for i = longitudes[1]+resolution_X/2:resolution_X:longitudes[2]-resolution_X/2,
143+
j = latitudes[1]+resolution_Y/2:resolution_Y:latitudes[2]-resolution_Y/2]
144+
lat = [j for i = longitudes[1]+resolution_X/2:resolution_X:longitudes[2]-resolution_X/2,
145+
j = latitudes[1]+resolution_Y/2:resolution_Y:latitudes[2]-resolution_Y/2]
146+
147+
# Interpolation factors for the native grid (writes out to a file "interp_file" for later use)
148+
coeffs = interpolation_setup(; Γ=native_grid_coords, lat, lon, filename=interp_file)
149+
else
150+
# Read the coefficients from the file that was previously calculated
151+
coeffs = interpolation_setup(interp_file)
152+
end
153+
154+
# Read continental mask on the native model grid
155+
native_grid_fac_center = GridLoadVar("hFacC", native_grid)
156+
157+
# Interpolate each masked layer on to the native lat lon grid
158+
for k in 1:Nz
159+
i, j, c = MeshArrays.Interpolate(
160+
meshed_data[:, k],
161+
coeffs,
162+
)
163+
data[:, :, k] = c
164+
i, j, c = MeshArrays.Interpolate(
165+
land_mask(native_grid_fac_center[:, k]),
166+
coeffs,
167+
)
168+
mask[:, :, k] = c
169+
end
170+
171+
# Reverse the z-axis
172+
data = reverse(data, dims=3)
173+
mask = reverse(mask, dims=3)
174+
175+
# Fill NaNs in Antarctica with zeros
176+
data[isnan.(data)] .= default_mask_value(metadata.dataset)
177+
178+
return data .* mask
179+
end

src/DataWrangling/metadata.jl

Lines changed: 14 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -261,6 +261,20 @@ struct Kelvin end
261261

262262
temperature_units(metadata) = Celsius()
263263

264+
struct MolePerKilogram end
265+
struct MolePerLiter end
266+
struct MillimolePerKilogram end
267+
struct MillimolePerLiter end
268+
struct MicromolePerKilogram end
269+
struct MicromolePerLiter end
270+
struct NanomolePerKilogram end
271+
struct NanomolePerLiter end
272+
273+
struct GramPerKilogramMinus35 end # Salinity anomaly
274+
struct MilliliterPerLiter end # Sometimes for disssolved_oxygen
275+
276+
concentration_units(metadata) = nothing
277+
264278
#####
265279
##### Utilities
266280
#####

0 commit comments

Comments
 (0)