Allow FieldTimeSeries to be read from NetCDF without reading the original architecture #5271

Open
tomchor wants to merge 6 commits into main from tc/fix-FTS-netcdf-arch
Collaborator

@tomchor tomchor commented Feb 10, 2026

On main, reading a GPU-generated NetCDF file into a FieldTimeSeries on a machine without a GPU throws the following error:

ERROR: MethodError: no method matching GPU()
The type `GPU` exists, but no method is defined for this combination of argument types when trying to construct it.

Closest candidates are:
  GPU(::D) where D
   @ Oceananigans ~/repos/Oceananigans.jl/src/Architectures.jl:45

Stacktrace:
  [1] top-level scope
    @ none:1
  [2] eval(m::Module, e::Any)
    @ Core ./boot.jl:489
  [3] materialize_from_netcdf
    @ ~/repos/Oceananigans.jl/ext/OceananigansNCDatasetsExt/utils.jl:121 [inlined]
  [4] (::OceananigansNCDatasetsExt.var"#materialize_from_netcdf##0#materialize_from_netcdf##1")(::Pair{String, String})
    @ OceananigansNCDatasetsExt ./none:-1
  [5] iterate
    @ ./generator.jl:48 [inlined]
  [6] grow_to!(dest::OrderedCollections.OrderedDict{…}, itr::Base.Generator{…})
    @ Base ./abstractdict.jl:595
  [7] dict_with_eltype(DT_apply::OrderedCollections.var"#OrderedDict##0#OrderedDict##1", kv::Base.Generator{…}, t::Type)
    @ Base ./abstractdict.jl:649
  [8] OrderedDict
    @ ~/.julia/packages/OrderedCollections/Xihhq/src/ordered_dict.jl:72 [inlined]
  [9] materialize_from_netcdf
    @ ~/repos/Oceananigans.jl/ext/OceananigansNCDatasetsExt/utils.jl:118 [inlined]
 [10] |>
    @ ./operators.jl:972 [inlined]
 [11] reconstruct_grid(ds::NCDataset{Nothing, Missing})
    @ OceananigansNCDatasetsExt ~/repos/Oceananigans.jl/ext/OceananigansNCDatasetsExt/grid_reconstruction.jl:318
 [12] FieldTimeSeries(typed_path::Oceananigans.OutputReaders.NetCDFPath, name::String; backend::InMemory{…}, architecture::CPU, grid::Nothing, location::Nothing, boundary_conditions::Oceananigans.OutputReaders.UnspecifiedBoundaryConditions, time_indexing::Oceananigans.OutputReaders.Linear, iterations::Nothing, times::Nothing, reader_kw::@NamedTuple{})
    @ OceananigansNCDatasetsExt ~/repos/Oceananigans.jl/ext/OceananigansNCDatasetsExt/output_readers.jl:44

This PR lets us bypass the error by passing an architecture keyword to reconstruct_grid(). If the user passes it, we never have to instantiate the architecture the file was originally written on.
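To illustrate the idea, here is a minimal, self-contained sketch of the override pattern. It does not use the actual Oceananigans code: the `CPU` and `GPU` types, `materialize`, and `reconstruction_args` below are hypothetical stand-ins, and the assumption is that the stored attributes are strings that get parsed and evaluated back into Julia objects.

```julia
# Hypothetical stand-ins for architecture types (not the Oceananigans ones).
struct CPU end
struct GPU{D}
    device::D   # like the real error above, `GPU()` with no argument has no method
end

# Assumed mechanism: stored attribute strings are parsed and evaluated.
materialize(str::AbstractString) = eval(Meta.parse(str))

# `architecture = nothing` means "use whatever the file recorded".
function reconstruction_args(attribs::AbstractDict; architecture = nothing)
    args = Dict{String, Any}()
    for (name, value) in attribs
        if name == "architecture" && !isnothing(architecture)
            args[name] = architecture       # the stored "GPU()" is never evaluated
        else
            args[name] = materialize(value)
        end
    end
    return args
end

attribs = Dict("architecture" => "GPU()", "size" => "(4, 4, 4)")
args = reconstruction_args(attribs; architecture = CPU())
@assert args["architecture"] isa CPU
@assert args["size"] == (4, 4, 4)
```

In this sketch, calling `reconstruction_args(attribs)` without the keyword would still try to evaluate `"GPU()"` and fail, which mirrors the behavior on main.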

Here's a MWE for the record:

using Oceananigans
using NCDatasets

grid = RectilinearGrid(size=(4, 4, 4), extent=(1, 1, 1))
model = NonhydrostaticModel(grid)
simulation = Simulation(model, Δt=0.1, stop_iteration=10)
simulation.output_writers[:nc] = NetCDFWriter(model, (; u = model.velocities.u),
                            schedule=IterationInterval(1),
                            filename = "fake_gpu_output.nc")
run!(simulation)

# Now we're going to modify the NetCDF file to pretend it was written on a GPU
ds = NCDataset("fake_gpu_output.nc", "a")
group_args = ds.group["underlying_grid_reconstruction_args"]
group_args.attrib["architecture"] = "GPU()"
close(ds)

u = FieldTimeSeries("fake_gpu_output.nc", "u", architecture=CPU())
for n in eachindex(u.times)
    @show u[n][1, 1, 1]
end

codecov bot commented Feb 10, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 73.51%. Comparing base (ae70538) to head (16fe463).

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #5271   +/-   ##
=======================================
  Coverage   73.51%   73.51%           
=======================================
  Files         395      395           
  Lines       22190    22193    +3     
=======================================
+ Hits        16313    16316    +3     
  Misses       5877     5877           
Flag Coverage Δ
buildkite 68.79% <100.00%> (+<0.01%) ⬆️
julia 68.79% <100.00%> (+<0.01%) ⬆️

end

- function reconstruct_grid(ds)
+ function reconstruct_grid(ds; arch=nothing)
Member

why arch=nothing?

Member

also the kwarg should be architecture if using a name (that's what FieldTimeSeries uses as well)

Member

i think a positional arg would be fine here; in that case you can keep arch.

Member

if this is only called by FieldTimeSeries, then I suggest just making arch a mandatory positional argument. architecture=nothing isn't valid.

Note that we cannot save grids with a "native architecture". This produces problems in a variety of places, both with GPU grids and distributed grids (although the distributed case is to be solved in the future, not yet). Thus CPU() is default. GPU() is supported as an option, if we want to do analysis on GPU. But there is no concept of reconstructing a grid on the architecture that it was saved on.
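As a rough sketch of this suggestion (not the actual Oceananigans signatures), the architecture could be a required positional argument of the reconstruction function, with the caller defaulting to `CPU()`. The names `ToyGrid`, `reconstruct_toy_grid`, and `load_toy_series` are hypothetical, and the `CPU`/`GPU` types are stand-ins.

```julia
# Hypothetical stand-ins; not the Oceananigans types or signatures.
struct CPU end
struct GPU end

struct ToyGrid{A}
    architecture::A
    size::NTuple{3, Int}
end

# The architecture is a required positional argument, so the value recorded
# in the file is never consulted.
reconstruct_toy_grid(stored_args::AbstractDict, arch) =
    ToyGrid(arch, stored_args["size"])

# The caller defaults to CPU(), so post-processing scripts need no extra keyword.
function load_toy_series(stored_args::AbstractDict; architecture = CPU())
    return reconstruct_toy_grid(stored_args, architecture)
end

stored = Dict("architecture" => "GPU()", "size" => (4, 4, 4))
grid = load_toy_series(stored)
@assert grid.architecture isa CPU

# Analysis on GPU is opt-in through the keyword.
gpu_grid = load_toy_series(stored; architecture = GPU())
@assert gpu_grid.architecture isa GPU
```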

Collaborator Author

I think the default behavior should be to reconstruct the grid on the same architecture it was originally constructed on, no?

Member

related issue: #5238

Collaborator Author

Here's the rationale:

* the default becomes invalid _most_ of the time (because we cannot reconstruct distributed fields or GPU fields)

* likewise, _most_ of the time (100% of the time in our current examples) we load from file to CPU

* even if we can deserialize a grid directly onto an available GPU device, we cannot do this for fields. In other words, CuArrays are serialized to Arrays. Likewise, a GPU grid should be serialized to a CPU grid.

Why do you think it would be preferred to change the default behavior? Just as an example, this would require peppering most post processing scripts with architecture=CPU() -- quite a bit of boilerplate.

I don't understand the statement that it's not possible to reconstruct the grid on a GPU. Does this have to do with serialization to JLD2? With NetCDF, serialization isn't possible, and the grid is reconstructed by just re-issuing the construction_arguments(grid), which are saved in the NetCDF file. So we don't serialize and then deserialize. We just rebuild the grid with arguments that ensure the new grid is the same as the original grid that generated the NetCDF file.
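A toy version of that rebuild-from-arguments flow might look like the sketch below. It is only an illustration under assumptions: `ToyGrid`, `toy_construction_arguments`, and `rebuild` are hypothetical, and the real `construction_arguments` and NetCDF attribute layout may differ.

```julia
# Hypothetical toy grid; the real grids and `construction_arguments` differ.
struct ToyGrid
    size::NTuple{3, Int}
    extent::NTuple{3, Float64}
end

# Collect the constructor arguments as strings, as one might store them
# in file attributes.
toy_construction_arguments(grid::ToyGrid) =
    Dict("size" => repr(grid.size), "extent" => repr(grid.extent))

# Rebuild by parsing each stored string and calling the constructor again.
# Nothing is deserialized; the grid is simply constructed a second time.
function rebuild(attribs::AbstractDict)
    sz     = eval(Meta.parse(attribs["size"]))
    extent = eval(Meta.parse(attribs["extent"]))
    return ToyGrid(sz, extent)
end

original = ToyGrid((4, 4, 4), (1.0, 1.0, 1.0))
rebuilt  = rebuild(toy_construction_arguments(original))
@assert rebuilt == original
```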

Member

If you don't have a GPU you cannot reconstruct a grid on GPU.

Collaborator Author

@tomchor tomchor Feb 13, 2026

Yes, that's obviously true and hence there's an architecture kwarg, but if you do have a GPU surely you'd like to rebuild a GPU-written grid on it, no?

Collaborator Author

Also could you please answer my questions in #5271 (comment)? I'm confused as to why you're saying it's not possible to reconstruct a grid on a GPU.

Member

@glwagner glwagner left a comment

some significant changes asked for, but once those are done i am happy.
