Fix NotImplementedError in xr.merge() for object-dtype geometry columns #586

Draft
Copilot wants to merge 2 commits into copilot/fix-recursion-error-spatial-aggregation from copilot/fix-notimplementederror-xr-merge

Copilot AI commented Feb 24, 2026

Newer xarray raises NotImplementedError when xr.merge() is called on datasets containing object-dtype arrays (Shapely Polygon/MultiPolygon geometries), breaking 9 tests across xarray IO and spatial aggregation paths.

Changes

  • fine/IOManagement/utilsIO.py: Added merge_with_geometry(ds1, ds2, **merge_kwargs) helper that strips object-dtype variables before merging, then re-attaches them via direct assignment. Replaced all 5 xr.merge() call sites.
  • fine/IOManagement/xarrayIO.py: Imported merge_with_geometry from utilsIO and replaced all 6 xr.merge() call sites.

def merge_with_geometry(ds1, ds2, **merge_kwargs):
    """Merge two xarray Datasets, handling object-dtype (geometry) variables separately
    to avoid NotImplementedError in newer xarray versions."""
    obj_vars = {}
    for ds in [ds1, ds2]:
        for var in ds.data_vars:
            if ds[var].dtype == object:
                obj_vars[var] = ds[var]

    ds1_clean = ds1.drop_vars([v for v in obj_vars if v in ds1.data_vars])
    ds2_clean = ds2.drop_vars([v for v in obj_vars if v in ds2.data_vars])

    merged = xr.merge([ds1_clean, ds2_clean], **merge_kwargs)

    for var, da in obj_vars.items():
        merged[var] = da

    return merged
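
A minimal usage sketch of the helper above, assuming two small datasets that share a "space" coordinate. FakeGeometry is a hypothetical stand-in for a Shapely geometry so the example runs without Shapely, and the variable names "geometries" and "capacity" are illustrative rather than taken from FINE.

```python
from dataclasses import dataclass

import numpy as np
import xarray as xr


@dataclass(frozen=True)
class FakeGeometry:
    """Hypothetical placeholder for a Shapely Polygon/MultiPolygon."""
    name: str


def merge_with_geometry(ds1, ds2, **merge_kwargs):
    """Copy of the helper from the PR description."""
    obj_vars = {}
    for ds in [ds1, ds2]:
        for var in ds.data_vars:
            if ds[var].dtype == object:
                obj_vars[var] = ds[var]

    ds1_clean = ds1.drop_vars([v for v in obj_vars if v in ds1.data_vars])
    ds2_clean = ds2.drop_vars([v for v in obj_vars if v in ds2.data_vars])

    merged = xr.merge([ds1_clean, ds2_clean], **merge_kwargs)

    for var, da in obj_vars.items():
        merged[var] = da

    return merged


regions = ["r1", "r2"]

# Build an object-dtype array element by element so NumPy keeps the objects as-is.
geoms = np.empty(len(regions), dtype=object)
for i, label in enumerate(["a", "b"]):
    geoms[i] = FakeGeometry(label)

ds_geom = xr.Dataset({"geometries": ("space", geoms)}, coords={"space": regions})
ds_num = xr.Dataset({"capacity": ("space", [1.0, 2.0])}, coords={"space": regions})

# The object-dtype variable bypasses xr.merge() and is re-attached afterwards.
merged = merge_with_geometry(ds_geom, ds_num, combine_attrs="drop_conflicts")
```

Note that when both inputs carry an object-dtype variable of the same name, the helper as written keeps the one from ds2 without comparing the two.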
Original prompt

Fix NotImplementedError in xr.merge() caused by geometry object-dtype columns

Background

The CI job at https://github.com/FZJ-IEK3-VSA/FINE/actions/runs/22326619335/job/64683587352 (on branch copilot/fix-recursion-error-spatial-aggregation) tests a specific version of xarray and fails with NotImplementedError on 9 tests. All failures originate from calls to xr.merge() when the dataset contains columns with Shapely geometry objects (object dtype).

In newer versions of xarray, xr.merge() raises NotImplementedError when attempting to merge datasets that contain object-dtype arrays holding non-scalar Python objects such as Shapely geometries (Polygon, MultiPolygon, etc.). These geometry values must be preserved — they cannot be cast to a numeric dtype.

Failing tests

  • test/IOManagement/test_xarrayio.py::test_input_esm_to_netcdf_and_back
  • test/IOManagement/test_xarrayio.py::test_output_esm_to_netcdf_and_back
  • test/IOManagement/test_xarrayio.py::test_output_esm_to_netcdf_and_back_perfectForesight
  • test/IOManagement/test_xarrayio.py::test_capacityFix_subset
  • test/IOManagement/test_xarrayio.py::test_saving_clustered_timeseries_to_xarray
  • test/aggregations/spatialAggregation/test_manager.py::test_esm_to_xr_and_back_during_spatial_aggregation[False]
  • test/aggregations/spatialAggregation/test_manager.py::test_esm_to_xr_and_back_during_spatial_aggregation[True]
  • test/test_endogenousTechnologicalLearning.py::test_etl_NPV
  • test/test_flexibleConversion.py::test_flexibleConversion_groups

Root cause

In fine/IOManagement/xarrayIO.py and fine/IOManagement/utilsIO.py, there are many call sites where xr.merge() is used to incrementally build up xarray datasets. For example:

xr_dss[ip][name][component] = xr.merge(
    [xr_dss[ip][name][component], xr_da],
    combine_attrs="drop_conflicts",
)

and

xr_ds[this_class][this_comp] = xr.merge(
    [xr_ds[this_class][this_comp], this_ds_component]
)

When any of the datasets involved contain variables with object-dtype arrays (i.e., Shapely geometry objects), newer xarray raises NotImplementedError.

Required fix

At every xr.merge() call site in fine/IOManagement/xarrayIO.py and fine/IOManagement/utilsIO.py, implement the following pattern to separate geometry columns from the merge:

  1. Before merging, identify all variables with object dtype (geometry columns) in both datasets being merged.
  2. Extract those geometry variables from both datasets.
  3. Merge only the non-geometry (numeric) part of the datasets using xr.merge() as before.
  4. Re-attach the geometry variables to the merged result by direct assignment (e.g., merged_ds[var] = geometry_da), taking the geometry from whichever dataset contains it (they should be identical or one dataset may not have it yet).

This approach preserves all geometry data intact while avoiding the NotImplementedError in newer xarray versions.
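
Step 4 relies on a geometry variable that appears in both datasets being identical. One way that assumption could be checked while collecting the variables is sketched below. The names collect_object_vars, _same_objects and FakeGeometry are hypothetical and not part of the PR, and the element-wise comparison assumes the stored objects support ==.

```python
from dataclasses import dataclass

import numpy as np
import xarray as xr


@dataclass(frozen=True)
class FakeGeometry:
    """Hypothetical placeholder for a Shapely geometry."""
    name: str


def _same_objects(a, b):
    """Compare two object-dtype DataArrays element by element in plain Python."""
    if a.dims != b.dims or a.shape != b.shape:
        return False
    return all(x is y or x == y for x, y in zip(a.values.ravel(), b.values.ravel()))


def collect_object_vars(*datasets):
    """Gather object-dtype variables; raise if two datasets disagree on one."""
    obj_vars = {}
    for ds in datasets:
        for name in ds.data_vars:
            da = ds[name]
            if da.dtype != object:
                continue
            if name in obj_vars and not _same_objects(obj_vars[name], da):
                raise ValueError(f"conflicting object-dtype variable: {name!r}")
            # Keep the first occurrence; later identical copies are ignored.
            obj_vars.setdefault(name, da)
    return obj_vars


def make_ds(labels):
    """Build a tiny dataset with one object-dtype variable (illustrative only)."""
    values = np.empty(len(labels), dtype=object)
    for i, label in enumerate(labels):
        values[i] = FakeGeometry(label)
    return xr.Dataset(
        {"geometries": ("space", values)},
        coords={"space": [f"r{i}" for i in range(len(labels))]},
    )


# Identical geometry in both datasets is accepted.
same = collect_object_vars(make_ds(["a", "b"]), make_ds(["a", "b"]))
```

Calling collect_object_vars(make_ds(["a", "b"]), make_ds(["a", "c"])) raises ValueError in this sketch instead of silently keeping one of the two versions.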

A helper function should be introduced to avoid code duplication across the many call sites, for example:

def merge_with_geometry(ds1, ds2, **merge_kwargs):
    """Merge two xarray Datasets, handling object-dtype (geometry) variables separately
    to avoid NotImplementedError in newer xarray versions."""
    # Collect object-dtype variables from both datasets
    obj_vars = {}
    for ds in [ds1, ds2]:
        for var in ds.data_vars:
            if ds[var].dtype == object:
                obj_vars[var] = ds[var]
    
    # Drop object-dtype variables before merging
    ds1_clean = ds1.drop_vars([v for v in obj_vars if v in ds1.data_vars])
    ds2_clean = ds2.drop_vars([v for v in obj_vars if v in ds2.data_vars])
    
    # Merge the non-geometry parts
    merged = xr.merge([ds1_clean, ds2_clean], **merge_kwargs)
    
    # Re-attach geometry variables
    for var, da in obj_vars.items():
        merged[var] = da
    
    return merged

Then replace all xr.merge([ds1, ds2], ...) calls in those files with merge_with_geometry(ds1, ds2, ...).

Files to change

  • fine/IOManagement/xarrayIO.py — multiple xr.merge() call sites
  • fine/IOManagement/utilsIO.py — multiple xr.merge() call sites

Branch to base the PR on

Base the PR on branch: copilot/fix-recursion-error-spatial-aggregation

This pull request was created from Copilot chat.



…elper

Co-authored-by: julian-belina <56728940+julian-belina@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Fix NotImplementedError in xr.merge() for geometry object-dtype columns" to "Fix NotImplementedError in xr.merge() for object-dtype geometry columns" Feb 24, 2026
Copilot AI requested a review from julian-belina February 24, 2026 13:41