Skip to content

add new methods for geulen#556

Open
OnnoEbbens wants to merge 5 commits intodevfrom
geotop_geulen
Open

add new methods for geulen#556
OnnoEbbens wants to merge 5 commits intodevfrom
geotop_geulen

Conversation

@OnnoEbbens
Copy link
Copy Markdown
Collaborator

@OnnoEbbens OnnoEbbens commented Mar 27, 2026

Add two new methods for merging geotop geulen into a geotop stratigraphic model.

@OnnoEbbens
Copy link
Copy Markdown
Collaborator Author

OnnoEbbens commented Mar 28, 2026

This PR is ready for review. Although I think it will be pretty hard to review.

I tried to take into account all possible scenarios for this tool, but still I added a warning because I am not sure the method works for all edge cases (there are so many!).

There are a few assumptions in the code:

  • A geul unit with a higher number (e.g. 6420) is always below a geul unit with a lower number (e.g. 6010).
  • A geul unit is never interrupted by another unit in the vertical. Meaning there is never another unit in between geul units when looking at a column. This does not hold the other way around, a geul unit can be both above and below another unit.

In the PR there are now 2 functions split_layers_on_geul and split_layers_on_geul_optimal. As the name suggest I think the last one is better as it is slightly faster and the final layer model has less layers. However the code is definitely more complex and more steps are involved. I mainly left both methods in so we can easily go back to that method if there are problems with the 'optimal' one.

The 15_geotop.ipynb has code to show the result from both methods. This might be the easiest way to check the results. I will remove one of the methods and clean up the notebook once we choose.

@OnnoEbbens OnnoEbbens marked this pull request as draft March 28, 2026 16:36
@OnnoEbbens OnnoEbbens marked this pull request as ready for review March 28, 2026 16:36
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds support for converting GeoTOP stratigraphy with “geulen” (paleochannels) into a consistent layered model by optionally splitting layers, and introduces a new plotting helper to visualize GeoTOP stratigraphy in cross-sections.

Changes:

  • Add two new geul-handling algorithms (split_layers_on_geul and split_layers_on_geul_optimal) and integrate them into to_model_layers(method_geulen="split_layers").
  • Generalize GeoTOP cross-section plotting to a variable-based helper and add geotop_strat_in_cross_section.
  • Update GeoTOP documentation notebook to demonstrate the new stratigraphy cross-section plot and the new split_layers method.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 9 comments.

File Description
nlmod/read/geotop.py Adds layer-splitting logic for geulen and wires it into to_model_layers.
nlmod/plot/plot.py Adds stratigraphy cross-section plotting and generalizes legend/cross-section helpers.
nlmod/plot/__init__.py Exposes geotop_strat_in_cross_section via the public plotting API.
docs/data_sources/15_geotop.ipynb Demonstrates new plotting and experimental split_layers geul handling.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 450 to 456
def to_model_layers(
geotop_ds,
strat_props=None,
method_geulen="add_to_layer_below",
optimal=True, # remove later just for testing
drop_layer_dim_from_top=True,
**kwargs,
Copy link

Copilot AI Mar 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

optimal is added as a public parameter but it is not documented in the docstring, and the inline note ('remove later just for testing') suggests it is temporary. This makes the API unstable and confusing for users. Either remove this parameter from the public signature (keep it internal), or fully document it (including behavior differences) and give it a stable, descriptive name (e.g., geul_split_strategy or split_layers_optimal).

Copilot uses AI. Check for mistakes.
alpha=None,
**kwargs,
):
"""PLot the stratigraphic-data of GeoTOP in a cross-section.
Copy link

Copilot AI Mar 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Docstring typo: "PLot" should be "Plot".

Suggested change
"""PLot the stratigraphic-data of GeoTOP in a cross-section.
"""Plot the stratigraphic-data of GeoTOP in a cross-section.

Copilot uses AI. Check for mistakes.
"outputs": [],
"source": [
"f, ax = plt.subplots(figsize=(10, 5))\n",
"nlmod.plot.plot.geotop_strat_in_cross_section(line, gt, ax=ax)\n",
Copy link

Copilot AI Mar 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This notebook calls the new helper via nlmod.plot.plot.geotop_strat_in_cross_section(...), but the PR also exports it in the public nlmod.plot namespace (see nlmod/plot/__init__.py). To avoid relying on the internal module attribute, prefer nlmod.plot.geotop_strat_in_cross_section(...) here.

Suggested change
"nlmod.plot.plot.geotop_strat_in_cross_section(line, gt, ax=ax)\n",
"nlmod.plot.geotop_strat_in_cross_section(line, gt, ax=ax)\n",

Copilot uses AI. Check for mistakes.
new_layers.append(geul_subset)
new_layers.append(unit)
else:
logger.warning(f"unexpected result for geul {geul} and layer {unit}")
Copy link

Copilot AI Mar 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the final else branch (after checking mask_geul_between / mask_geul_above), the code only logs a warning but does not append unit (or otherwise preserve layer ordering). That can silently drop a layer from new_units, leading to incorrect layering and downstream shape/coord mismatches. Consider raising an exception here, or at least appending unit (and possibly geul_subset) so new_units always remains consistent.

Suggested change
logger.warning(f"unexpected result for geul {geul} and layer {unit}")
logger.warning(f"unexpected result for geul {geul} and layer {unit}")
# Preserve layer ordering even in unexpected cases by keeping the unit.
new_layers.append(unit)

Copilot uses AI. Check for mistakes.
Comment on lines +374 to +379
layers, counts = np.unique(
bot_lay_geul[(bot_lay_geul < 0) & (bot_lay_geul != -999)],
return_counts=True,
)
layers = layers[np.argsort(counts)]

Copy link

Copilot AI Mar 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

layers can legitimately be empty here (e.g., if bot_lay_geul has no negative values because the geul only aligns with layer boundaries / top / bottom). In that case, the subsequent dif_top/dif_bot creation and np.min(...) reductions will crash with a zero-size reduction error. Add an early-guard for len(layers) == 0 and handle this case (e.g., assign lay_geul directly from top_lay_geul/bot_lay_geul, or skip the optimization for this geul).

Copilot uses AI. Check for mistakes.
Comment on lines +512 to +523
if method_geulen == "split_layers":
# remove geulen from units
logger.warning(
"the 'split_layers' method for geulen is still experimental and not yet thoroughly tested."
)
units_no_geul = [unit for unit in units if unit < 6000]
geulen = [unit for unit in units if unit >= 6000]
if optimal:
strat, units = split_layers_on_geul_optimal(strat, units_no_geul, geulen)
else:
geulen.sort(reverse=True) # make sure to start with the deepest geul
strat, units = split_layers_on_geul(strat, units_no_geul, geulen)
Copy link

Copilot AI Mar 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New behavior for method_geulen == "split_layers" (including split_layers_on_geul[_optimal] and the new unit relabeling scheme) is not exercised by the current test suite (no tests call to_model_layers(..., method_geulen="split_layers")). Given the amount of branching and indexing logic, please add at least one focused test that runs to_model_layers on a small GeoTOP slice with method_geulen="split_layers" and asserts basic invariants (e.g., top >= botm, layer coord length matches data, and that geul subsets are present when expected).

Copilot uses AI. Check for mistakes.
Comment on lines +539 to +540
color = var_props.at[index, "color"]
label = var_props.at[int(index), label_col]
Copy link

Copilot AI Mar 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_add_geotop_var_legend uses var_props.at[index, "color"] but index comes from np.unique(var) and will typically be a float (e.g., 1.0). If var_props.index is an integer index, this can raise a KeyError depending on pandas index dtype/coercion. Cast consistently (e.g., idx = int(index)) for both color and label, and consider skipping indices not present in var_props with a warning.

Suggested change
color = var_props.at[index, "color"]
label = var_props.at[int(index), label_col]
# Normalize index type: values from np.unique are often float (e.g. 1.0)
# while var_props.index may be integer-based (e.g. 1). Cast numeric values
# to int to avoid KeyError, and skip indices not present in var_props.
if isinstance(index, (int, float, np.integer, np.floating)):
idx = int(index)
else:
idx = index
if idx not in var_props.index:
warnings.warn(
f"Value {index!r} (normalized to {idx!r}) not found in var_props index; "
"skipping legend entry.",
UserWarning,
)
continue
color = var_props.at[idx, "color"]
label = var_props.at[idx, label_col]

Copilot uses AI. Check for mistakes.
Comment on lines +407 to +436
"""PLot the lithoclass-data of GeoTOP in a cross-section.

Parameters
----------
line : shapely.LineString
The line along which the GeoTOP data is plotted
var : str
The variable in the GeoTOP dataset to plot in the cross-section. Can be
'lithok' or 'strat'
gt : xr.Dataset, optional
The voxel-dataset from GeoTOP. It is downloaded with the method
`nlmod.read.geotop.get_geotop()` if None. The default is None.
ax : matplotlib.Axes, optional
The axes in which the cross-section is plotted. Will default to the current axes
if None. The default is None.
legend : bool, optional
When True, add a legend to the plot with the lithology-classes. The default is
True.
legend_loc : None or str, optional
The location of the legend. See matplotlib documentation. The default is None.
var_props : pd.DataFrame, optional
A DataFrame containing the properties of the variable (lithoclasses or
stratigraphy). Will call nlmod.read.geotop.get_{var}_props() when None.
The default is None.
alpha : float, optional
Opacity for plot_array function, The default is None.
label_col : str, optional
column in the var_props dataframe to use a a label. The default is 'name'.
**kwargs : dict
kwargs are passed onto DatasetCrossSection.
Copy link

Copilot AI Mar 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The docstring for geotop_var_in_cross_section still says it's for "lithoclass-data", but the function is now generic and also supports plotting stratigraphy. Update the summary line and parameter descriptions accordingly (e.g., describe that it plots any categorical GeoTOP variable with a properties table).

Suggested change
"""PLot the lithoclass-data of GeoTOP in a cross-section.
Parameters
----------
line : shapely.LineString
The line along which the GeoTOP data is plotted
var : str
The variable in the GeoTOP dataset to plot in the cross-section. Can be
'lithok' or 'strat'
gt : xr.Dataset, optional
The voxel-dataset from GeoTOP. It is downloaded with the method
`nlmod.read.geotop.get_geotop()` if None. The default is None.
ax : matplotlib.Axes, optional
The axes in which the cross-section is plotted. Will default to the current axes
if None. The default is None.
legend : bool, optional
When True, add a legend to the plot with the lithology-classes. The default is
True.
legend_loc : None or str, optional
The location of the legend. See matplotlib documentation. The default is None.
var_props : pd.DataFrame, optional
A DataFrame containing the properties of the variable (lithoclasses or
stratigraphy). Will call nlmod.read.geotop.get_{var}_props() when None.
The default is None.
alpha : float, optional
Opacity for plot_array function, The default is None.
label_col : str, optional
column in the var_props dataframe to use a a label. The default is 'name'.
**kwargs : dict
kwargs are passed onto DatasetCrossSection.
"""Plot a categorical GeoTOP variable (e.g. lithology or stratigraphy) in a cross-section.
Parameters
----------
line : shapely.LineString
The line along which the GeoTOP data is plotted.
var : str
Name of the categorical variable in the GeoTOP dataset to plot in the
cross-section, typically 'lithok' (lithoclass) or 'strat' (stratigraphy).
gt : xr.Dataset, optional
The GeoTOP voxel dataset. It is downloaded with the method
`nlmod.read.geotop.get_geotop()` if None. The default is None.
ax : matplotlib.Axes, optional
The axes in which the cross-section is plotted. Will default to the current axes
if None. The default is None.
legend : bool, optional
When True, add a legend to the plot with the classes of the selected variable.
The default is True.
legend_loc : None or str, optional
The location of the legend. See matplotlib documentation. The default is None.
var_props : pd.DataFrame, optional
A DataFrame containing the properties of the categorical variable (e.g.
lithoclasses or stratigraphic units), including at least an integer code and
label information. Will call `nlmod.read.geotop.get_{var}_props()` when None.
The default is None.
alpha : float, optional
Opacity for the plot_array function. The default is None.
label_col : str, optional
Column in the var_props DataFrame to use as a label in the legend. The default
is 'name'.
**kwargs : dict
Additional keyword arguments are passed on to DatasetCrossSection.

Copilot uses AI. Check for mistakes.
Comment on lines +550 to +553
unit = int(str_unit[-4:])
subset = str_unit[:-4]
if unit in strat_props.index:
layers.append(f"{strat_props.at[unit, 'code']}_{subset}")
Copy link

Copilot AI Mar 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When method_geulen == "split_layers" and len(str_unit) > 4, you only append to layers if the base unit exists in strat_props.index. If it does not, nothing is appended at all, so layers can become shorter than units, causing an invalid coords={"layer": layers, ...} length mismatch when constructing the Dataset. Ensure a label is appended for every unit (e.g., fallback to str_unit with a warning) even when the base unit is unknown.

Suggested change
unit = int(str_unit[-4:])
subset = str_unit[:-4]
if unit in strat_props.index:
layers.append(f"{strat_props.at[unit, 'code']}_{subset}")
base_unit = int(str_unit[-4:])
subset = str_unit[:-4]
if base_unit in strat_props.index:
layers.append(f"{strat_props.at[base_unit, 'code']}_{subset}")
else:
logger.warning(
f"Unknown base strat-value: {base_unit} for split unit {str_unit}"
)
layers.append(str_unit)

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants