Conversation
|
This PR is ready for review. Although I think it will be pretty hard to review. I tried to take into account all possible scenarios for this tool, but still I added a warning because I am not sure the method works for all edge cases (there are so many!). There are a few assumptions in the code:
In the PR there are now 2 functions The 15_geotop.ipynb has code to show the result from both methods. This might be the easiest way to check the results. I will remove one of the methods and clean up the notebook once we choose. |
There was a problem hiding this comment.
Pull request overview
Adds support for converting GeoTOP stratigraphy with “geulen” (paleochannels) into a consistent layered model by optionally splitting layers, and introduces a new plotting helper to visualize GeoTOP stratigraphy in cross-sections.
Changes:
- Add two new geul-handling algorithms (
split_layers_on_geulandsplit_layers_on_geul_optimal) and integrate them intoto_model_layers(method_geulen="split_layers"). - Generalize GeoTOP cross-section plotting to a variable-based helper and add
geotop_strat_in_cross_section. - Update GeoTOP documentation notebook to demonstrate the new stratigraphy cross-section plot and the new
split_layersmethod.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 9 comments.
| File | Description |
|---|---|
nlmod/read/geotop.py |
Adds layer-splitting logic for geulen and wires it into to_model_layers. |
nlmod/plot/plot.py |
Adds stratigraphy cross-section plotting and generalizes legend/cross-section helpers. |
nlmod/plot/__init__.py |
Exposes geotop_strat_in_cross_section via the public plotting API. |
docs/data_sources/15_geotop.ipynb |
Demonstrates new plotting and experimental split_layers geul handling. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| def to_model_layers( | ||
| geotop_ds, | ||
| strat_props=None, | ||
| method_geulen="add_to_layer_below", | ||
| optimal=True, # remove later just for testing | ||
| drop_layer_dim_from_top=True, | ||
| **kwargs, |
There was a problem hiding this comment.
optimal is added as a public parameter but it is not documented in the docstring, and the inline note ('remove later just for testing') suggests it is temporary. This makes the API unstable and confusing for users. Either remove this parameter from the public signature (keep it internal), or fully document it (including behavior differences) and give it a stable, descriptive name (e.g., geul_split_strategy or split_layers_optimal).
| alpha=None, | ||
| **kwargs, | ||
| ): | ||
| """PLot the stratigraphic-data of GeoTOP in a cross-section. |
There was a problem hiding this comment.
Docstring typo: "PLot" should be "Plot".
| """PLot the stratigraphic-data of GeoTOP in a cross-section. | |
| """Plot the stratigraphic-data of GeoTOP in a cross-section. |
| "outputs": [], | ||
| "source": [ | ||
| "f, ax = plt.subplots(figsize=(10, 5))\n", | ||
| "nlmod.plot.plot.geotop_strat_in_cross_section(line, gt, ax=ax)\n", |
There was a problem hiding this comment.
This notebook calls the new helper via nlmod.plot.plot.geotop_strat_in_cross_section(...), but the PR also exports it in the public nlmod.plot namespace (see nlmod/plot/__init__.py). To avoid relying on the internal module attribute, prefer nlmod.plot.geotop_strat_in_cross_section(...) here.
| "nlmod.plot.plot.geotop_strat_in_cross_section(line, gt, ax=ax)\n", | |
| "nlmod.plot.geotop_strat_in_cross_section(line, gt, ax=ax)\n", |
| new_layers.append(geul_subset) | ||
| new_layers.append(unit) | ||
| else: | ||
| logger.warning(f"unexpected result for geul {geul} and layer {unit}") |
There was a problem hiding this comment.
In the final else branch (after checking mask_geul_between / mask_geul_above), the code only logs a warning but does not append unit (or otherwise preserve layer ordering). That can silently drop a layer from new_units, leading to incorrect layering and downstream shape/coord mismatches. Consider raising an exception here, or at least appending unit (and possibly geul_subset) so new_units always remains consistent.
| logger.warning(f"unexpected result for geul {geul} and layer {unit}") | |
| logger.warning(f"unexpected result for geul {geul} and layer {unit}") | |
| # Preserve layer ordering even in unexpected cases by keeping the unit. | |
| new_layers.append(unit) |
| layers, counts = np.unique( | ||
| bot_lay_geul[(bot_lay_geul < 0) & (bot_lay_geul != -999)], | ||
| return_counts=True, | ||
| ) | ||
| layers = layers[np.argsort(counts)] | ||
|
|
There was a problem hiding this comment.
layers can legitimately be empty here (e.g., if bot_lay_geul has no negative values because the geul only aligns with layer boundaries / top / bottom). In that case, the subsequent dif_top/dif_bot creation and np.min(...) reductions will crash with a zero-size reduction error. Add an early-guard for len(layers) == 0 and handle this case (e.g., assign lay_geul directly from top_lay_geul/bot_lay_geul, or skip the optimization for this geul).
| if method_geulen == "split_layers": | ||
| # remove geulen from units | ||
| logger.warning( | ||
| "the 'split_layers' method for geulen is still experimental and not yet thoroughly tested." | ||
| ) | ||
| units_no_geul = [unit for unit in units if unit < 6000] | ||
| geulen = [unit for unit in units if unit >= 6000] | ||
| if optimal: | ||
| strat, units = split_layers_on_geul_optimal(strat, units_no_geul, geulen) | ||
| else: | ||
| geulen.sort(reverse=True) # make sure to start with the deepest geul | ||
| strat, units = split_layers_on_geul(strat, units_no_geul, geulen) |
There was a problem hiding this comment.
New behavior for method_geulen == "split_layers" (including split_layers_on_geul[_optimal] and the new unit relabeling scheme) is not exercised by the current test suite (no tests call to_model_layers(..., method_geulen="split_layers")). Given the amount of branching and indexing logic, please add at least one focused test that runs to_model_layers on a small GeoTOP slice with method_geulen="split_layers" and asserts basic invariants (e.g., top >= botm, layer coord length matches data, and that geul subsets are present when expected).
| color = var_props.at[index, "color"] | ||
| label = var_props.at[int(index), label_col] |
There was a problem hiding this comment.
_add_geotop_var_legend uses var_props.at[index, "color"] but index comes from np.unique(var) and will typically be a float (e.g., 1.0). If var_props.index is an integer index, this can raise a KeyError depending on pandas index dtype/coercion. Cast consistently (e.g., idx = int(index)) for both color and label, and consider skipping indices not present in var_props with a warning.
| color = var_props.at[index, "color"] | |
| label = var_props.at[int(index), label_col] | |
| # Normalize index type: values from np.unique are often float (e.g. 1.0) | |
| # while var_props.index may be integer-based (e.g. 1). Cast numeric values | |
| # to int to avoid KeyError, and skip indices not present in var_props. | |
| if isinstance(index, (int, float, np.integer, np.floating)): | |
| idx = int(index) | |
| else: | |
| idx = index | |
| if idx not in var_props.index: | |
| warnings.warn( | |
| f"Value {index!r} (normalized to {idx!r}) not found in var_props index; " | |
| "skipping legend entry.", | |
| UserWarning, | |
| ) | |
| continue | |
| color = var_props.at[idx, "color"] | |
| label = var_props.at[idx, label_col] |
| """PLot the lithoclass-data of GeoTOP in a cross-section. | ||
|
|
||
| Parameters | ||
| ---------- | ||
| line : shapely.LineString | ||
| The line along which the GeoTOP data is plotted | ||
| var : str | ||
| The variable in the GeoTOP dataset to plot in the cross-section. Can be | ||
| 'lithok' or 'strat' | ||
| gt : xr.Dataset, optional | ||
| The voxel-dataset from GeoTOP. It is downloaded with the method | ||
| `nlmod.read.geotop.get_geotop()` if None. The default is None. | ||
| ax : matplotlib.Axes, optional | ||
| The axes in which the cross-section is plotted. Will default to the current axes | ||
| if None. The default is None. | ||
| legend : bool, optional | ||
| When True, add a legend to the plot with the lithology-classes. The default is | ||
| True. | ||
| legend_loc : None or str, optional | ||
| The location of the legend. See matplotlib documentation. The default is None. | ||
| var_props : pd.DataFrame, optional | ||
| A DataFrame containing the properties of the variable (lithoclasses or | ||
| stratigraphy). Will call nlmod.read.geotop.get_{var}_props() when None. | ||
| The default is None. | ||
| alpha : float, optional | ||
| Opacity for plot_array function, The default is None. | ||
| label_col : str, optional | ||
| column in the var_props dataframe to use a a label. The default is 'name'. | ||
| **kwargs : dict | ||
| kwargs are passed onto DatasetCrossSection. |
There was a problem hiding this comment.
The docstring for geotop_var_in_cross_section still says it's for "lithoclass-data", but the function is now generic and also supports plotting stratigraphy. Update the summary line and parameter descriptions accordingly (e.g., describe that it plots any categorical GeoTOP variable with a properties table).
| """PLot the lithoclass-data of GeoTOP in a cross-section. | |
| Parameters | |
| ---------- | |
| line : shapely.LineString | |
| The line along which the GeoTOP data is plotted | |
| var : str | |
| The variable in the GeoTOP dataset to plot in the cross-section. Can be | |
| 'lithok' or 'strat' | |
| gt : xr.Dataset, optional | |
| The voxel-dataset from GeoTOP. It is downloaded with the method | |
| `nlmod.read.geotop.get_geotop()` if None. The default is None. | |
| ax : matplotlib.Axes, optional | |
| The axes in which the cross-section is plotted. Will default to the current axes | |
| if None. The default is None. | |
| legend : bool, optional | |
| When True, add a legend to the plot with the lithology-classes. The default is | |
| True. | |
| legend_loc : None or str, optional | |
| The location of the legend. See matplotlib documentation. The default is None. | |
| var_props : pd.DataFrame, optional | |
| A DataFrame containing the properties of the variable (lithoclasses or | |
| stratigraphy). Will call nlmod.read.geotop.get_{var}_props() when None. | |
| The default is None. | |
| alpha : float, optional | |
| Opacity for plot_array function, The default is None. | |
| label_col : str, optional | |
| column in the var_props dataframe to use a a label. The default is 'name'. | |
| **kwargs : dict | |
| kwargs are passed onto DatasetCrossSection. | |
| """Plot a categorical GeoTOP variable (e.g. lithology or stratigraphy) in a cross-section. | |
| Parameters | |
| ---------- | |
| line : shapely.LineString | |
| The line along which the GeoTOP data is plotted. | |
| var : str | |
| Name of the categorical variable in the GeoTOP dataset to plot in the | |
| cross-section, typically 'lithok' (lithoclass) or 'strat' (stratigraphy). | |
| gt : xr.Dataset, optional | |
| The GeoTOP voxel dataset. It is downloaded with the method | |
| `nlmod.read.geotop.get_geotop()` if None. The default is None. | |
| ax : matplotlib.Axes, optional | |
| The axes in which the cross-section is plotted. Will default to the current axes | |
| if None. The default is None. | |
| legend : bool, optional | |
| When True, add a legend to the plot with the classes of the selected variable. | |
| The default is True. | |
| legend_loc : None or str, optional | |
| The location of the legend. See matplotlib documentation. The default is None. | |
| var_props : pd.DataFrame, optional | |
| A DataFrame containing the properties of the categorical variable (e.g. | |
| lithoclasses or stratigraphic units), including at least an integer code and | |
| label information. Will call `nlmod.read.geotop.get_{var}_props()` when None. | |
| The default is None. | |
| alpha : float, optional | |
| Opacity for the plot_array function. The default is None. | |
| label_col : str, optional | |
| Column in the var_props DataFrame to use as a label in the legend. The default | |
| is 'name'. | |
| **kwargs : dict | |
| Additional keyword arguments are passed on to DatasetCrossSection. |
| unit = int(str_unit[-4:]) | ||
| subset = str_unit[:-4] | ||
| if unit in strat_props.index: | ||
| layers.append(f"{strat_props.at[unit, 'code']}_{subset}") |
There was a problem hiding this comment.
When method_geulen == "split_layers" and len(str_unit) > 4, you only append to layers if the base unit exists in strat_props.index. If it does not, nothing is appended at all, so layers can become shorter than units, causing an invalid coords={"layer": layers, ...} length mismatch when constructing the Dataset. Ensure a label is appended for every unit (e.g., fallback to str_unit with a warning) even when the base unit is unknown.
| unit = int(str_unit[-4:]) | |
| subset = str_unit[:-4] | |
| if unit in strat_props.index: | |
| layers.append(f"{strat_props.at[unit, 'code']}_{subset}") | |
| base_unit = int(str_unit[-4:]) | |
| subset = str_unit[:-4] | |
| if base_unit in strat_props.index: | |
| layers.append(f"{strat_props.at[base_unit, 'code']}_{subset}") | |
| else: | |
| logger.warning( | |
| f"Unknown base strat-value: {base_unit} for split unit {str_unit}" | |
| ) | |
| layers.append(str_unit) |
Add two new methods for merging geotop geulen into a geotop stratigraphic model.