Skip to content

Commit 41667e8

Browse files
committed
[WIP/DOC] Preparing new example model
1 parent 3bf56a9 commit 41667e8

File tree

1 file changed

+390
-0
lines changed

1 file changed

+390
-0
lines changed

examples/examples/real/mik.py

Lines changed: 390 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,390 @@
1+
# %% md
2+
# <img src="https://docs.gempy.org/_static/logos/gempy.png" alt="drawing" width="400"/>
3+
#
4+
# # 2.2 From Drillhole Data to GemPy Model
5+
#
6+
# %%
7+
# List of relative paths used during the workshop
8+
import pooch
9+
10+
path_to_well_png = '../../common/basics/data/wells.png'
11+
path_to_checkpoint_1 = '../../common/basics/checkpoints/checkpoint1.pickle'
12+
path_to_checkpoint_2 = '../../common/basics/checkpoints/checkpoint2.pickle'
13+
upgrade_pickles = False
14+
# %%
15+
# Importing GemPy
16+
import gempy as gp
17+
18+
#
19+
# %% md
20+
# We use `pooch` to download the dataset into a temp file:
21+
# %%
22+
url = "https://raw.githubusercontent.com/softwareunderground/subsurface/main/tests/data/borehole/kim_ready.csv"
23+
# known_hash = "efa90898bb435daa15912ca6f3e08cd3285311923a36dbc697d2aafebbafa25f"
24+
known_hash = "a91445cb960526398e25d8c1d2ab3b3a32f7d35feaf33e18887629b242256ab6"
25+
26+
# Your code here:
27+
raw_borehole_data_csv = pooch.retrieve(url, known_hash)
28+
29+
# %% md
30+
# Now we can use `subsurface` function to help us reading csv files into pandas dataframes that the package can understand. Since the combination of styles data is provided can highly vary from project to project, `subsurface` provides some *helpers* functions to parse different combination of .csv
31+
# %%
32+
reading_collars = ReaderFilesHelper(
33+
file_or_buffer=raw_borehole_data_csv,
34+
index_col="name",
35+
usecols=['x', 'y', 'altitude', "name"]
36+
)
37+
38+
reading_collars
39+
# %%
40+
from dataclasses import asdict
41+
42+
asdict(reading_collars)
43+
# %%
44+
collar = read_collar(reading_collars)
45+
46+
collar
47+
# %%
48+
# Your code here:
49+
survey = read_survey(
50+
ReaderFilesHelper(
51+
file_or_buffer=raw_borehole_data_csv,
52+
index_col="name",
53+
usecols=["name", "md"]
54+
)
55+
)
56+
57+
survey
58+
# %%
59+
# Your code here:
60+
lith = read_lith(
61+
ReaderFilesHelper(
62+
file_or_buffer=raw_borehole_data_csv,
63+
usecols=['name', 'top', 'base', 'formation'],
64+
columns_map={'top' : 'top',
65+
'base' : 'base',
66+
'formation': 'component lith',
67+
}
68+
)
69+
)
70+
71+
lith
72+
73+
# %% md
74+
# ### Welly
75+
#
76+
# Welly is a family of classes to facilitate the loading, processing, and analysis of subsurface wells and well data, such as striplogs, formation tops, well log curves, and synthetic seismograms.
77+
#
78+
# We are using welly to convert pandas data frames into classes to manipulate well data. The final goal is to extract 3D coordinates and properties for multiple wells.
79+
#
80+
# The class `WellyToSubsurfaceHelper` contains the methods to create a `welly` project and export it to a `subsurface` data class.
81+
# %%
82+
wts = sb.reader.wells.WellyToSubsurfaceHelper(collar_df=collar, survey_df=survey, lith_df=lith)
83+
# %% md
84+
# In the field p is stored a welly project (https://github.com/agile-geoscience/welly/blob/master/tutorial/04_Project.ipynb)and we can use it to explore and visualize properties of each well.
85+
# %%
86+
wts.p
87+
# %%
88+
stripLog = wts.p[0].data['lith']
89+
stripLog
90+
# %%
91+
stripLog.plot()
92+
plt.gcf()
93+
# %%
94+
welly_well = wts.p[0].data["lith_log"]
95+
welly_well
96+
# %% md
97+
# ## Welly to Subsurface
98+
#
99+
# Welly is a very powerful tool to inspect well data but it was not design for 3D. However they have a method to export XYZ coordinates of each of the well that we can take advanatage of to create a `subsurface.UnstructuredData` object. This object is one of the core data class of `subsurface` and we will use it from now on to keep working in 3D.
100+
# %%
101+
formations = ["topo", "etchegoin", "macoma", "chanac", "mclure",
102+
"santa_margarita", "fruitvale",
103+
"round_mountain", "olcese", "freeman_jewett", "vedder", "eocene",
104+
"cretaceous",
105+
"basement", "null"]
106+
107+
unstruct = sb.reader.wells.welly_to_subsurface(wts, table=[Component({'lith': l}) for l in formations])
108+
unstruct.data
109+
# %% md
110+
# At each core `UstructuredData` is a wrapper of a `xarray.Dataset`. Although slightly flexible, any `UnstructuredData` will contain 4 `xarray.DataArray` objects containing vertex, cells, cell attributes and vertex attibutes. This is the minimum amount of information necessary to work in 3D.
111+
# %% md
112+
# From an `UnstructuredData` we can construct *elements*. *elements* are a higher level construct and includes the definion of type of geometric representation - e.g. points, lines, surfaces, etc. For the case of borehole we will use LineSets. *elements* have a very close relation to `vtk` data structures what enables easily to plot the data using `pyvista`
113+
# %%
114+
element = sb.LineSet(unstruct)
115+
pyvista_mesh = sb.visualization.to_pyvista_line(element, radius=50)
116+
117+
# Plot default LITH
118+
interactive_plot = sb.visualization.pv_plot([pyvista_mesh], background_plotter=True)
119+
# %%
120+
plot_pyvista_to_notebook(interactive_plot)
121+
# %% md
122+
# ## Finding the boreholes bases
123+
#
124+
# `GemPy` interpolates the bottom of a unit, therefore we need to be able to extract those points to be able tointerpolate them. `xarray`, `pandas` and `numpy` are using the same type of memory representation what makes possible to use the same or at least similar methods to manipulate the data to our will.
125+
#
126+
# Lets find the base points of each well:
127+
# %%
128+
# Creating references to the xarray.DataArray
129+
cells_attr = unstruct.data.cell_attrs
130+
cells = unstruct.data.cells
131+
vertex = unstruct.data.vertex
132+
# %%
133+
# Find vertex points at the boundary of two units
134+
# Marking each vertex
135+
bool_prop_change = cells_attr.values[1:] != cells_attr.values[:-1]
136+
# Getting the index of the vertex
137+
args_prop_change = np.where(bool_prop_change)[0]
138+
# Getting the attr values at those points
139+
vals_prop_change = cells_attr[args_prop_change]
140+
vals_prop_change.to_pandas()
141+
# %%
142+
# Getting the vertex values at those points
143+
vertex_args_prop_change = cells[args_prop_change, 1]
144+
interface_points = vertex[vertex_args_prop_change]
145+
interface_points
146+
# %%
147+
# Creating a new UnstructuredData
148+
interf_us = sb.UnstructuredData.from_array(vertex=interface_points.values, cells="points",
149+
cells_attr=vals_prop_change.to_pandas())
150+
interf_us
151+
# %% md
152+
# This new `UnstructuredData` object instead containing data that represent lines, contain point data at the bottom of each unit. We can plot it very similar as before:
153+
# %%
154+
element = sb.PointSet(interf_us)
155+
pyvista_mesh = sb.visualization.to_pyvista_points(element)
156+
interactive_plot.add_mesh(pyvista_mesh)
157+
# %%
158+
plot_pyvista_to_notebook(interactive_plot)
159+
# %% md
160+
# ## GemPy: Initialize model
161+
#
162+
# The first step to create a GemPy model is create a `gempy.Model` object that will
163+
# contain all the other data structures and necessary functionality. In addition
164+
# for this example we will define a *regular grid* since the beginning.
165+
# This is the grid where we will interpolate the 3D geological model.
166+
#
167+
# GemPy is based on a **meshless interpolator**. In practice this means that we can
168+
# interpolate any point in a 3D space. However, for convenience, we have built some
169+
# standard grids for different purposes. At the current day the standard grids are:
170+
#
171+
# - **Regular grid**: default grid mainly for general visualization
172+
# - **Custom grid**: GemPy's wrapper to interpolate on a user grid
173+
# - **Topography**: Topographic data use to be of high density. Treating it as an independent
174+
# grid allow for high resolution geological maps
175+
# - **Sections**: If we predefine the section 2D grid we can directly interpolate at those
176+
# locations for perfect, high resolution estimations
177+
# - **Center grids**: Half sphere grids around a given point at surface. This are specially tuned
178+
# for geophysical forward computations
179+
# %%
180+
import gempy as gp
181+
182+
extent = [275619, 323824, 3914125, 3961793, -3972.6, 313.922]
183+
184+
# Your code here:
185+
geo_model = gp.create_model("getting started")
186+
geo_model.set_regular_grid(extent=extent, resolution=[50, 50, 50])
187+
# %% md
188+
# GemPy core code is written in Python. However for efficiency and gradient based
189+
# machine learning most of heavy computations happen in optimize compile code,
190+
# either C or CUDA for GPU.
191+
#
192+
# To do so, GemPy rely on the library `Theano`. To guarantee maximum optimization
193+
# `Theano` requires to compile the code for every Python kernel. The compilation is
194+
# done by calling the following line at any point (before computing the model):
195+
# %%
196+
gp.set_interpolator(geo_model, theano_optimizer='fast_compile', verbose=[])
197+
198+
199+
# %% md
200+
# ## Making a model step by step.
201+
#
202+
# The temptation at this point is to bring all the points into `gempy` and just interpolate. However, often that strategy results in ill posed problems due to noise or irregularities in the data. `gempy` has been design to being able to iterate rapidly and therefore a much better workflow use to be creating the model step by step.
203+
#
204+
# To do that, lets define a function that we can pass the name of the formation and get the assotiated vertex. Grab from the `interf_us` the XYZ coordinates of the first layer:
205+
# %%
206+
def get_interface_coord_from_surfaces(surface_names: list, verbose=False):
207+
df = pd.DataFrame(columns=["X", "Y", "Z", "surface"])
208+
209+
for e, surface_name in enumerate(surface_names):
210+
# The properties in subsurface start at 1
211+
val_property = formations.index(surface_name) + 1
212+
# Find the cells with the surface id
213+
args_from_first_surface = np.where(vals_prop_change == val_property)[0]
214+
if verbose: print(args_from_first_surface)
215+
# Find the vertex
216+
points_from_first_surface = interface_points[args_from_first_surface]
217+
if verbose: print(points_from_first_surface)
218+
219+
# xarray.DataArray to pandas.DataFrame
220+
surface_pandas = points_from_first_surface.to_pandas()
221+
222+
# Add formation column
223+
surface_pandas["surface"] = surface_name
224+
df = df.append(surface_pandas)
225+
226+
return df.reset_index()
227+
228+
229+
# %% md
230+
# ### Surfaces
231+
#
232+
# GemPy is a surface based interpolator. This means that all the input data we add has to be refereed to a surface. The
233+
# surfaces always mark the **bottom** of a unit.
234+
#
235+
# This is a list with the formations names for this data set.
236+
# %%
237+
formations
238+
# %% md
239+
# Lets add the first two (remember we always need a basement defined).
240+
# %%
241+
geo_model.add_surfaces(formations[:2])
242+
# %% md
243+
# Using the function defined above we just extract the surface points for **topo**:
244+
# %%
245+
gempy_surface_points = get_interface_coord_from_surfaces(["topo"])
246+
gempy_surface_points
247+
# %% md
248+
# And we can set them into the `gempy` model:
249+
# %%
250+
geo_model.set_surface_points(gempy_surface_points, update_surfaces=False)
251+
geo_model.update_to_interpolator()
252+
# %%
253+
g2d = gp.plot_2d(geo_model)
254+
# %%
255+
g2d.fig
256+
# %% md
257+
# The **minimum amount of input data** to interpolate anything in `gempy` is:
258+
#
259+
# a) 2 surface points per surface
260+
#
261+
# b) One orientation per series.
262+
#
263+
# Lets add an orientation:
264+
# %%
265+
geo_model.add_orientations(X=300000, Y=3930000, Z=0, surface="topo", pole_vector=(0, 0, 1))
266+
# %% md
267+
# GemPy depends on multiple data objects to store all the data structures necessary
268+
# to construct an structural model. To keep all the necessary objects in sync the
269+
# class `gempy.ImplicitCoKriging` (which `geo_model` is instance of) will provide the
270+
# necessary methods to update these data structures coherently.
271+
#
272+
# At current state (gempy 2.2), the data classes are:
273+
#
274+
# - `gempy.SurfacePoints`
275+
# - `gempy.Orientations`
276+
# - `gempy.Surfaces`
277+
# - `gempy.Stack` (combination of `gempy.Series` and `gempy.Faults`)
278+
# - `gempy.Grid`
279+
# - `gempy.AdditionalData`
280+
# - `gempy.Solutions`
281+
#
282+
# Today we will look into details only some of these classes but what is important
283+
# to notice is that you can access these objects as follows:
284+
# %%
285+
geo_model.additional_data
286+
# %%
287+
gp.compute_model(geo_model)
288+
# %%
289+
g3d = gp.plot_3d(geo_model, plotter_type="background")
290+
# %%
291+
g3d.p.add_mesh(pyvista_mesh)
292+
# %%
293+
plot_pyvista_to_notebook(g3d.p)
294+
# %% md
295+
# ## Second layer
296+
# %%
297+
geo_model.add_surfaces(formations[2])
298+
# %%
299+
gempy_surface_points = get_interface_coord_from_surfaces(formations[:2])
300+
geo_model.set_surface_points(gempy_surface_points, update_surfaces=False)
301+
geo_model.update_to_interpolator()
302+
# %%
303+
gp.compute_model(geo_model)
304+
# %%
305+
live_plot = gp.plot_3d(geo_model, plotter_type="background", show_results=True)
306+
# %%
307+
plot_pyvista_to_notebook(live_plot.p)
308+
# %%
309+
live_plot.toggle_live_updating()
310+
# %% md
311+
# ### Trying to fix the model: Multiple Geo. Features/Series
312+
# %%
313+
geo_model.add_features("Formations")
314+
# %%
315+
geo_model.map_stack_to_surfaces({"Form1": ["etchegoin", "macoma"]}, set_series=False)
316+
# %%
317+
geo_model.add_orientations(X=300000, Y=3930000, Z=0, surface="etchegoin", pole_vector=(0, 0, 1), idx=1)
318+
# %%
319+
gp.compute_model(geo_model)
320+
# %%
321+
h3d = gp.plot_3d(geo_model, plotter_type="background", show_lith=False, ve=5)
322+
# %%
323+
plot_pyvista_to_notebook(h3d.p)
324+
# %% md
325+
# ## Last layers for today
326+
# %%
327+
geo_model.add_surfaces(formations[3:5])
328+
# %%
329+
f_last = formations[:4]
330+
f_last
331+
# %%
332+
gempy_surface_points = get_interface_coord_from_surfaces(f_last)
333+
geo_model.set_surface_points(gempy_surface_points, update_surfaces=False)
334+
geo_model.update_to_interpolator()
335+
# %%
336+
gp.compute_model(geo_model)
337+
# %%
338+
p3d_4 = gp.plot_3d(geo_model, plotter_type="background", show_lith=False, ve=5)
339+
# %%
340+
plot_pyvista_to_notebook(p3d_4.p)
341+
# %%
342+
geo_model.add_orientations(X=321687.059770, Y=3.945955e+06, Z=0, surface="etchegoin", pole_vector=(0, 0, 1), idx=1)
343+
gp.compute_model(geo_model)
344+
p3d_4.plot_surfaces()
345+
# %%
346+
geo_model.add_orientations(X=277278.652995, Y=3.929298e+06, Z=0, surface="etchegoin", pole_vector=(0, 0, 1), idx=2)
347+
gp.compute_model(geo_model)
348+
p3d_4.plot_surfaces()
349+
# %% md
350+
# ## Adding many more orientations
351+
# %%
352+
# find neighbours
353+
neighbours = gp.select_nearest_surfaces_points(geo_model, geo_model._surface_points.df, 2)
354+
355+
# calculate all fault orientations
356+
new_ori = gp.set_orientation_from_neighbours_all(geo_model, neighbours)
357+
new_ori.df.head()
358+
# %%
359+
gp.compute_model(geo_model)
360+
# %%
361+
p3d_4.plot_orientations()
362+
p3d_4.plot_surfaces()
363+
# %%
364+
p3d_4.p.add_mesh(pyvista_mesh)
365+
366+
# %%
367+
plot_pyvista_to_notebook(p3d_4.p)
368+
# %% md
369+
# -----
370+
#
371+
# ## Thank you for your attention
372+
#
373+
#
374+
# #### Extra Resources
375+
#
376+
# Page:
377+
# https://www.gempy.org/
378+
#
379+
# Paper:
380+
# https://www.gempy.org/theory
381+
#
382+
# Gitub:
383+
# https://github.com/cgre-aachen/gempy
384+
#
385+
# #### Further training and collaborations
386+
# https://www.terranigma-solutions.com/
387+
#
388+
# ![Terranigma_Logotype_black.png](attachment:200622_Terranigma_Logotype_black.png)
389+
#
390+
#

0 commit comments

Comments
 (0)