@@ -40,16 +40,16 @@ The processing steps of ranging and reconstructing are documented as two special
4040
4141Spatial or other type of filters which are frequently used for atom probe to select specific atom positions
4242or portions of the data based on isotopic identity are modeled as base classes for filters, which are
43- defined atom-probe-agnostic empower reuse:
43+ defined in an atom-probe-agnostic way to enable reuse:
4444
4545:ref: `NXdelocalization `
4646 Base class to describe the delocalization of point-like objects on a grid.
47-
47+
4848:ref: `NXisocontour `
4949 Computational geometry description of isocontouring/phase-fields in Euclidean space.
50-
50+
5151:ref: `NXmatch_filter `
52- Base class to filter ions based on their type or other descriptors like hit multiplicity.
52+ Base class to filter ions based on their type or on other properties like hit multiplicity.
5353
5454:ref: `NXspatial_filter `
5555 Base class to filter based on position. This base class takes advantage of :ref: `NXcg_ellipsoid `,
@@ -68,58 +68,47 @@ defined atom-probe-agnostic empower reuse:
6868Tools and applications in APM
6969#############################
7070
71- There exist several research software tools in the APM community that deal with handling and analyzing APM data.
72-
71+ Several research software tools exist in the APM community to analyze APM data.
7372One of these is the `paraprobe-toolbox <https://paraprobe-toolbox.readthedocs.io/ >`_.
7473The software is developed by `M. Kühbach et al. <https://arxiv.org/abs/2205.13510 >`_.
7574
7675The paraprobe-toolbox is an example of an open-source parallelized software for analyzing
7776point cloud data, for assessing meshes in 3D continuum space, and for studying the effects of
78- parameterization on descriptors of micro- and nanoscale structural features (crystal defects)
79- within materials when characterized and studied with atom probe.
77+ parameterization on descriptors of micro- and nanoscale structural features (crystal defects).
8078
81- There is a set of contributed application definitions describing each computational step in the
82- paraprobe-toolbox. These were added to describe the whole workflow in this particular software,
83- but can also act as a blueprint for how computational steps of other software tools
84- (including commercial ones) could be developed further to benefit from NeXus.
79+ There is a set of contributed application definitions which describe each computational step
80+ performed by tools of the paraprobe-toolbox. This is a blueprint of how NeXus can also be used
81+ to document the computational steps of other software tools (including commercial ones).
8582
8683Thorough documentation of the tools was motivated by several needs:
8784
8885First, users of software would like to better understand and also be able to study for themselves
8986which individual parameters and settings for each tool exist and how configuring these
90- affects analyses quantitatively. This stresses the aspect how to improve documentation.
87+ affects analyses quantitatively. This emphasizes the need for better documentation.
9188
92- Second, scientific software like paraprobe-toolbox implement numerical/algorithmical
93- (computational) workflows whereby data coming from multiple input sources
94- (like previous analysis results) are processed and carried through more involved analyses
95- within several steps inside the tool. The tool then creates output as files. This
96- provenance and workflow should be documented .
89+ Second, scientific software like paraprobe-toolbox implements workflows with numerics
90+ and algorithms that process data from multiple input sources (like previous analysis results),
91+ and carry these data through multiple steps inside the tool. The tool then creates output as files.
92+ This provenance and workflow should be documented to support reproducibility and thereby
93+ reusability (the "R" of the FAIR principles of data stewardship).
9794
98- Individual tools of paraprobe-toolbox are developed in C/C++ and/or Python.
99- Provenance tracking is useful as it is one component and requirement for making
100- workflows exactly numerically reproducible and thus to enable reproducibility
101- (the "R" of the FAIR principles of data stewardship).
95+ Individual tools of the paraprobe-toolbox are developed in C/C++ or Python. Each of these tools
96+ follows a workflow of three steps:
97+ 1. The creation of a configuration file.
98+ 2. The actual analysis using a given Python or C/C++ tool from the toolbox, with results summarized in a results file.
99+ 3. The optional analysis/visualization of the results based on data in NeXus/HDF5 files generated by each tool.
102100
103- For tools of the paraprobe-toolbox each workflow step is a pair or triple of sub-steps:
104- 1. The creation of a configuration file.
105- 2. The actual analysis using a given Python/or C/C++ tool from the toolbox.
106- 3. The optional analyses/visualization of the results based on data in NeXus/HDF5 files generated by each tool.
107-
108- Data and metadata between the tools are exchanged with NeXus/HDF5 files. This means that data
101+ Data and metadata between the tools are exchanged via NeXus/HDF5 files. This means that data
109102inside HDF5 binary containers are named, formatted, and hierarchically structured according
110- to NeXus application definitions.
111-
112- In a refactoring project, within the FAIRmat project, which is part of the `German
113- National Research Data Infrastructure <https://www.nfdi.de/?lang=en> `_, the tools of the
114- paraprobe-toolbox were modified to read from and write data using NeXus application definitions.
103+ to NeXus application definitions. These definitions are specific for each tool:
115104
116105For example, the application definition :ref: `NXapm_paraprobe_surfacer_config ` specifies
117- the expectation how a configuration file for the paraprobe-surfacer tool is formatted
118- and which parameters it contains including optionality and cardinality constraints.
119-
120- Thereby, each config file uses a controlled vocabulary of terms. The config files store
121- SHA256 checksum for each input file, thereby implementing an uninterrupted
122- provenance tracking chain documenting the computational workflow.
106+ the expected data formatting of a configuration file for the paraprobe-surfacer tool.
107+ The application definition defines which parameters are expected, which of these
108+ are optional, and whether specific cardinality constraints exist. Each config file defines
109+ a controlled vocabulary of terms. The config files store a SHA256 checksum for each input file,
110+ thereby implementing an uninterrupted provenance tracking chain that encodes
111+ the computational workflow.
123112
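The checksum bookkeeping described above can be illustrated with a minimal sketch using only the Python standard library. This is not the implementation used by the paraprobe-toolbox: the function names ``sha256_of_file``, ``describe_inputs``, and ``inputs_unchanged`` as well as the dictionary keys are hypothetical and chosen only for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        # Read fixed-size chunks so that large input files need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def describe_inputs(paths):
    """Pair each input file with its checksum (hypothetical config entries)."""
    return [
        {"path": str(path), "algorithm": "sha256", "checksum": sha256_of_file(path)}
        for path in paths
    ]


def inputs_unchanged(entries):
    """Check that every recorded input file still matches its stored checksum."""
    return all(sha256_of_file(entry["path"]) == entry["checksum"] for entry in entries)
```

A tool that reads such entries can call ``inputs_unchanged`` before starting an analysis to detect whether an input file was modified after the configuration was written.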
124113As an example, a user may first range their reconstruction and then compute spatial
125114correlation functions. The config file for the ranging tool stores the files
@@ -131,32 +120,38 @@ imported by the spatial statistics tool which again keeps track of all files
131120and reports its results in a spatial statistics tool results file.
132121
133122This design makes it possible to rigorously trace which numerical results were achieved
134- with specific inputs and settings using specifically-versioned tools. Noteworthy,
135- this includes Y-junction on a graph which is where multiple input sources are
136- combined to generate new results.
123+ with specific inputs and settings using specifically-versioned tools, including
124+ Y-junctions on the workflow graph where multiple input sources are combined.
125+
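As a hedged illustration of tracing such a workflow graph, the following sketch collects every file that a given result depends on, directly or indirectly. The file names and the ``PROVENANCE`` mapping are hypothetical; in practice the mapping would be assembled from the input files recorded in each config and results file. The spatial statistics config in this example is a Y-junction because it combines two input sources.

```python
from collections import deque

# Hypothetical workflow graph: each file maps to the files it was computed from.
PROVENANCE = {
    "ranger.config.nxs": ["reconstruction.apt", "ranging.rrng"],
    "ranger.results.nxs": ["ranger.config.nxs"],
    # Y-junction: two input sources are combined in one configuration.
    "spatstat.config.nxs": ["ranger.results.nxs", "reconstruction.apt"],
    "spatstat.results.nxs": ["spatstat.config.nxs"],
}


def upstream_files(target, provenance):
    """Return all files that target depends on, using a breadth-first traversal."""
    seen = set()
    queue = deque([target])
    while queue:
        current = queue.popleft()
        for parent in provenance.get(current, []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen
```

Calling ``upstream_files("spatstat.results.nxs", PROVENANCE)`` returns the set of all five upstream files, while a raw input such as ``"reconstruction.apt"`` yields an empty set.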
126+ Concepts that are used in multiple tools are inherited from the following
127+ tool-agnostic application definitions:
137128
138- Defining, documenting, using, and sharing application definitions is a useful and future-proof
139- strategy for software development and data analyses as it enables automated provenance
140- tracking working silently in the background.
129+ :ref: `NXapm_paraprobe_tool_config `, :ref: `NXapm_paraprobe_tool_results `:
130+ Configuration and results respectively.
141131
142- In summary, the following application definitions were defined for the paraprobe-toolbox.
143- These are always pairs of application definitions --- one for the configuration (input) side
144- and one for the results (output) side. For each tool one such pair is proposed:
132+ Internally, these inherit from several other tool-agnostic base classes
133+ adding atom-probe-research-specific concepts:
134+
135+ :ref: `NXapm_paraprobe_tool_parameters `, :ref: `NXapm_paraprobe_tool_process `, :ref: `NXapm_paraprobe_tool_common `:
136+ Parameters, processing-specific data, and common parts, respectively, that are useful for the application definitions of the tools of the paraprobe-toolbox.
145137
146138.. _CC-Apm-Paraprobe-Application-Definitions :
147139
148140Application Definitions
149141#######################
150142
143+ In summary, these are the proposed pairs for all tools in the paraprobe-toolbox:
144+
151145:ref: `NXapm_paraprobe_ranger_config `, :ref: `NXapm_paraprobe_ranger_results `
152146 Configuration and results respectively of the paraprobe-ranger tool.
153147 Apply ranging definitions and explore possible molecular ions.
154148 Store applied ranging definitions and combinatorial analyses of possible iontypes.
155149
156150:ref: `NXapm_paraprobe_surfacer_config `, :ref: `NXapm_paraprobe_surfacer_results `
157151 Configuration and results respectively of the paraprobe-surfacer tool.
158- Create a model for the edge of a point cloud via convex hulls, alpha shapes, or alpha-wrappings.
159- Store triangulated surface meshes of models for the edge of a dataset.
152+ Create a model for the edge of a point cloud via convex hulls, alpha shapes,
153+ or alpha-wrappings. Store triangulated surface meshes of models
154+ for the edge of a dataset.
160155
161156:ref: `NXapm_paraprobe_distancer_config `, :ref: `NXapm_paraprobe_distancer_results `
162157 Configuration and results respectively of the paraprobe-distancer tool.
@@ -177,17 +172,19 @@ Application Definitions
177172
178173:ref: `NXapm_paraprobe_nanochem_config `, :ref: `NXapm_paraprobe_nanochem_results `
179174 Configuration and results respectively of the paraprobe-nanochem tool.
180- Compute delocalization, iso-surfaces, analyze 3D objects, composition profiles, and mesh interfaces.
175+ Compute delocalization, iso-surfaces, analyze 3D objects, composition profiles,
176+ and mesh interfaces.
181177
182178:ref: `NXapm_paraprobe_clusterer_config `, :ref: `NXapm_paraprobe_clusterer_results `
183179 Configuration and results respectively of the paraprobe-clusterer tool.
184180 Compute cluster analyses with established machine learning algorithms using CPU or GPUs.
185181
186182:ref: `NXapm_paraprobe_intersector_config `, :ref: `NXapm_paraprobe_intersector_results `
187183 Configuration and results respectively of the paraprobe-intersector tool.
188- Analyze volumetric intersections and proximity of 3D objects discretized as triangulated surface meshes
189- in continuum space to study the effect the parameterization of surface extraction algorithms on the
190- resulting shape, spatial arrangement, and colocation of 3D objects via graph-based techniques.
184+ Analyze volumetric intersections and proximity of 3D objects discretized as
185+ triangulated surface meshes in continuum space to study the effect of the
186+ parameterization of surface extraction algorithms on the resulting shape,
187+ spatial arrangement, and colocation of 3D objects via graph-based techniques.
191188
192189.. _CC-Apm-Paraprobe-German-NFDI :
193190
@@ -197,8 +194,8 @@ Joint work German NFDI consortia NFDI-MatWerk and FAIRmat
197194Members of the FAIRmat and the NFDI-MatWerk consortia of the German National Research Data Infrastructure
198195are working together within the Infrastructure Use Case IUC09 of the NFDI-MatWerk project on examples
199196of how software tools in both consortia can become better documented and more interoperable. Within this project,
200- we have also added the `CompositionSpace tool <https://github.com/eisenforschung/CompositionSpace >`_ by A. Saxena et al. that has been developed at the
201- Max Planck Institute for Sustainable Materials in Düsseldorf
197+ we added the `CompositionSpace tool <https://github.com/eisenforschung/CompositionSpace >`_ by A. Saxena et al. that has been developed at the
198+ Max Planck Institute for Sustainable Materials in Düsseldorf using the above-mentioned approach of pairs of application definitions:
202199
203200:ref: `NXapm_compositionspace_config `, :ref: `NXapm_compositionspace_results `
204201 Results of a run with Alaukik Saxena's composition space tool.