paper/paper.md
11 additions & 11 deletions (+11 -11)
@@ -139,33 +139,33 @@ Achieving FAIR (Findable, Accessible, Interoperable, and Reproducible) data prin
# Dataconverter and validation
-The __dataconverter__, core module of pynxtools, combines instrument output files and data from electronic lab notebooks into NeXus-compliant HDF5 files. The converter performs three key operations: reading experimental data through specialized readers, validating against NeXus application definitions to ensure compliance with existence, shape, and format constraints, and writing valid NeXus/HDF5 output files.
+The _dataconverter_, core module of pynxtools, combines instrument output files and data from electronic lab notebooks into NeXus-compliant HDF5 files. The converter performs three key operations: reading experimental data through specialized readers, validating against NeXus application definitions to ensure compliance with existence, shape, and format constraints, and writing valid NeXus/HDF5 output files.
-The __dataconverter__ provides a CLI to produce NeXus files where users can use one of the built-in readers for generic functionality or technique-specific reader plugins, distributed as separate Python packages.
+The _dataconverter_ provides a CLI to produce NeXus files where users can use one of the built-in readers for generic functionality or technique-specific reader plugins, distributed as separate Python packages.
-For developers, the __dataconverter__ provides an abstract __reader__ class for building plugins that process experiment-specific formats and populate the NeXus specification. It passes a __Template__, a subclass of Python’s dict, to the __reader__ as a form to fill. The __Template__ ensures structural compliance with the chosen NeXus application definition and organizes data by NeXus's required, recommended, and optional levels.
+For developers, the _dataconverter_ provides an abstract _reader_ class for building plugins that process experiment-specific formats and populate the NeXus specification. It passes a _Template_, a subclass of Python’s dict, to the _reader_ as a form to fill. The _Template_ ensures structural compliance with the chosen NeXus application definition and organizes data by NeXus's required, recommended, and optional levels.
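The "form to fill" idea in the paragraph above can be illustrated with a minimal sketch. It is not the pynxtools implementation: `SketchTemplate`, `Level`, the `read` function, and all paths are hypothetical names chosen for illustration, and rejecting unknown paths outright is a simplifying assumption.

```python
from enum import Enum


class Level(Enum):
    REQUIRED = "required"
    RECOMMENDED = "recommended"
    OPTIONAL = "optional"


class SketchTemplate(dict):
    """Hypothetical stand-in for a Template: a dict that only accepts known paths."""

    def __init__(self, schema):
        # schema maps every allowed path to its Level (an assumption of this sketch)
        super().__init__()
        self.schema = dict(schema)

    def __setitem__(self, path, value):
        # Simplification: refuse paths the application definition does not know.
        if path not in self.schema:
            raise KeyError(f"{path!r} is not defined by the application definition")
        super().__setitem__(path, value)

    def by_level(self, level):
        """Return the filled entries that belong to one level."""
        return {p: v for p, v in self.items() if self.schema[p] is level}

    def missing_required(self):
        """Return required paths that the reader has not filled yet."""
        return sorted(
            p for p, lvl in self.schema.items()
            if lvl is Level.REQUIRED and p not in self
        )


def read(template):
    """Hypothetical reader: fills the form it is handed and returns it."""
    template["/ENTRY/title"] = "demo scan"
    template["/ENTRY/INSTRUMENT/name"] = "demo instrument"
    return template


schema = {
    "/ENTRY/title": Level.REQUIRED,
    "/ENTRY/start_time": Level.REQUIRED,
    "/ENTRY/INSTRUMENT/name": Level.RECOMMENDED,
    "/ENTRY/notes": Level.OPTIONAL,
}
filled = read(SketchTemplate(schema))
```

In this sketch, `filled.missing_required()` lists the required paths the reader left empty, which is the kind of information a later validation step would need.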
-The __dataconverter__ validates __reader__ output against the selected NeXus application definition, checking required fields, complex dependencies (like inheritance and nested group rules), and data integrity (type, shape, constraints). It reports errors for invalid required fields and emits CLI warnings for unmatched or invalid data, aiding practical NeXus file creation.
+The _dataconverter_ validates _reader_ output against the selected NeXus application definition, checking required fields, complex dependencies (like inheritance and nested group rules), and data integrity (type, shape, constraints). It reports errors for invalid required fields and emits CLI warnings for unmatched or invalid data, aiding practical NeXus file creation.
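The split between errors and warnings described above might look roughly like the following sketch. It covers only existence, type, and shape checks on plain Python values. Inheritance and nested group rules are left out, and `FieldRule`, `validate`, and the example paths are hypothetical names, not part of pynxtools.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class FieldRule:
    """Hypothetical rule for one field of an application definition."""
    dtype: type
    required: bool = False
    shape: Optional[Tuple[int, ...]] = None  # None means "any shape"


def shape_of(value):
    """Shape of a (possibly nested) list; scalars have shape ()."""
    shape = []
    while isinstance(value, list):
        shape.append(len(value))
        value = value[0] if value else None
    return tuple(shape)


def leaves(value):
    """Yield every scalar inside a (possibly nested) list."""
    if isinstance(value, list):
        for item in value:
            yield from leaves(item)
    else:
        yield value


def validate(data, rules):
    """Return (errors, warnings) for reader output checked against the rules."""
    errors, warnings = [], []
    for path, rule in rules.items():
        if path not in data:
            if rule.required:
                errors.append(f"missing required field {path}")
            continue
        value, problems = data[path], []
        if not all(isinstance(x, rule.dtype) for x in leaves(value)):
            problems.append(f"{path}: expected {rule.dtype.__name__}")
        if rule.shape is not None and shape_of(value) != rule.shape:
            problems.append(f"{path}: expected shape {rule.shape}, got {shape_of(value)}")
        # Invalid required fields are errors; other invalid data only warns.
        (errors if rule.required else warnings).extend(problems)
    for path in data:
        if path not in rules:
            warnings.append(f"{path}: not defined in the application definition")
    return errors, warnings


rules = {
    "/ENTRY/title": FieldRule(str, required=True),
    "/ENTRY/data": FieldRule(float, required=True, shape=(2, 2)),
    "/ENTRY/notes": FieldRule(str),
}
data = {
    "/ENTRY/data": [[1.0, 2.0], [3.0, 4.0]],
    "/ENTRY/notes": 42,       # wrong type on an optional field
    "/ENTRY/extra": "x",      # not part of the rules
}
errors, warnings = validate(data, rules)
```

Here the missing title ends up in `errors`, while the mistyped optional note and the unmatched extra field only produce entries in `warnings`.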
All reader plugins are tested using the pynxtools.testing suite, which runs automatically via GitHub CI to ensure compatibility with the dataconverter, the NeXus specification, and integration across plugins.
-The dataconverter includes an ELN generator that creates either a fillable YAML file or a NOMAD [@Scheidgen:2023] ELN schema based on a selected NeXus application definition.
+The dataconverter includes an ELN generator that creates either a fillable `YAML` file or a `NOMAD`[@Scheidgen:2023] ELN schema based on a selected NeXus application definition.
# NeXus reader and annotator
-__read_nexus__ enables semantic access to NeXus files by linking data items to NeXus concepts, allowing applications to locate relevant data without hardcoding file paths. It supports concept-based queries that return all data items associated with a specific NeXus Vocabulary term. Each data item is annotated by traversing its group path and resolving its corresponding NeXus concept, including inherited definitions.
+_read_nexus_ enables semantic access to NeXus files by linking data items to NeXus concepts, allowing applications to locate relevant data without hardcoding file paths. It supports concept-based queries that return all data items associated with a specific NeXus Vocabulary term. Each data item is annotated by traversing its group path and resolving its corresponding NeXus concept, including inherited definitions.
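As a rough illustration of concept-based lookup, the sketch below resolves each data item to a concept by walking its group path and then answers a query by concept instead of by file path. The file layout, the concept string format, and the function names are assumptions made for this example. Inherited definitions are not modelled, and none of this is the actual `read_nexus` code.

```python
from collections import defaultdict

# Hypothetical file layout: group path -> NX_class attribute of that group.
GROUP_CLASSES = {
    "/entry": "NXentry",
    "/entry/instrument": "NXinstrument",
    "/entry/instrument/detector_a": "NXdetector",
    "/entry/instrument/detector_b": "NXdetector",
}

# Hypothetical data items (datasets) stored in the file.
DATA_ITEMS = [
    "/entry/title",
    "/entry/instrument/detector_a/data",
    "/entry/instrument/detector_b/data",
]


def resolve_concept(path):
    """Walk the group path and replace each group name by its NeXus class."""
    parts = path.strip("/").split("/")
    resolved, current = [], ""
    for name in parts[:-1]:
        current += "/" + name
        resolved.append(GROUP_CLASSES[current])
    resolved.append(parts[-1])  # the field name itself stays unchanged
    return "/".join(resolved)


def build_index(items):
    """Map every concept to all data items that resolve to it."""
    index = defaultdict(list)
    for path in items:
        index[resolve_concept(path)].append(path)
    return index


index = build_index(DATA_ITEMS)
# Query by concept: both detectors are found without hardcoding their names.
detector_data = index["NXentry/NXinstrument/NXdetector/data"]
```

The point of the design is that an application asks for the concept and receives every matching item, however the groups in a particular file happen to be named.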
Items not part of the NeXus schema are explicitly marked as such, aiding in validation and debugging. Targeted documentation of individual data items is supported through path-specific annotation. The tool also identifies and summarizes the file’s default plottable data based on the NXdata definition.
-# NOMAD integration
+# `NOMAD` integration
-While pynxtools works as a standalone tool, it can also be integrated directly into Research Data Management Systems (RDMS). Out of the box, the package functions as a plugin within the NOMAD platform [@Scheidgen:2023; @Draxl:2019]. This enables data in the NeXus format to be integrated into NOMAD's metadata model, making it searchable and interoperable with other data from theory and experiment. The plugin consists of several key components (so-called entry points):
+While pynxtools works as a standalone tool, it can also be integrated directly into Research Data Management Systems (RDMS). Out of the box, the package functions as a plugin within the `NOMAD` platform [@Scheidgen:2023; @Draxl:2019]. This enables data in the NeXus format to be integrated into `NOMAD`'s metadata model, making it searchable and interoperable with other data from theory and experiment. The plugin consists of several key components (so-called entry points):
-pynxtools extends NOMAD's data schema (called __Metainfo__[@Ghiringhelli:2017]) by integrating NeXus definitions as a NOMAD Schema Package, adding NeXus-specific quantities and enabling interoperability through links to other standardized data representations in NOMAD. The __dataconverter__ is integrated into NOMAD, making the conversion of data to NeXus accessible via the NOMAD GUI. The __dataconverter__ also processes manually entered NOMAD ELN data in the conversion.
+pynxtools extends `NOMAD`'s data schema (called _Metainfo_[@Ghiringhelli:2017]) by integrating NeXus definitions as a `NOMAD` Schema Package, adding NeXus-specific quantities and enabling interoperability through links to other standardized data representations in `NOMAD`. The _dataconverter_ is integrated into `NOMAD`, making the conversion of data to NeXus accessible via the `NOMAD` GUI. The _dataconverter_ also processes manually entered `NOMAD` ELN data in the conversion.
-The NOMAD Parser module in pynxtools (__NexusParser__) extracts structured data from NeXus HDF5 files to populate NOMAD with __Metainfo__ object instances as defined by the pynxtools schema package. This enables ingestion of NeXus data directly into NOMAD. Parsed data is post-processed using NOMAD's Normalization pipeline. This includes automatic handling of units, linking references (including sample and instrument identifiers defined elsewhere in NOMAD), and populating derived quantities needed for advanced search and visualization.
+The `NOMAD` Parser module in pynxtools (_NexusParser_) extracts structured data from NeXus HDF5 files to populate `NOMAD` with _Metainfo_ object instances as defined by the pynxtools schema package. This enables ingestion of NeXus data directly into `NOMAD`. Parsed data is post-processed using `NOMAD`'s Normalization pipeline. This includes automatic handling of units, linking references (including sample and instrument identifiers defined elsewhere in `NOMAD`), and populating derived quantities needed for advanced search and visualization.
-pynxtools contains an integrated Search Application for NeXus data within NOMAD, powered by Elasticsearch [@elasticsearch:2025]. This provides a search dashboard whereby users can efficiently filter uploaded data based on parameters like experiment type, upload timestamp, and domain- and technique-specific quantities. The entire pynxtools workflow (conversion, parsing, and normalization) is exemplified in a representative NOMAD Example Upload that is shipped with the package. This example helps new users understand the workflow and serves as a template to adapt the plugin to new NeXus applications.
+pynxtools contains an integrated Search Application for NeXus data within `NOMAD`, powered by `Elasticsearch`[@elasticsearch:2025]. This provides a search dashboard whereby users can efficiently filter uploaded data based on parameters like experiment type, upload timestamp, and domain- and technique-specific quantities. The entire pynxtools workflow (conversion, parsing, and normalization) is exemplified in a representative `NOMAD` Example Upload that is shipped with the package. This example helps new users understand the workflow and serves as a template to adapt the plugin to new NeXus applications.
# Funding
The work is funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) - 460197019 (FAIRmat).