Commit 5fd863a

committed
Merge branch 'master' into fix_clickopt_configfile_default
2 parents 4bd3539 + 2d678b5 commit 5fd863a

File tree

25 files changed

+1389
-123
lines changed

.github/workflows/build_docs.yml

Lines changed: 6 additions & 2 deletions
@@ -10,13 +10,17 @@ env:
1010

1111
permissions:
1212
contents: write
13+
1314
jobs:
1415
deploy:
1516
runs-on: ubuntu-latest
1617

1718
steps:
1819
- name: Checkout Repository
1920
uses: actions/checkout@v4
21+
with:
22+
fetch-depth: 0
23+
submodules: recursive
2024

2125
- name: Configure Git Credentials
2226
run: |
@@ -28,7 +32,7 @@ jobs:
2832
with:
2933
python-version: '3.x'
3034

31-
- name: Cache mkdocs-material enviroment
35+
- name: Cache mkdocs-material environment
3236
uses: actions/cache@v3
3337
with:
3438
key: mkdocs-material-${{ env.cache_id }}
@@ -39,7 +43,7 @@ jobs:
3943
- name: Install Dependencies
4044
run: |
4145
curl -LsSf https://astral.sh/uv/install.sh | sh
42-
uv pip install ".[docs]"
46+
uv pip install --no-cache-dir ".[docs]"
4347
4448
- name: Build and Deploy
4549
run: |

docs/how-tos/build-a-plugin.md

Lines changed: 165 additions & 0 deletions
@@ -0,0 +1,165 @@
1+
# Build your own pynxtools plugin
2+
3+
The pynxtools [dataconverter](https://github.com/FAIRmat-NFDI/pynxtools/tree/master/src/pynxtools/dataconverter) converts experimental data to NeXus/HDF5 files based on any provided [NXDL schemas](https://manual.nexusformat.org/nxdl.html#index-1). Support for additional data formats can be added through extensions called `readers`. There exists a set of [built-in pynxtools readers](https://github.com/FAIRmat-NFDI/pynxtools/tree/master/src/pynxtools/dataconverter/readers) as well as [pynxtools plugins](../reference/plugins.md) that convert supported data files for some experimental techniques into compliant NeXus files.
4+
5+
Is your data not yet supported by the built-in pynxtools readers or the officially supported pynxtools plugins?
6+
7+
Don't worry, the following how-to will guide you through the steps of writing a reader for your own data.
8+
9+
10+
## Getting started
11+
12+
You should start by creating a clean repository that implements the following structure (for a plugin called ```pynxtools-plugin```):
13+
```
pynxtools-plugin
├── .github/workflows
├── docs
│   ├── explanation
│   ├── how-tos
│   ├── reference
│   ├── tutorial
├── src
│   ├── pynxtools_plugin
│       ├── reader.py
├── tests
│   └── data
├── LICENSE
├── mkdocs.yaml
├── dev-requirements.txt
└── pyproject.toml
30+
```
31+
32+
To identify `pynxtools-plugin` as a plugin for pynxtools, an entry point must be established (in the `pyproject.toml` file):
33+
```
34+
[project.entry-points."pynxtools.reader"]
35+
mydatareader = "pynxtools_plugin.reader:MyDataReader"
36+
```
37+
38+
Note that your plugin may also contain multiple readers. In that case, each reader must have its own unique entry point.
39+
40+
Here, we will focus mostly on the `reader.py` file and how to build a reader. For guidelines on how to build the other parts of your plugin, you can have a look here:
41+
42+
- [Documentation writing guide](https://nomad-lab.eu/prod/v1/staging/docs/writing_guide.html)
43+
- [Plugin testing framework](using-pynxtools-test-framework.md)
44+
45+
<!-- Note: There is also a [cookiecutter template](https://github.com/FAIRmat-NFDI/pynxtools-plugin-template) available for creating your own pynxtools plugin, but this is currently not well-maintained.-->
46+
47+
48+
## Writing a Reader
49+
50+
After you have established the main structure, you can start writing your reader. The new reader shall be placed in `reader.py`.
51+
52+
Then implement the reader function:
53+
54+
```python title="reader.py"
"""MyDataReader implementation for the DataConverter to convert mydata to NeXus."""
from typing import Tuple, Any

from pynxtools.dataconverter.readers.base.reader import BaseReader

class MyDataReader(BaseReader):
    """MyDataReader implementation for the DataConverter to convert mydata to NeXus."""

    supported_nxdls = [
        "NXmynxdl" # this needs to be changed during implementation.
    ]

    def read(
        self,
        template: dict = None,
        file_paths: Tuple[str] = None,
        objects: Tuple[Any] = None
    ) -> dict:
        """Reads data from given file and returns a filled template dictionary"""
        # Here, you must provide functionality to fill the template, see below.
        # Example:
        # template["/entry/instrument/name"] = "my_instrument"

        return template


# This has to be set to allow the convert script to use this reader. Set it to "MyDataReader".
READER = MyDataReader

84+
```
85+
### The reader template dictionary
86+
87+
The read function takes a [`Template`](https://github.com/FAIRmat-NFDI/pynxtools/blob/master/src/pynxtools/dataconverter/template.py) dictionary, which is used to map from the measurement (meta)data to the concepts defined in the NeXus application definition. The template contains keys that match the concepts in the provided NXDL file.
88+
89+
The returned template dictionary should contain keys that exist in the template as defined below. The values of these keys have to be data objects to populate the output NeXus file.
90+
They can be lists, numpy arrays, numpy bytes, numpy floats, numpy ints, and so on. In practice, you can pass any value that can be handled by the `h5py` package.
91+
92+
Example for a template entry:
93+
94+
```json
95+
{
96+
"/entry/instrument/source/type": "None"
97+
}
98+
```
99+
100+
For a given NXDL schema, you can generate an empty template with the command
101+
```console
102+
user@box:~$ dataconverter generate-template --nxdl NXmynxdl
103+
```
104+
105+
#### Naming of groups
106+
If the NXDL does not define a `name` for the group the requested data belongs to, the template dictionary will list it as `/NAME_IN_NXDL[name_in_output_nexus]`. You can choose any name you prefer instead of the suggested `name_in_output_nexus` (see [here](../learn/nexus-rules.md) for the naming conventions). This allows the reader function to write several instances of a group defined in the NXDL to the output NeXus file.
107+
108+
```json
109+
{
110+
"/ENTRY[my_entry]/INSTRUMENT[my_instrument]/SOURCE[my_source]/type": "None"
111+
}
112+
```
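
As a rough illustration of repeating a group, the sketch below writes one `SOURCE` group per item by choosing a different name inside the square brackets. A plain `dict` stands in for the template object here, and the helper `add_sources`, the group names, and the values are all invented for this example.

```python
def add_sources(template: dict, source_types: dict) -> dict:
    """Add one SOURCE group per item, using the dict key as the group name."""
    for name, source_type in source_types.items():
        # The upper-case part is the concept from the NXDL,
        # the part in brackets is the name written to the output file.
        key = f"/ENTRY[my_entry]/INSTRUMENT[my_instrument]/SOURCE[{name}]/type"
        template[key] = source_type
    return template


# Hypothetical names and values, for illustration only.
template = add_sources({}, {"source_probe": "type A", "source_pump": "type B"})
for key, value in template.items():
    print(key, "=", value)
```

Both keys refer to the same `SOURCE` concept in the NXDL but end up as two separately named groups in the output.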
113+
114+
#### Attributes
115+
For attributes defined in the NXDL, the reader template dictionary will have the associated key with a "@" prefix to the attribute's name at the end of the path:
116+
117+
```json
118+
{
119+
"/entry/instrument/source/@attribute": "None"
120+
}
121+
```
122+
123+
#### Units
124+
If there is a field defined in the NXDL, the converter expects a filled-in `/data/@units` entry in the template dictionary corresponding to the right `/data` field, unless the field is specified as `NX_UNITLESS` in the NXDL. Otherwise, a warning will be shown.
125+
126+
```json
127+
{
128+
"/ENTRY[my_entry]/INSTRUMENT[my_instrument]/SOURCE[my_source]/data": "None",
129+
"/ENTRY[my_entry]/INSTRUMENT[my_instrument]/SOURCE[my_source]/data/@units": "Should be set to a string value"
130+
}
131+
```
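
Since a field and its `@units` entry belong together, it can be convenient to set both in one step. The helper below is only a sketch: `set_field_with_units` is a made-up name, a plain `dict` stands in for the template, and the values and the unit are placeholders.

```python
def set_field_with_units(template: dict, path: str, value, units: str) -> None:
    """Write a field value and its matching '@units' entry into the template."""
    template[path] = value
    # The units key is the field path followed by '/@units'.
    template[f"{path}/@units"] = units


template = {}
set_field_with_units(
    template,
    "/ENTRY[my_entry]/INSTRUMENT[my_instrument]/SOURCE[my_source]/data",
    [1.0, 2.0, 3.0],  # placeholder values
    "eV",  # placeholder unit
)
print(template)
```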
132+
133+
#### Links
134+
You can also define links by setting the value to a sub-dictionary with the key `link`:
135+
136+
```python
137+
template["/entry/instrument/source"] = {"link": "/path/to/source/data"}
138+
```
139+
140+
### Building off of the BaseReader
141+
When building off the [`BaseReader`](https://github.com/FAIRmat-NFDI/pynxtools/blob/master/src/pynxtools/dataconverter/readers/base/reader.py), the developer has the most flexibility. Any new reader must implement the `read` function, which must return a filled template object.
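
To give an idea of what the body of `read` might look like, here is a minimal sketch written as a standalone function so that it runs without pynxtools installed. It assumes a hypothetical JSON input file with an `instrument_name` key; the function name `fill_template_from_json` and the template path are likewise invented for illustration, and a plain `dict` stands in for the template.

```python
import json
from pathlib import Path
from typing import Tuple


def fill_template_from_json(template: dict, file_paths: Tuple[str, ...]) -> dict:
    """Copy selected metadata from JSON input files into the template."""
    for file_path in file_paths:
        if Path(file_path).suffix.lower() != ".json":
            continue  # skip files this sketch does not understand
        with open(file_path, encoding="utf-8") as handle:
            metadata = json.load(handle)
        # 'instrument_name' is a hypothetical key in the input file.
        if "instrument_name" in metadata:
            template["/ENTRY[entry]/INSTRUMENT[instrument]/name"] = metadata[
                "instrument_name"
            ]
    return template
```

Inside a reader class, `read` could then simply end with `return fill_template_from_json(template, file_paths)`. Keeping the parsing in a separate function like this also makes it easier to test without running the full converter.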
142+
143+
144+
### Building off of the MultiFormatReader
145+
While building on the ```BaseReader``` allows for the most flexibility, in most cases it is desirable to implement a reader that can read in multiple file formats and then populate the template based on the read data. For this purpose, `pynxtools` has the [**`MultiFormatReader`**](https://github.com/FAIRmat-NFDI/pynxtools/blob/master/src/pynxtools/dataconverter/readers/multi/reader.py), which can be readily extended for your own data.
146+
147+
You can find an extensive how-to guide to build off the `MultiFormatReader` [here](./use-multi-format-reader.md).
148+
149+
## Calling the reader from the command line
150+
151+
The dataconverter can be executed using:
152+
```console
153+
user@box:~$ dataconverter --reader mydatareader --nxdl NXmynxdl --output path_to_output.nxs
154+
```
155+
Here, the ``--reader`` flag must match the reader name defined in `[project.entry-points."pynxtools.reader"]` in the `pyproject.toml` file. The NXDL name passed to ``--nxdl`` must be a valid NeXus NXDL/XML file in `pynxtools.definitions`.
156+
157+
Aside from this default structure, there are many more flags that can be passed to the
158+
dataconverter call. Here is its API:
159+
::: mkdocs-click
160+
:module: pynxtools.dataconverter.convert
161+
:command: convert_cli
162+
:prog_name: dataconverter
163+
:depth: 2
164+
:style: table
165+
:list_subcommands: True

docs/how-tos/media/mock_data.png

47.4 KB
49.6 KB
