
Commit 555e38e

Merge pull request #402 from FAIRmat-NFDI/nexus_validation_tutorial
separated validation and creation tutorial, fixed several typos and language, restructured links and headings for TOC sidebar
2 parents 23927ac + af3dd95 commit 555e38e

14 files changed

+1296
-9
lines changed
Lines changed: 246 additions & 0 deletions
@@ -0,0 +1,246 @@
1+
# Use Python to create NeXus files
2+
3+
__The goal__
4+
5+
Use Python to create a NeXus file (.nxs) by hardcoding its content with the Python package `h5py`. NeXus files can also be created automatically by our software [`pynxtools`](https://github.com/FAIRmat-NFDI/pynxtools), but ONLY IF a reader for the specific device/instrument/data structure exists. This How-To is intended as an easy entry point to FAIR data structures _via_ NeXus. For static data structures (i.e., always the same type of standard measurement) or one-time examples (small data publications), this may be a feasible solution. For large-scale automated file processing, storage, and validation, it is advisable to use [`pynxtools`](https://github.com/FAIRmat-NFDI/pynxtools) and its measurement-method-specific [plugins](../reference/plugins.md).
6+
7+
You can find the necessary file downloads [here](https://zenodo.org/records/13373909).
8+
9+
10+
11+
## Create a NeXus file with Python
12+
13+
Install `h5py` via `pip`:
14+
```console
15+
pip install h5py
16+
```
17+
18+
Then you can create a NeXus file with the Python script [h5py_nexus_file_creation.py](https://zenodo.org/records/13373909/files/h5py_nexus_file_creation.py?download=1).
19+
20+
```
21+
# Import h5py, to write an hdf5 file
22+
import h5py
23+
24+
# create a h5py file in writing mode with given name "NXopt_minimal_example", file extension "nxs"
25+
f = h5py.File("NXopt_minimal_example.nxs", "w")
26+
27+
# there are only 3 fundamental objects: >group<, >attribute< and >datafield<.
28+
29+
30+
# create a >group< called "entry"
31+
f.create_group('/entry')
32+
33+
# assign the >group< called "entry" an >attribute<
34+
# The attribute is called "NX_class"; its value, the NeXus class of this group, is "NXentry"
35+
f['/entry'].attrs['NX_class'] = 'NXentry'
36+
37+
# create >datafield< called "definition" inside the entry, and assign it the value "NXoptical_spectroscopy"
38+
# This field is important, as it is used in validation process to identify the NeXus definition.
39+
f['/entry/definition'] = 'NXoptical_spectroscopy'
40+
```
41+
42+
This provides a starting point for the NeXus file. We will go through these functions in the following.
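
To check what the script has written so far, the file can be read back and its structure listed. The following is a minimal sketch and not part of the downloadable script: the helper name `list_structure` is hypothetical, and the example rebuilds the minimal file in a temporary directory so that it runs on its own.

```python
import os
import tempfile

import h5py


def list_structure(path):
    """Return one text line per group, datafield, and attribute in the file."""
    lines = []

    def visit(name, obj):
        # h5py calls this once for every group and dataset below the root
        kind = "group" if isinstance(obj, h5py.Group) else "datafield"
        lines.append(f"/{name} ({kind})")
        for key, value in obj.attrs.items():
            lines.append(f"/{name}@{key} = {value}")

    with h5py.File(path, "r") as f:
        f.visititems(visit)
    return lines


# Rebuild the minimal example in a temporary directory (illustrative path).
example_path = os.path.join(tempfile.mkdtemp(), "NXopt_minimal_example.nxs")
with h5py.File(example_path, "w") as f:
    f.create_group("/entry")
    f["/entry"].attrs["NX_class"] = "NXentry"
    f["/entry/definition"] = "NXoptical_spectroscopy"

for line in list_structure(example_path):
    print(line)
```

The `with` statement closes the file automatically, which is a convenient alternative to keeping the handle `f` open while the script grows.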
43+
44+
45+
46+
## Add NeXus concepts with Python
47+
48+
Go to [FAIRmat NeXus definitions](<https://fairmat-nfdi.github.io/nexus_definitions/index.html#>)
49+
50+
Scroll down until you see the search box named "Quick search".
51+
52+
Type "NXoptical" and start the search.
53+
54+
You will see several search results. Select the one named "NXoptical\_spectroscopy".
55+
56+
Then you are (ideally) on this page: [NXoptical_spectroscopy NeXus definition](<https://fairmat-nfdi.github.io/nexus_definitions/classes/contributed_definitions/NXoptical_spectroscopy.html>)
57+
58+
You see a tree-like structure of the NeXus definition NXoptical\_spectroscopy with several tree nodes: Status, Description, Symbols, Groups\_cited, Structure. For now, only the part in Structure is of interest. It contains the information that has to be written in the Python code to add fields/groups/attributes to the NeXus file.
59+
60+
Use your browser search (CTRL+F) and search for "required". Ideally, your browser highlights all concepts which are required. You have to add those to the Python script to extend your created .nxs file. (Which fields/groups/attributes are "required" was defined by the respective scientific community, to ensure that the data complies with the FAIR principles.)
61+
62+
The following sections show how the Python script has to be extended for the three fundamental objects:
63+
64+
1. Attribute
65+
66+
2. Datafield
67+
68+
3. Group
69+
70+
71+
72+
73+
74+
### Adding an attribute
75+
76+
Search for the first concept/object of the NeXus definition which has not been created in the file yet. It is:
77+
78+
**@version**: (required) [NX\_CHAR](<https://fairmat-nfdi.github.io/nexus_definitions/nxdl-types.html#nx-char>) [⤆](<https://fairmat-nfdi.github.io/nexus_definitions/classes/base_classes/NXentry.html#nxentry-definition-version-attribute>)
79+
80+
1. It is located in the tree at position: ENTRY/definition/
81+
82+
2. The "@" indicates that this is an attribute of the concept "definition".
83+
84+
3. The name of the attribute is "version".
85+
86+
4. Since it is "required", this attribute has to be added so that the resulting NeXus file is compliant with the NeXus definition "NXoptical\_spectroscopy".
87+
88+
5. The "NX\_CHAR" indicates the datatype. This should be a string: "The preferred string representation is UTF-8" (for more information, see [here](<https://manual.nexusformat.org/nxdl-types.html>)).
89+
90+
![image.png](<./attachments/51dc82f9f0f5ec2f-image.png>)
91+
92+
Now the Python script has to be extended as follows:
93+
94+
```
95+
f['/entry/definition'].attrs['version'] = 'v2024.02'
96+
```
97+
98+
This h5py command adds the attribute named "version" with the value "v2024.02" to the HDF5 dataset called "/entry/definition". The same is done for the URL attribute:
99+
100+
```
101+
f['/entry/definition'].attrs['URL'] = 'https://github.com/FAIRmat-NFDI/nexus_definitions/blob/f75a29836431f35d68df6174e3868a0418523397/contributed_definitions/NXoptical_spectroscopy.nxdl.xml'
102+
```
103+
104+
For your use case, you may want to use a different version of the NeXus definitions, since they change over time. In the following, it is shown where to obtain the correct version and URL.
105+
106+
__Get the values: *version* and *URL*__
107+
108+
At the time you create the NeXus file, go to the page of the NeXus definition you are using, e.g. [NXoptical_spectroscopy](<https://fairmat-nfdi.github.io/nexus_definitions/classes/contributed_definitions/NXoptical_spectroscopy.html>)
109+
110+
Scroll down until you find "**NXDL Source**:" and follow this link, e.g. [NXoptical_spectroscopy.nxdl.xml](<https://github.com/FAIRmat-NFDI/nexus_definitions/blob/fairmat/contributed_definitions/NXoptical_spectroscopy.nxdl.xml>)
111+
112+
This is the GitHub page on which the latest (FAIRmat) NeXus definition of NXoptical\_spectroscopy is stored as a NeXus Definition Language file (.nxdl). The information is structured in the XML format.
113+
114+
Now you have to copy the permalink of this file. Go to the top right of the page and find the menu with the three dots:
115+
116+
![image.png](<./attachments/c6ab2f4b925aed27-image.png>)
117+
118+
Copy the permalink and insert it as value for the "URL" attribute (Step 1, Red box in the image)
119+
120+
Go to "nexus\_definitions" (Step 2, Red box in the image)
121+
122+
![image.png](<./attachments/d8e727b3b32dcbb9-image.png>)
123+
124+
On the right side, below "Releases", you should see the "tags" (red box in the image). Follow this link.
125+
126+
Copy the latest tag, which should look similar to "v2024.02". Insert it as value for the "version" attribute.
127+
128+
__Disclaimer__
129+
When specifying this version tag, it would be better to include the "GitHub commit id" as well. A [pynxtools generated version tag](https://github.com/FAIRmat-NFDI/pynxtools/blob/c13716915bf8f69068c3b94d1423681b580fd437/src/pynxtools/_build_wrapper.py#L17) does this and might look like this:
130+
`v2022.07.post1.dev1278+g1d7000f4`. If you have pynxtools installed, you can get the tag by:
131+
132+
```python
133+
>>> from pynxtools import get_nexus_version
134+
>>> get_nexus_version()
135+
'v2022.07.post1.dev1284+gf75a2983'
136+
```
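
Such a tag combines the release tag and a short commit id in one string. The following sketch separates the two parts, assuming the tag has the shape shown above (release, an optional `.postN.devM` part, then `+g<commit>`). The function name `split_version_tag` and the regular expression are illustrative and not part of pynxtools.

```python
import re

# Assumed shape: v<year>.<month>[.postN.devM][+g<short commit id>]
TAG_PATTERN = re.compile(
    r"^(?P<release>v\d{4}\.\d{2})(?P<extra>[^+]*)(?:\+g(?P<commit>[0-9a-f]+))?$"
)


def split_version_tag(tag):
    """Return (release, commit) for a version tag; commit is None if absent."""
    match = TAG_PATTERN.match(tag)
    if match is None:
        raise ValueError(f"unexpected version tag format: {tag!r}")
    return match.group("release"), match.group("commit")


print(split_version_tag("v2022.07.post1.dev1284+gf75a2983"))
print(split_version_tag("v2024.02"))
```

In the example above, the commit part `f75a2983` is the beginning of the commit hash that appears in the permalink used for the "URL" attribute, so the two values point to the same state of the definitions.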
137+
138+
139+
140+
### Adding a datafield
141+
142+
Two attributes were added to "ENTRY/definition", both of which were required. By now, this part of the NeXus file fulfills the requirements of the application definition NXoptical\_spectroscopy.
143+
144+
The next required concept of [NXoptical_spectroscopy](https://fairmat-nfdi.github.io/nexus_definitions/classes/contributed_definitions/NXoptical_spectroscopy.html) is "**experiment\_type**".
145+
146+
**experiment\_type**: (required) [NX\_CHAR](<https://fairmat-nfdi.github.io/nexus_definitions/nxdl-types.html#nx-char>)
147+
148+
1. It is located in the tree at position: ENTRY/
149+
150+
2. There is no "@" in front of "**experiment\_type**". So, this may be a group or a datafield.
151+
152+
3. The name of this group/datafield is "**experiment\_type**".
153+
154+
4. The "required" indicates that this group/datafield has to be added to be in line with the NeXus definition "NXoptical\_spectroscopy".
155+
156+
5. The "NX\_CHAR" indicates the datatype. This should be a string: "The preferred string representation is UTF-8" (for more information, see [here](<https://manual.nexusformat.org/nxdl-types.html>)).
157+
158+
6. The "NX\_CHAR" indicates that this is a datafield. It is NOT a group.
159+
A group is a NeXus class. "NXentry" is for example a NeXus class, while "NX_CHAR" indicates the datatype of the field.
160+
Whether the underscore "_" is present after "NX" therefore indicates whether it is the datatype of a datafield (with underscore) or a NeXus class (without underscore).
161+
162+
Read the documentation at "▶ Specify the type of the optical experiment. ..." by expanding it with a click on the triangle symbol. You should see something like this:
163+
164+
![image.png](<./attachments/5cbd8c6a1ca227df-image.png>)
165+
166+
The value of the datafield has to be one of the items in the list shown there, since it is an enumeration (e.g. "transmission spectroscopy"). Note that the values are case sensitive.
167+
168+
Therefore, the Python script has to be extended by:
169+
170+
```
171+
f['/entry/experiment_type'] = 'transmission spectroscopy'
172+
```
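
Because the value is case sensitive, it can help to check it before writing it to the file. The sketch below is one possible way to do this. The set `ALLOWED_EXPERIMENT_TYPES` and the function `check_experiment_type` are hypothetical names, and the set is deliberately incomplete: only "transmission spectroscopy" is taken from this How-To, the remaining items have to be copied from the enumeration on the definition page.

```python
# Fill this set from the enumeration shown on the definition page.
# Only "transmission spectroscopy" is taken from this How-To.
ALLOWED_EXPERIMENT_TYPES = {"transmission spectroscopy"}


def check_experiment_type(value, allowed=ALLOWED_EXPERIMENT_TYPES):
    """Return value unchanged if it is an allowed enumeration item, else raise."""
    if value in allowed:
        return value
    # Give a hint when only the letter case differs
    close = [item for item in allowed if item.lower() == value.lower()]
    hint = f" Did you mean {close[0]!r}?" if close else ""
    raise ValueError(f"{value!r} is not an allowed experiment_type.{hint}")


experiment_type = check_experiment_type("transmission spectroscopy")
print(experiment_type)
```

The checked value can then be assigned to `f['/entry/experiment_type']` as shown above.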
173+
174+
175+
176+
177+
178+
### Adding a group
179+
180+
The first required group in NXoptical\_spectroscopy on the "ENTRY/" level is **INSTRUMENT**: (required) [NXinstrument](<https://fairmat-nfdi.github.io/nexus_definitions/classes/base_classes/NXinstrument.html#nxinstrument>) [⤆](<https://fairmat-nfdi.github.io/nexus_definitions/classes/base_classes/NXentry.html#nxentry-instrument-group>)
181+
182+
1. It is located in the tree at position: ENTRY/
183+
184+
2. There is no "@" in front of "**INSTRUMENT**", and because "NXinstrument" is a NeXus class, this has to be implemented as a group in the Python script.
185+
186+
3. The "required" indicates that this group has to be added to be in line with the NeXus definition "NXoptical\_spectroscopy".
187+
188+
4. The "NXinstrument" indicates that it is a NeXus class (or group in Python), as it starts with "NX" without a following underscore "_". It is also not listed among the [data types](https://manual.nexusformat.org/nxdl-types.html#data-types-allowed-in-nxdl-specifications).
189+
190+
5. As this is a group, attributes may be assigned to it. A group itself does not hold a value; values are stored in datafields.
191+
192+
6. As this is a group, it can contain many datafields or groups.
193+
194+
7. The uppercase notation of "**INSTRUMENT**" means:
195+
196+
1. You can give INSTRUMENT [almost](https://manual.nexusformat.org/datarules.html) any name, such as "abc" or "Raman\_setup" (see "regex" or regular expression).
197+
198+
2. You can create as many groups with the class NXinstrument as you want. Their names have to be different.
199+
200+
3. For more information see the [NeXus rules](../learn/nexus-rules.md)
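
A chosen group name can be checked before it is used. The sketch below assumes that valid NeXus names follow the regular expression given in the NeXus data rules linked above. The pattern is quoted here from memory and should be verified against that page; the function name `is_valid_group_name` is hypothetical.

```python
import re

# Assumed pattern for valid NeXus item names; verify it against the linked NeXus rules.
VALID_NAME = re.compile(r"^[a-zA-Z0-9_]([a-zA-Z0-9_.]*[a-zA-Z0-9_])?$")


def is_valid_group_name(name):
    """Return True if the whole name matches the assumed NeXus naming pattern."""
    return VALID_NAME.fullmatch(name) is not None


for candidate in ["abc", "Raman_setup", "experiment_setup_1", "my setup", "setup/1"]:
    print(candidate, is_valid_group_name(candidate))
```

Names with spaces or slashes are rejected by this pattern; a slash would in any case be interpreted by HDF5 as a path separator.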
201+
202+
The Python code to implement an NXinstrument class (or, equivalently, a group in Python) with the name "experiment\_setup\_1" is:
203+
204+
```
205+
f.create_group('/entry/experiment_setup_1')
206+
f['/entry/experiment_setup_1'].attrs['NX_class'] = 'NXinstrument'
207+
```
208+
209+
The first line creates the group with the name "experiment\_setup\_1".
210+
211+
The second line assigns this group the attribute with the name "NX\_class" and its value "NXinstrument".
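
Since every NeXus group needs these two steps, they can be wrapped in a small helper. This is only a sketch: the function `create_nx_group` is a hypothetical convenience and not part of `h5py` or `pynxtools`, and the file is written to a temporary directory so that the example runs on its own.

```python
import os
import tempfile

import h5py


def create_nx_group(parent, name, nx_class):
    """Create a group below parent and assign its NX_class attribute."""
    group = parent.create_group(name)
    group.attrs["NX_class"] = nx_class
    return group


# Illustrative file location in a temporary directory
example_path = os.path.join(tempfile.mkdtemp(), "NXopt_minimal_example.nxs")
with h5py.File(example_path, "w") as f:
    entry = create_nx_group(f, "entry", "NXentry")
    instrument = create_nx_group(entry, "experiment_setup_1", "NXinstrument")
    print(instrument.name, instrument.attrs["NX_class"])
```

Passing the parent group instead of a full path keeps nested structures readable, because each call only names the new child.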
212+
213+
214+
215+
216+
217+
### Finishing the NeXus file
218+
219+
This has to be done by using the respective NeXus definition website:
220+
221+
[NXoptical_spectroscopy](<https://fairmat-nfdi.github.io/nexus_definitions/classes/contributed_definitions/NXoptical_spectroscopy.html>)
222+
223+
Search there for all "required" entries. The next required entries are located inside the NXinstrument class:
224+
225+
1. **beam\_TYPE**: (required) [NXbeam](<https://fairmat-nfdi.github.io/nexus_definitions/classes/base_classes/NXbeam.html#nxbeam>) [⤆](<https://fairmat-nfdi.github.io/nexus_definitions/classes/base_classes/NXinstrument.html#nxinstrument-beam-group>)
226+
227+
2. **detector\_TYPE**: (required) [NXdetector](<https://fairmat-nfdi.github.io/nexus_definitions/classes/base_classes/NXdetector.html#nxdetector>) [⤆](<https://fairmat-nfdi.github.io/nexus_definitions/classes/base_classes/NXinstrument.html#nxinstrument-detector-group>)
228+
229+
Both are groups. "**beam\_TYPE**" could be named: "beam\_abc" or "beam\_Raman\_setup". Use the knowledge above to extend the Python script to create those NeXus file entries.
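
A possible way to add these two groups is sketched below. It is not the complete solution: the name "detector\_abc" is chosen here by analogy to "beam\_abc", the file is written to a temporary directory so that the example is self-contained, and further required entries inside these groups still have to be looked up on the definition page.

```python
import os
import tempfile

import h5py

# Illustrative file location in a temporary directory
example_path = os.path.join(tempfile.mkdtemp(), "NXopt_minimal_example.nxs")
f = h5py.File(example_path, "w")

# Parent groups from the previous steps
f.create_group("/entry")
f["/entry"].attrs["NX_class"] = "NXentry"
f.create_group("/entry/experiment_setup_1")
f["/entry/experiment_setup_1"].attrs["NX_class"] = "NXinstrument"

# beam_TYPE and detector_TYPE: keep the lowercase prefix, choose the TYPE part freely
for name, nx_class in [("beam_abc", "NXbeam"), ("detector_abc", "NXdetector")]:
    path = f"/entry/experiment_setup_1/{name}"
    f.create_group(path)
    f[path].attrs["NX_class"] = nx_class

# Close the file so that all content is written to disk
f.close()
```

Closing the file at the end of the script matters when the file is opened without a `with` statement, as in the script of this How-To.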
230+
231+
__Note for required NeXus concepts__
232+
233+
In the definition of NXoptical\_spectroscopy above, you may also find a required entry **depends\_on**: (required) [NX\_CHAR](<https://fairmat-nfdi.github.io/nexus_definitions/nxdl-types.html#nx-char>) [⤆](<https://fairmat-nfdi.github.io/nexus_definitions/classes/contributed_definitions/NXcoordinate_system.html#nxcoordinate-system-depends-on-field>). It is located at the level "ENTRY/reference\_frames/beam\_ref\_frame". If you don't have the group "**beam\_ref\_frame**", which is "optional", then you don't need this field either.
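
This rule (a required entry only matters if its parent exists) can be written as a small check over the paths in your file. The sketch is a simplification under stated assumptions: the function `missing_required` is hypothetical, the list of required paths is maintained by hand from the definition page, and a missing parent is simply treated as optional.

```python
def missing_required(existing_paths, required_paths):
    """Return required paths that are absent although their parent group exists."""
    existing = set(existing_paths)
    missing = []
    for path in required_paths:
        parent = path.rsplit("/", 1)[0]
        # A required entry is only demanded if its parent group is present
        if path not in existing and parent in existing:
            missing.append(path)
    return missing


required = [
    "/entry/experiment_type",
    "/entry/reference_frames/beam_ref_frame/depends_on",
]

# Without the optional beam_ref_frame group, depends_on is not reported
print(missing_required(["/entry", "/entry/definition"], required))

# With the optional group present, depends_on becomes necessary
with_frame = [
    "/entry",
    "/entry/definition",
    "/entry/reference_frames",
    "/entry/reference_frames/beam_ref_frame",
]
print(missing_required(with_frame, required))
```

The existing paths could, for example, be collected from the file with h5py before running the check.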
234+
235+
236+
237+
[_Continue by validating the NeXus file_](validate-nexus-file.md)
238+
239+
## Feedback and contact
240+
241+
1. The best way is to contact the FAIRmat team directly by creating a [GitHub issue](https://github.com/FAIRmat-NFDI/nexus_definitions/issues/new).
242+
243+
2. ron.hildebrandt(at)physik.hu-berlin.de
244+
245+
246+
Lines changed: 80 additions & 0 deletions
@@ -0,0 +1,80 @@
1+
2+
These are some notes on the installation of nxvalidate on Ubuntu and Windows. On Windows, the installation of the XML2 library was not successful. It should be possible, but could not be reproduced yet.
3+
4+
5+
6+
# cnxvalidate installation on Ubuntu 22.04
7+
8+
These commands install nxvalidate on a fresh Ubuntu 22.04 system (tested with Linux running from a USB stick).
9+
10+
```
11+
sudo apt-get update
12+
sudo apt-get install git
13+
sudo apt-get install build-essential
14+
sudo add-apt-repository universe
15+
sudo apt-get install libhdf5-serial-dev
16+
sudo apt-get -y install pkg-config
17+
sudo apt upgrade -y
18+
sudo apt-get -y install cmake
19+
sudo apt-get install libxml2-dev
20+
21+
mkdir nexusvalidate
22+
cd nexusvalidate
23+
git clone https://github.com/nexusformat/cnxvalidate.git
24+
cd cnxvalidate/
25+
mkdir build
26+
cd build/
27+
cmake ../
28+
make
29+
```
30+
31+
# cnxvalidate installation on Windows
32+
33+
## -- CMAKE
34+
35+
[https://cmake.org/download/](<https://cmake.org/download/>)
36+
37+
\--> [cmake-3.30.2-windows-x86\_64.msi](<https://github.com/Kitware/CMake/releases/download/v3.30.2/cmake-3.30.2-windows-x86_64.msi>)
38+
39+
Install with .msi
40+
41+
## -- HDF5
42+
43+
Download **hdf5-1.14.4-2-win-vs2022\_cl.zip** from [https://www.hdfgroup.org/downloads/hdf5/](<https://www.hdfgroup.org/downloads/hdf5/>)
44+
45+
unzip the .zip file
46+
47+
put the file into the folder
48+
49+
```
50+
C:\hdf5
51+
```
52+
53+
(can be named differently, but no spaces are allowed for this path)
54+
55+
```
56+
set PATH=%PATH%;C:\your\path\here\
57+
```
58+
59+
## -- libiconv
60+
61+
[https://github.com/vovythevov/libiconv-cmake](<https://github.com/vovythevov/libiconv-cmake>)
62+
63+
```
64+
git clone https://github.com/vovythevov/libiconv-cmake
65+
```
66+
67+
cd to downloaded directory
68+
69+
```
70+
mkdir build
71+
cd build
72+
cmake ..
73+
```
74+
75+
## -- XML2
76+
77+
??? Unsolved...
78+
79+
Please create a GitHub issue [here](https://github.com/FAIRmat-NFDI/pynxtools/issues/new) if you manage to solve this.
80+
