README.md: 16 additions & 6 deletions
@@ -40,16 +40,17 @@ pip install .
### Usage
-Currently, you can use the `extract` function from the `beam` module inside your own Python code:
+#### As a Python module
+To extract data from a file, you can use the `extract` function from the `beam` module inside your own Python code:
```python
-from beam import extract
+from datatractor_beam import extract
# extract(<input_path>, <input_type>)
data = extract("./example.mpr", "biologic-mpr")
```
-This example will install the first compatible `biologic-mpr` extractor it finds in the registry into a fresh virtualenv (under `./beam-venvs`), and then execute it on the file at `example.mpr`.
+This example will install the first extractor that is compatible with the `biologic-mpr` filetype that it finds in the registry. It will be installed into a fresh virtualenv (under `./beam-venvs`), and then executed on the file at `example.mpr`.
By default, the `extract` function will attempt to use the extractor's Python-based invocation (i.e. the optional `preferred_mode="python"` argument is specified). This means the extractor will be executed from within Python, and the returned `data` object will be a Python object as defined (and supported) by the extractor. This may require additional packages to be installed, for example `pandas` or `xarray`, which are both supported via the installation command `pip install .[formats]` above. If you encounter the following traceback, a missing "format" (such as `xarray` here) is the likely reason:
@@ -63,16 +64,25 @@ ModuleNotFoundError: No module named 'xarray'
Alternatively, if the `preferred_mode="cli"` argument is specified, the extractor will be executed using its command-line invocation. This means the output of the extractor will most likely be a file, which can be further specified using the `output_type` argument:
```python
-from beam import extract
+from datatractor_beam import extract
ret = extract("example.mpr", "biologic-mpr", output_path="output.nc", preferred_mode="cli")
```
In this case, `ret` will be empty bytes, and the output of the extractor should appear in the `output.nc` file.
-Finally, `beam` can also be executed from the command line, implying `preferred_mode="cli"`. The command line invocation equivalent to the above Python syntax is:
+#### As a command line utility
+
+The `datatractor` utility supports the following subcommands:
+
+- `beam`: used to extract data from an input file of a known file type,
+- `probe`: used to search the registry for extractors that match a known file type,
+- `yard`: used to fetch the definition of an extractor from the registry, and
+- `install`: used to install an extractor.
+
+In particular, the `extract()` functionality discussed above can also be executed from the command line, implying `preferred_mode="cli"`. The command line invocation equivalent to the above Python syntax is:
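
A note on the `preferred_mode` discussion in this diff: the README says a missing "format" package such as `xarray` is the likely cause of a `ModuleNotFoundError` in Python mode. The sketch below shows one way a caller might check for those optional packages up front. The helpers `missing_formats` and `choose_mode` are hypothetical and not part of `datatractor_beam`; the package list and the fall-back-to-`"cli"` policy are assumptions made for illustration only.

```python
import importlib.util


def missing_formats(packages=("pandas", "xarray")):
    """Return the top-level package names in `packages` that are not importable.

    Hypothetical helper: the default list mirrors the optional "formats"
    packages named in the README, but is an assumption, not an official list.
    """
    return [name for name in packages if importlib.util.find_spec(name) is None]


def choose_mode(packages=("pandas", "xarray")):
    """Suggest "python" when every optional package is present, else "cli".

    This fallback policy is an illustrative assumption, not library behaviour.
    """
    return "cli" if missing_formats(packages) else "python"
```

Under these assumptions, the result could be passed on as `preferred_mode=choose_mode()` in the `extract` calls shown in the diff, bearing in mind that the README describes CLI mode as most likely writing its output to a file.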