
Commit 4e4aca8

Fixed PyPi rendering
1 parent fc459ca commit 4e4aca8

4 files changed

+528
-259
lines changed

CITATION.cff

Lines changed: 1 addition & 1 deletion
@@ -26,4 +26,4 @@ license: BSD-2-Clause
2626
message: If you use this software, please cite it using these metadata.
2727
repository-code: https://github.com/FAIRDataPipeline/FAIR-CLI/
2828
title: "The FAIR Data Pipeline command line tool"
29-
version: 0.2.1
29+
version: 0.2.2

README.md

Lines changed: 25 additions & 257 deletions
@@ -1,100 +1,19 @@
11
# FAIR Data Pipeline Command Line Interface
22

3-
[![FAIR Data Pipeline CLI](https://github.com/FAIRDataPipeline/FAIR-CLI/actions/workflows/fair-cli.yaml/badge.svg?branch=dev)](https://github.com/FAIRDataPipeline/FAIR-CLI/actions/workflows/fair-cli.yaml)
4-
[![codecov](https://codecov.io/gh/FAIRDataPipeline/FAIR-CLI/branch/dev/graph/badge.svg?token=h93TkTiiWf)](https://codecov.io/gh/FAIRDataPipeline/FAIR-CLI)
5-
[![Quality Gate Status](https://sonarcloud.io/api/project_badges/measure?project=FAIRDataPipeline_FAIR-CLI&metric=alert_status)](https://sonarcloud.io/dashboard?id=FAIRDataPipeline_FAIR-CLI)
6-
7-
| **DISCLAIMER:** |
8-
| ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
9-
| The following document is largely conceptual and therefore does *not* represent a manual for the final interface. Statements within the following are likely to change, further details of possible changes are given throughout. Please either open an issue or pull request on the [source repository](https://github.com/FAIRDataPipeline/FAIR-CLI) raising any changes/issues. |
10-
11-
FAIR-CLI forms the main interface for synchronising changes between your local and shared remote FAIR Data Pipeline registries, it is also used to instantiate model runs/data submissions to the pipeline.
12-
13-
The project is still under development with many features still to be implemented and checked. Available commands are summarised below along with their usage.
3+
FAIR-CLI forms the main interface for synchronising changes between local and shared remote FAIR Data Pipeline registries, it is also used to instantiate model runs/data submissions to the pipeline. Full documentation of the FAIR Data Pipeline can be found on the project [website](https://www.fairdatapipeline.org/).
144

155
## Installation
166

17-
The project makes use of [Poetry](https://python-poetry.org/) for development which allows quick and easy mangement of dependencies, and provides a virtual environment exclusive to the project. Ultimately the project will be built into a pip installable module (using `poetry build`) meaning users will not need Poetry. You can access this environment by installing poetry:
18-
19-
```sh
20-
pip install poetry
21-
```
22-
23-
and, ensuring you are in the project repository, running:
24-
25-
```sh
26-
poetry install
27-
```
28-
29-
which will setup the virtual environment and install requirements. You can then either launch the environment as a shell using:
30-
31-
```sh
32-
poetry shell
33-
```
34-
35-
or run commands within it externally using:
7+
The package is installed using Pip:
368

379
```sh
38-
poetry run <command>
10+
pip install fair-cli
3911
```
4012

41-
## Structure
42-
43-
The layout of FAIR-CLI on a simplified system looks like this:
44-
45-
```sh
46-
$HOME
47-
├── .fair
48-
│ ├── cli
49-
│ │ ├── cli-config.yaml
50-
│ │ └── sessions
51-
│ ├── data
52-
│ │ └── jobs
53-
│ └── $REGISTRY_HOME
54-
55-
└─ Documents
56-
└─ my_project
57-
├── config.yaml
58-
└── .fair
59-
├── cli-config.yaml
60-
├── logs
61-
└── staging
62-
```
63-
64-
### Global and Local Directories
65-
66-
FAIR-CLI stores information for projects in two locations. The first is a *global* directory stored in the user's home folder in the same location as the registry itself `$HOME/.fair/cli`, and the second is a *local* directory which exists within the model project itself `$PROJECT_HOME/.fair`.
67-
68-
The CLI holds metadata for the user in it's own configuration file (not to be confused with the user modifiable `config.yaml`), `cli-config.yaml`, the *global* version of which is initialised during first use. In a manner similar to `git`, FAIR-CLI has repositories which allow the user to override these *global* configurations, this then forming a *local* variant.
69-
70-
### Data Directory
71-
72-
The directory `$HOME/.fair/data` is the default data store initialised by FAIR-CLI. During setup an alternative can be provided and this can be later changed on a per-run basis if the user so desires. The subdirectory `$HOME/data/jobs` contains timestamped directories of jobs.
73-
74-
### Sessions Directory
75-
76-
The directory `$HOME/.fair/sessions` is used to keep track of ongoing queries to the registry as a safety mechanism to ensure the registry is not shutdown whilst processes are still occuring.
77-
78-
### Logs Directory
79-
80-
The directory `$PROJECT/.fair/logs` stores `stdout` logs for jobs also giving information on who launched the job and how long it lasted.
81-
82-
### Staging File
83-
84-
The staging file, `$PROJECT/.fair/staging`, contains information of what jobs are being tracked, by default all jobs are added to this file after completion and are set to "unstaged". Simply contains a dictionary of booleans where items for sync (staged) are marked true `True` and those to be held only locally `False`. The file uses paths relative to the *local* `.fair` folder as keys, to behave in a manner identical to `git` staging.
85-
86-
### `config.yaml`
13+
## The User Configuration File
14+
Job runs are configured via `config.yaml` files. Upon initialisation of a project, FAIR-CLI automatically generates a starter configuration file with all requirements in place. To execute a process (e.g. perform a model run from a compiled binary/script) an additional key of either `script` or `script_path` must be provided. Alternatively the command `fair run bash` can be used to append the key and run a command directly.
8715

88-
This is the main file the user will interact with to customise their run. FAIR-CLI automatically generates a starter version of this file with everything in place. The only addition required is setting of either `script` or `script_path` (with the exception of running using `fair run bash` - see [below](#run)) under `run_metadata`.
89-
| |
90-
| ---------------------------------------------------------------------------------------------------------------------------------------------------- |
91-
| **`script`** |
92-
| This should be a command callable by a shell for running a model/submitting data to the registry. This script is saved to a file prior to execution. |
93-
| |
94-
| **`script_path`** |
95-
| This is a direct path to an existing script to use for submission. |
96-
97-
By default the shell used will be `sh` or `pwsh` for UNIX and Windows systems respectively, however this can be overwritten with the optional `shell` key which recognises the following values (where `{0}` is the script file):
16+
By default the shell used to execute a process is `sh` or `pwsh` for UNIX and Windows systems respectively. This can be overwritten by assigning the optional `shell` key with one of the following values (where `{0}` is the script file):
9817

9918
| **Shell** | **Command** |
10019
| ------------ | ------------------------------- |
@@ -109,90 +28,21 @@ By default the shell used will be `sh` or `pwsh` for UNIX and Windows systems re
10928
| `R` | `R -f {0}` |
11029
| `sh` | `sh -e {0}` |
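
As an illustration of how a `{0}` placeholder in the table above could be expanded into an executable command, here is a minimal Python sketch. It is not FAIR-CLI's implementation: the names `SHELL_TEMPLATES` and `build_command` are hypothetical, and only the two rows visible in this hunk (`R` and `sh`) are included.

```python
import shlex
from typing import List

# Hypothetical lookup table: only the `R` and `sh` rows shown in this hunk.
SHELL_TEMPLATES = {
    "R": "R -f {0}",
    "sh": "sh -e {0}",
}


def build_command(shell: str, script_file: str) -> List[str]:
    """Expand the `{0}` placeholder for `shell` into an argument list."""
    template = SHELL_TEMPLATES[shell]
    # Split the template first, then substitute, so that a script path
    # containing spaces stays a single argument.
    return [part.format(script_file) for part in shlex.split(template)]


command = build_command("sh", "my script.sh")
print(command)
```

Splitting before substitution is a design choice made for this sketch; the resulting list could be passed to `subprocess.run` without invoking an intermediate shell.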
11130

112-
| **NOTE** |
113-
| --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
114-
| This layout is subject to possible change depending on whether or not multiple aliases for the same user will be allowed in the registry itself. The main reason for having a *local* version is to support separate handling of multiple projects. |
115-
116-
## Registry Interaction
31+
A full description of `config.yaml` files can be found [here](https://www.fairdatapipeline.org/docs/interface/config/).
11732

118-
Currently `FAIR-CLI` sets up the write data storage location on the local registry if it does not exist. Entries are created for the YAML file type, current user as an author, and object for a given run.
11933

120-
## Command Line Usage
121-
122-
As mentioned, all of the subcommands within FAIR-CLI are still under review with many still serving as placeholders for future features. Running `fair` without arguments or `fair --help` will show all of these.
34+
## Available Commands
12335

12436
### `init`
12537

126-
Initialises a new FAIR repository within the given directory. This should ideally be the same location as the `.git` folder for the current project, although setup will ask if you want to use an alternative location. The command will ask the user a series of questions which will provide metadata for tracking run authors, and also allow for the creation of a starter `config.yaml`.
127-
128-
The first time this command is launched the *global* CLI configuration will be populated. In subsequent calls the *global* will provide default suggestions towards creating the CLI configuration for the repository (*local*).
38+
Initialises a new FAIR repository within the given directory. This should ideally be the same location as the `.git` folder for the current project, however during setup an option is given to specify an alternative. The command will ask the user a series of questions which will provide metadata for tracking run authors, and also allow for the creation of a starter `config.yaml` file. Initialisation will also configure the CLI itself.
12939

130-
A repository directory matching the structure above will be placed in the current location and a starter `config.yaml` file will be generated (see below).
131-
132-
#### Example: First call to `fair init`
133-
134-
This example shows the process of setting up for the first time. Note the default suggestions for each prompt, in the case of `Full name` and `Default output namespace` this is the hostname of the system and an abbreviated version of this name.
135-
136-
```sh
137-
$ fair init
138-
Initialising FAIR repository, setup will now ask for basic info:
139-
140-
Checking for local registry
141-
Local registry found
142-
Remote Data Storage Root [http://data.scrc.uk/data/]:
143-
Remote API Token File: $HOME/scrc_token.txt
144-
Local API URL [http://localhost:8000/api/]:
145-
Local registry is offline, would you like to start it? [y/N]: y
146-
Default Data Store: [/home/joebloggs/.fair/data]:
147-
Email: jbloggs@noreply.uk
148-
ORCID [None]:
149-
Full Name: Joe Bloggs
150-
Default input namespace [None]: SCRC
151-
Default output namespace [jbloggs]:
152-
Project description: Test project
153-
Local Git repository [/home/joebloggs/Documents/AnalysisProject]:
154-
Git remote name [origin]:
155-
Using git repository remote 'origin': git@notagit.com:jbloggs/AnalysisProject.git
156-
Initialised empty fair repository in /home/joebloggs/Documents/AnalysisProject/.fair
40+
#### Custom CLI Configuration
41+
After setup is complete, the current CLI configuration can also be saved using the command:
15742
```
158-
159-
#### Example: Subsequent runs
160-
161-
In subsequent runs the first time setup will provide further defaults.
162-
163-
```sh
164-
$ fair init
165-
Initialising FAIR repository, setup will now ask for basic info:
166-
167-
Project description: Test Project
168-
Local Git repository [/home/joebloggs/Documents/AnalysisProject]:
169-
Git remote name [origin]:
170-
Using git repository remote 'origin': git@nogit.com:joebloggs/AnalysisProject.git
171-
Remote API URL [http://data.scrc.uk/api/]:
172-
Remote API Token File [/home/kristian/scrc_token.txt]:
173-
Local API URL [http://localhost:8000/api/]:
174-
Default output namespace [jbloggs]:
175-
Default input namespace [SCRC]:
176-
Initialised empty fair repository in /home/joebloggs/Documents/AnalysisProject/.fair
177-
```
178-
179-
#### Generated `config.yaml`
180-
181-
```yaml
182-
run_metadata:
183-
default_input_namespace: SCRC
184-
default_output_namespace: jbloggs
185-
description: Test Project
186-
local_data_registry: http://localhost:8000/api/
187-
local_repo: /home/joebloggs/Documents/AnalysisProject
188-
write_data_store: /home/joebloggs/.fair/data/
43+
fair init --export
18944
```
190-
191-
the user then only needs to add a `script` or `script_path` entry to execute a code run. This is only required for `run`.
192-
193-
#### Advanced usage
194-
195-
CLI configuration can be read directly from a file which should contain the following:
45+
the created file can then be re-read at a later point during setup. Alternatively, if creating a configuration from scratch the YAML file should contain the following information:
19646

19747
```yaml
19848
namespaces:
@@ -218,24 +68,24 @@ git:
21868
remote: origin
21969
description: Testing Project
22070
```
221-
222-
this file is then read during initialisation:
71+
this file is then read during the initialisation:
22372
22473
```sh
22574
fair init --using <cli-config.yaml file>
22675
```
22776

228-
For the purposes of CI runs, the initialisation can be "skipped" by running:
77+
For integration into a CI workflow, the setup can be skipped by running:
22978

23079
```sh
23180
fair init --ci
23281
```
23382

234-
which will create temporary directories for some locations.
83+
which will create temporary directories for some of the required location paths.
84+
23585

23686
### `run`
23787

238-
The purpose of `run` is to execute a model/submission run to the local registry. The command fills any specified template variables of the form `${{ VAR }}` to match those outlined [below](#template-variables). Outputs of a run will be stored within the `coderun` folder in the directory specified under the `data_store` tag in the `config.yaml`, by default this is `$HOME/.fair/data/coderun`.
88+
The purpose of `run` is to execute a model/submission run and submit results to the local registry. Outputs of a run will be stored within the `coderun` folder in the directory specified under the `data_store` tag in the `config.yaml`, by default this is `$HOME/.fair/data/coderun`.
23989

24090
```sh
24191
fair run
@@ -247,7 +97,7 @@ If you wish to use an alternative `config.yaml` then specify it as an additional
24797
fair run /path/to/config.yaml
24898
```
24999

250-
You can also launch a bash command directly which will then be automatically written into the `config.yaml` for you:
100+
You can also launch a bash command directly, this will be automatically written into the `config.yaml`:
251101

252102
```sh
253103
fair run --script "echo \"Hello World\""
@@ -257,84 +107,23 @@ note the command itself must be quoted as it is a single argument.
257107

258108
### `pull`
259109

260-
Currently `pull` will update any entries within the `config.yaml` under the `register` heading creating `external_object` and `data_product` objects on the registry and downloading the data to the local data storage. For example:
261-
262-
```yaml
263-
run_metadata:
264-
default_input_namespace: SCRC
265-
default_output_namespace: jbloggs
266-
description: Test project
267-
local_data_registry: http://localhost:8000/api/
268-
local_repo: /home/joebloggs/Documents/SCRC/FAIR-CLI
269-
write_data_store: /home/joebloggs/.fair/data/
270-
register:
271-
- external_object: records/SARS-CoV-2/scotland/human-mortality
272-
namespace_name: Scottish Government Open Data Repository
273-
namespace_full_name: Scottish Government Open Data Repository
274-
namespace_website: https://statistics.gov.scot/
275-
root: https://statistics.gov.scot/sparql.csv?query=
276-
path: |-
277-
PREFIX qb: <http://purl.org/linked-data/cube#>
278-
PREFIX data: <http://statistics.gov.scot/data/>
279-
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
280-
PREFIX dim: <http://purl.org/linked-data/sdmx/2009/dimension#>
281-
PREFIX sdim: <http://statistics.gov.scot/def/dimension/>
282-
PREFIX stat: <http://statistics.data.gov.uk/def/statistical-entity#>
283-
PREFIX mp: <http://statistics.gov.scot/def/measure-properties/>
284-
SELECT ?featurecode ?featurename ?areatypename ?date ?cause ?location ?gender ?age ?type ?count
285-
WHERE {
286-
?indicator qb:dataSet data:deaths-involving-coronavirus-covid-19;
287-
mp:count ?count;
288-
qb:measureType ?measType;
289-
sdim:age ?value;
290-
sdim:causeOfDeath ?causeDeath;
291-
sdim:locationOfDeath ?locDeath;
292-
sdim:sex ?sex;
293-
dim:refArea ?featurecode;
294-
dim:refPeriod ?period.
295-
296-
?measType rdfs:label ?type.
297-
?value rdfs:label ?age.
298-
?causeDeath rdfs:label ?cause.
299-
?locDeath rdfs:label ?location.
300-
?sex rdfs:label ?gender.
301-
?featurecode stat:code ?areatype;
302-
rdfs:label ?featurename.
303-
?areatype rdfs:label ?areatypename.
304-
?period rdfs:label ?date.
305-
}
306-
title: Deaths involving COVID19
307-
description: Nice description of the dataset
308-
unique_name: Scottish deaths involving COVID19
309-
file_type: csv
310-
release_date: ${{DATETIME}}
311-
version: 0.${{DATE}}.0
312-
primary: True
313-
```
314-
315-
if run on `10/10/2021` would download the data from the given `root`/`path` URL and store in a file:
316-
317-
```sh
318-
/home/joebloggs/.fair/data/records/SARS-CoV-2/scotland/human-mortality/0.20211010.0.csv
319-
```
320-
321-
and register all required objects into the local registry.
110+
Currently `pull` will update any entries within the `config.yaml` under the `register` heading creating `external_object` and `data_product` objects on the registry and downloading the data to the local data storage. Any data required for a run is downloaded and stored within the local registry.
322111

323112
### `purge`
324113

325-
Removes the local `.fair` (FAIR repository) folder by default so the user can reinitialise:
114+
The `purge` command removes setup of the current project so it can be reinitialised:
326115

327116
```sh
328117
fair purge
329118
```
330119

331-
You can remove the global configuration and start again entirely by running:
120+
To remove all configurations entirely (including those global to all projects) run:
332121

333122
```sh
334123
fair purge --global
335124
```
336125

337-
and also the data directory by running:
126+
Finally the data directory itself can be removed by running:
338127

339128
```sh
340129
fair purge --data
@@ -386,7 +175,7 @@ Date: Wed Jun 30 09:09:30 2021
386175
387176
| **NOTE** |
388177
| ----------------------------------------------------------------------------------------------------------------------------------- |
389-
| The SHA for a job is *not* yet related to a registry code run identifier as multiple code runs can be executed within a single job. |
178+
| The SHA for a job is *not* related to a registry code run identifier as multiple code runs can be executed within a single job. |
390179
391180
### `view`
392181

@@ -396,28 +185,7 @@ To view the `stdout` of a run given its SHA as shown by running `fair log` use t
396185
fair view <sha>
397186
```
398187

399-
you do not need to specify the full SHA but rather the first few characters:
400-
401-
```text
402-
--------------------------------
403-
Commenced = Wed Jun 30 09:09:30 2021
404-
Author = Joe Bloggs <jbloggs@noreply.uk>
405-
Namespace = jbloggs
406-
Command = bash -eo pipefail /home/jbloggs/.fair/data/coderun/2021-06-30_09_09_30_721358/script.sh
407-
--------------------------------
408-
0
409-
1
410-
2
411-
3
412-
4
413-
5
414-
6
415-
7
416-
8
417-
9
418-
10
419-
------- time taken 0:00:00.011910 -------
420-
```
188+
you do not need to specify the full SHA but rather the first few unique characters.
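
The idea of accepting "the first few unique characters" can be sketched as a prefix lookup that succeeds only when exactly one known SHA matches. This is an assumption about the general technique, not FAIR-CLI's own code; `resolve_sha` and the example SHAs are hypothetical.

```python
from typing import Iterable


def resolve_sha(prefix: str, known_shas: Iterable[str]) -> str:
    """Return the single SHA starting with `prefix`, or raise ValueError."""
    matches = [sha for sha in known_shas if sha.startswith(prefix)]
    if not matches:
        raise ValueError(f"no job found for SHA prefix '{prefix}'")
    if len(matches) > 1:
        # The prefix is too short to identify one job.
        raise ValueError(f"SHA prefix '{prefix}' is ambiguous")
    return matches[0]


# Made-up SHAs for illustration only.
jobs = ["ab12cd34ef", "ab99ff0012", "ff00aa1122"]
print(resolve_sha("ab1", jobs))
```

A prefix such as `ab` would be rejected here because it matches two entries, which is why the characters supplied need to be unique.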
421189

422190
## Template Variables
423191
