Commit 6dd3c7d

Merge pull request #204 from jeromekelleher/cli-doc-updates
Cli doc updates
2 parents 9983e73 + 8106d40 commit 6dd3c7d

6 files changed, with 84 additions and 11 deletions

docs/installation.md

Lines changed: 22 additions & 8 deletions
Original file line number | Diff line number | Diff line change
@@ -1,31 +1,45 @@
1+
(sec-installation)=
12
# Installation
23

34

4-
```
5-
$ python3 -m pip install bio2zarr
5+
```bash
6+
python3 -m pip install bio2zarr
67
```
78

8-
This will install the programs ``vcf2zarr``, ``plink2zarr`` and ``vcf_partition``
9+
This will install the programs ``vcf2zarr`` and ``vcf_partition``
910
into your local Python path. You may need to update your $PATH to call the
1011
executables directly.
1112

1213
Alternatively, calling
13-
```
14-
$ python3 -m bio2zarr vcf2zarr <args>
14+
```bash
15+
python3 -m bio2zarr vcf2zarr <args>
1516
```
1617
is equivalent to
1718

18-
```
19-
$ vcf2zarr <args>
19+
```bash
20+
vcf2zarr <args>
2021
```
2122
and will always work.
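
As a minimal sketch of the equivalence described above, the wrapper below uses the `vcf2zarr` executable when it is on `$PATH` and otherwise falls back to the module form. The function name `run_vcf2zarr` is hypothetical and not part of bio2zarr.

```bash
# Hypothetical helper (not part of bio2zarr): prefer the installed
# executable, otherwise fall back to the "python3 -m" form.
run_vcf2zarr() {
    if command -v vcf2zarr >/dev/null 2>&1; then
        vcf2zarr "$@"
    else
        python3 -m bio2zarr vcf2zarr "$@"
    fi
}
```

Calling `run_vcf2zarr <args>` then behaves like `vcf2zarr <args>` whether or not the executables directory is on `$PATH`.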
2223

24+
:::{note}
25+
The ``python3 -m bio2zarr vcf2zarr`` form may be replaced with
26+
``python3 -m bio2zarr.vcf2zarr`` in the near future.
27+
See GitHub issue [203](https://github.com/sgkit-dev/bio2zarr/issues/203).
28+
:::
29+
30+
31+
:::{warning}
32+
Windows is not currently supported. Please comment on
33+
[this issue](https://github.com/sgkit-dev/bio2zarr/issues/174) if you would
34+
like to see Windows support for bio2zarr.
35+
:::
36+
2337

2438
## Shell completion
2539

2640
To enable shell completion for a particular session in Bash do:
2741

28-
```
42+
```bash
2943
eval "$(_VCF2ZARR_COMPLETE=bash_source vcf2zarr)"
3044
```
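
The `eval` above only affects the current session. One way to make completion persistent, sketched here on the assumption that `vcf2zarr` uses Click's standard completion mechanism, is to write the generated script to a file once and source it from `~/.bashrc`. The helper name and the default file name are hypothetical.

```bash
# Hypothetical helper: save the completion script so new sessions can
# source it instead of re-running the eval each time.
save_vcf2zarr_completion() {
    # Assumed default location; pass a different path as the first argument.
    local target="${1:-$HOME/.vcf2zarr-complete.bash}"
    _VCF2ZARR_COMPLETE=bash_source vcf2zarr > "$target"
}
```

After running `save_vcf2zarr_completion`, adding `. ~/.vcf2zarr-complete.bash` to `~/.bashrc` would load completion in every new Bash session.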
3145

docs/intro.md

Lines changed: 20 additions & 2 deletions
@@ -1,9 +1,27 @@
11
# bio2zarr
22

33
`bio2zarr` efficiently converts common bioinformatics formats to
4-
[Zarr](https://zarr.readthedocs.io/en/stable/) format. Initially supporting converting
5-
VCF to the [VCF Zarr specification](https://github.com/sgkit-dev/vcf-zarr-spec/).
4+
[Zarr](https://zarr.readthedocs.io/en/stable/) format.
5+
6+
## Tools
7+
8+
- {ref}`sec-vcf2zarr` converts VCF data to
9+
[VCF Zarr](https://github.com/sgkit-dev/vcf-zarr-spec/) format.
10+
11+
- {ref}`sec-vcfpartition` is a utility to split an input (set of)
12+
VCFs into a given number of partitions. This is useful for
13+
parallel processing.
14+
15+
## Development status
616

717
`bio2zarr` is in development; contributions, feedback and issues are welcome
818
at the [GitHub repository](https://github.com/sgkit-dev/bio2zarr).
919

20+
Support for converting PLINK data to VCF Zarr is partially implemented,
21+
and adding BGEN support is also planned. If you would like to see
22+
support for other formats (or are interested in helping to implement them),
23+
please open an [issue on GitHub](https://github.com/sgkit-dev/bio2zarr/issues)
24+
to discuss!
25+
26+
The package is currently focused on command line interfaces, but a
27+
Python API is also planned.

docs/vcf2zarr/cli_ref.md

Lines changed: 2 additions & 0 deletions
@@ -1,3 +1,4 @@
1+
(sec-vcf2zarr-cli-ref)=
12
# CLI Reference
23

34
% A note on cross references... There's some weird long-standing problem with
@@ -57,6 +58,7 @@
5758
## Encode
5859

5960
```{eval-rst}
61+
.. _cmd-vcf2zarr-encode:
6062
.. click:: bio2zarr.cli:encode
6163
:prog: vcf2zarr encode
6264
:nested: full

docs/vcf2zarr/overview.md

Lines changed: 38 additions & 1 deletion
@@ -1,7 +1,44 @@
1+
(sec-vcf2zarr)=
12
# vcf2zarr
23

4+
Convert VCF data to the
5+
[VCF Zarr specification](https://github.com/sgkit-dev/vcf-zarr-spec/)
6+
reliably, in parallel or distributed over a cluster.
37

4-
Convert a VCF to zarr format:
8+
See the {ref}`sec-vcf2zarr-tutorial` for a step-by-step introduction
9+
and the {ref}`sec-vcf2zarr-cli-ref` for detailed documentation on
10+
command line options.
11+
12+
13+
## Quickstart
14+
15+
First {ref}`install bio2zarr<sec-installation>`.
16+
17+
18+
:::{note}
19+
FINISH ME
20+
:::
21+
22+
23+
## How does it work?
24+
The conversion of VCF data to Zarr is a two-step process:
25+
26+
1. Convert ({ref}`explode<cmd-vcf2zarr-explode>`) VCF file(s) to
27+
Intermediate Columnar Format (ICF)
28+
2. Convert ({ref}`encode<cmd-vcf2zarr-encode>`) ICF to Zarr
29+
30+
This two-step process allows `vcf2zarr` to determine the correct
31+
dimension of Zarr arrays corresponding to each VCF field, and
32+
to keep memory usage tightly bounded while writing the arrays.
33+
34+
:::{important}
35+
The intermediate columnar format is not intended for any use
36+
other than temporary storage while converting VCF to Zarr.
37+
The format may change between versions of `bio2zarr`.
38+
:::
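
The two-step process described above can be sketched as a small shell function. This is an illustration only: the function name and the three paths are placeholders, and the positional argument order for `explode` (VCF, then ICF path) and `encode` (ICF path, then Zarr path) is an assumption that should be checked against the CLI reference.

```bash
# Hypothetical wrapper: run explode, then encode only if explode succeeded.
vcf_to_zarr_two_step() {
    local vcf="$1" icf="$2" zarr="$3"
    # Step 1: VCF -> Intermediate Columnar Format (assumed argument order).
    vcf2zarr explode "$vcf" "$icf" &&
    # Step 2: ICF -> Zarr (assumed argument order).
    vcf2zarr encode "$icf" "$zarr"
}
```

For example, `vcf_to_zarr_two_step sample.vcf.gz sample.icf sample.zarr` would leave the ICF directory in place afterwards. Because it is only temporary storage, it can be deleted once the Zarr output has been checked.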
39+
40+
41+
## Common options
542

643
```
744
$ vcf2zarr convert <VCF1> <VCF2> <zarr>

docs/vcf2zarr/tutorial.md

Lines changed: 1 addition & 0 deletions
@@ -9,6 +9,7 @@ kernelspec:
99
language: bash
1010
name: bash
1111
---
12+
(sec-vcf2zarr-tutorial)=
1213
# Tutorial
1314

1415
This is a step-by-step tutorial showing you how to convert your

docs/vcfpartition/overview.md

Lines changed: 1 addition & 0 deletions
@@ -1,3 +1,4 @@
1+
(sec-vcfpartition)=
12
# vcfpartition
23

34
## Overview
