Docs rejig

jeromekelleher · jeromekelleher · commit 65efc3e65ac6 · 2024-05-14T09:45:55.000+01:00
diff --git a/docs/_toc.yml b/docs/_toc.yml
@@ -1,5 +1,6 @@
 format: jb-book
 root: intro
 chapters:
-- file: vcf2zarr_tutorial
+- file: installation
+- file: vcf2zarr
 - file: cli
diff --git a/docs/installation.md b/docs/installation.md
@@ -0,0 +1,22 @@
+# Installation
+
+
+```
+$ python3 -m pip install bio2zarr
+```
+
+This will install the programs ``vcf2zarr``, ``plink2zarr`` and ``vcf_partition``
+into your local Python path. You may need to update your $PATH to call the
+executables directly.
+
+Alternatively, calling
+```
+$ python3 -m bio2zarr vcf2zarr <args>
+```
+is equivalent to
+
+```
+$ vcf2zarr <args>
+```
+and will always work.
+
diff --git a/docs/intro.md b/docs/intro.md
@@ -1,76 +1,9 @@
-# bio2zarr Documentation
+# bio2zarr
 
-`bio2zarr` efficiently converts common bioinformatics formats to 
-[Zarr](https://zarr.readthedocs.io/en/stable/) format. Initially supporting converting 
-VCF to the [sgkit vcf-zarr specification](https://github.com/pystatgen/vcf-zarr-spec/).
+`bio2zarr` efficiently converts common bioinformatics formats to
+[Zarr](https://zarr.readthedocs.io/en/stable/) format. It initially supports converting
+VCF to the [VCF Zarr specification](https://github.com/sgkit-dev/vcf-zarr-spec/).
 
-`bio2zarr` is in early alpha development, contributions, feedback and issues are welcome
+`bio2zarr` is in development; contributions, feedback and issues are welcome
 at the [GitHub repository](https://github.com/sgkit-dev/bio2zarr).
 
-## Installation
-`bio2zarr` can be installed from PyPI using pip:
-
-```bash
-$ python3 -m pip install bio2zarr
-```
-
-This will install the programs ``vcf2zarr``, ``plink2zarr`` and ``vcf_partition``
-into your local Python path. You may need to update your $PATH to call the 
-executables directly.
-
-Alternatively, calling 
-```
-$ python3 -m bio2zarr vcf2zarr <args>
-```
-is equivalent to 
-
-```
-$ vcf2zarr <args>
-```
-and will always work.
-
-## Basic vcf2zarr usage
-For modest VCF files (up to a few GB), a single command can be used to convert a VCF file
-(or set of VCF files) using the {ref}`convert<cmd-vcf2zarr-convert>` command:
-
-```bash
-$ vcf2zarr convert <VCF1> <VCF2> ... <VCFN> <zarr>
-```
-
-For larger files a multi-step process is recommended. 
-
-
-First, convert the VCF into the intermediate format:
-
-```bash
-$ vcf2zarr explode tests/data/vcf/sample.vcf.gz tmp/sample.exploded
-```
-
-Then, (optionally) inspect this representation to get a feel for your dataset
-```bash
-$ vcf2zarr inspect tmp/sample.exploded
-```
-
-Then, (optionally) generate a conversion schema to describe the corresponding
-Zarr arrays:
-
-```bash
-$ vcf2zarr mkschema tmp/sample.exploded > sample.schema.json
-```
-
-View and edit the schema, deleting any columns you don't want, or tweaking 
-dtypes and compression settings to your taste.
-
-Finally, encode to Zarr:
-```bash
-$ vcf2zarr encode tmp/sample.exploded tmp/sample.zarr -s sample.schema.json
-```
-
-Use the ``-p, --worker-processes`` argument to control the number of workers used
-in the ``explode`` and ``encode`` phases.
-
-
-
-
-```{tableofcontents}
-```
diff --git a/docs/vcf2zarr.md b/docs/vcf2zarr.md
@@ -9,15 +9,19 @@ kernelspec:
   language: bash
   name: bash
 ---
-# Vcf2zarr tutorial
+# vcf2zarr
+
+
+
+## Tutorial
 
 This is a step-by-step tutorial showing you how to convert your
 VCF data into Zarr format. There's three different ways to
 convert your data, basically providing different levels of
 convenience and flexibility corresponding to what you might
 need for small, intermediate and large datasets.
 
-## Small
+### Small
 
 <!-- ```{code-cell} bash -->
 <!-- vcf2zarr convert ../tests/data/vcf/sample.vcf.gz sample.zarr -vf -->
@@ -32,6 +36,6 @@ need for small, intermediate and large datasets.
  });
  </script>
 
-## Intermediate
+### Intermediate
 
-## Large
+### Large