Commit f55b3c3

Update the intro a bit
1 parent 29fbdbb commit f55b3c3

2 files changed: +11 -97 lines changed

docs/api.md

Lines changed: 2 additions & 0 deletions
@@ -1,3 +1,5 @@
+(sec_python_api)=
+
 # Python API
 
 This page documents the public Python API exposed by ``sc2ts``.

docs/intro.md

Lines changed: 9 additions & 97 deletions
@@ -6,105 +6,17 @@ at pandemic scale.
 It consists of:
 
 1. A CLI-driven method to infer ARGs from SARS-CoV-2 data.
-2. A lightweight wrapper around the :mod:`tskit` Python APIs, specialised
+2. A lightweight wrapper around the {mod}`tskit` Python APIs, specialised
 for the output of sc2ts and enabling efficient node metadata access.
-3. A lightweight wrapper around :mod:`zarr` for convenient access to the
+3. A lightweight wrapper around [Zarr](https://zarr.dev) for convenient access to the
 Viridian dataset (alignments and metadata) in VCF Zarr format.
 
-The underlying methods are described in the sc2ts pre-print:
-<https://www.biorxiv.org/content/10.1101/2023.06.08.544212v2>.
+The underlying methods are described in the sc2ts [preprint](
+<https://www.biorxiv.org/content/10.1101/2023.06.08.544212v2>).
 
-Most users will run sc2ts via the command line interface,
-which drives inference and postprocessing steps (see the
-{ref}`CLI documentation <sc2ts_sec_cli>`). The Python API is intended for
-working with tree sequences and datasets produced by sc2ts (see the
-{ref}`Python API reference <api>`).
+Most users will use the {ref}`sec_python_api` to perform {ref}`sec_arg_analysis`
+on the sc2ts inferred ARG or {ref}`sec_alignments_analysis` on the
+Zarr-formatted Viridian dataset distributed on Zenodo.
 
-For an overview and examples, see the project README and associated
-notebooks in the repository root.
-
-## Installation
-
-Install sc2ts from PyPI:
-
-```sh
-python -m pip install sc2ts
-```
-
-This installs the minimal requirements for the analysis and dataset APIs.
-To run inference from the command line, install the optional inference
-dependencies:
-
-```sh
-python -m pip install 'sc2ts[inference]'
-```
-
-## Quick start: ARG analysis
-
-To compute summary dataframes for nodes and mutations in an inferred ARG,
-you can load an sc2ts tree sequence and call the analysis helpers. For
-example, download the sc2ts paper ARG from Zenodo:
-
-```sh
-curl -O https://zenodo.org/records/17558489/files/sc2ts_viridian_v1.2.trees.tsz
-```
-
-and then:
-
-```python
-import sc2ts
-import tszip
-
-ts = tszip.load("sc2ts_viridian_v1.2.trees.tsz")
-df_node = sc2ts.node_data(ts)
-df_mutation = sc2ts.mutation_data(ts)
-```
-
-See the {ref}`Python API reference <api>` for full details of these
-functions.
-
-## Quick start: CLI inference
-
-To run inference locally using the example Viridian dataset and config:
-
-1. Install the inference extras (if you have not already):
-
-```sh
-python -m pip install 'sc2ts[inference]'
-```
-
-2. Download the Viridian dataset in VCF Zarr format:
-
-```sh
-curl -O https://zenodo.org/records/16314739/files/viridian_mafft_2024-10-14_v1.vcz.zip
-```
-
-3. Run primary inference using the CLI and the example config in this repo:
-
-```sh
-python -m sc2ts infer example_config.toml --stop=2020-02-02
-```
-
-This will produce a series of `.ts` files and a match database in the
-output directory specified by the config (see the README for details).
-
-4. Postprocess and generate an analysis-ready ARG:
-
-```sh
-python -m sc2ts postprocess -vv \
---match-db example_inference/ex1.matches.db \
-example_inference/ex1/ex1_2020-02-01.ts \
-example_inference/ex1_2020-02-01_pp.ts
-
-python -m sc2ts minimise-metadata \
--m strain sample_id \
--m Viridian_pangolin pango \
-example_inference/ex1_2020-02-01_pp.ts \
-example_inference/ex1_2020-02-01_pp_mm.ts
-```
-
-The file `example_inference/ex1_2020-02-01_pp_mm.ts` can then be used
-with the Python analysis APIs shown above.
-
-See the {ref}`CLI documentation <sc2ts_sec_cli>` for a complete listing of
-subcommands and options.
+Uses who wish to perform {ref}`sec_inference` use the
+{ref}`sc2ts_sec_cli`.