
Commit 29fbdbb

Port remaining content out of readme, update URLs
1 parent 8ff1582 commit 29fbdbb

5 files changed: 105 additions and 101 deletions


README.md

Lines changed: 4 additions & 101 deletions
@@ -11,105 +11,8 @@ access.
 convenient and efficient access to the full Viridian dataset (alignments and metadata)
 in a single file using the [VCF Zarr specification](https://doi.org/10.1093/gigascience/giaf049).
 
-Please see the [preprint](https://www.biorxiv.org/content/10.1101/2023.06.08.544212v2)
-for details.
-
-## Installation
-
-Install sc2ts from PyPI:
-
-```
-python -m pip install sc2ts
-```
-
-This installs the minimum requirement to enable the
-[ARG analysis](#ARG-analysis-API) and [Dataset](#Dataset-API)s.
-To run [inference](#inference), you must install some extra
-dependencies using the 'inference' optional extra:
-
-```
-python -m pip install sc2ts[inference]
-```
-
-## ARG analysis API
-
-The sc2ts API provides two convenience functions to compute summary
-dataframes for the nodes and mutations in a sc2ts-output ARG.
-
-To see some examples, first download the (31MB) sc2ts inferred ARG
-from [Zenodo](https://zenodo.org/records/17558489/):
-
-```
-curl -O https://zenodo.org/records/17558489/files/sc2ts_viridian_v1.2.trees.tsz
-```
-
-We can then use these like
-
-```python
-import sc2ts
-import tszip
-
-ts = tszip.load("sc2ts_viridian_v1.2.trees.tsz")
-
-df_node = sc2ts.node_data(ts)
-df_mutation = sc2ts.mutation_data(ts)
-```
-
-See the [live demo](https://tskit.dev/explore/lab/index.html?path=sc2ts.ipynb)
-for a browser based interactive demo of using these dataframes for
-real-time pandemic-scale analysis.
-
-## Dataset API
-
-Sc2ts also provides a convenient API for accessing large-scale
-alignments and metadata stored in
-[VCF Zarr](https://doi.org/10.1093/gigascience/giaf049) format.
-
-Resources:
-
-- See this [notebook](https://github.com/jeromekelleher/sc2ts-paper/blob/main/notebooks/example_data_processing.ipynb)
-for an example in which we access the data variant-by-variant and
-which explains the low-level data encoding
-- See the [VCF Zarr publication](https://doi.org/10.1093/gigascience/giaf049)
-for more details on and benchmarks on this dataset.
-
-
-**TODO** Add some references to API documentation
-
-
-
-## Development
-
-To run the unit tests, use
-
-```
-python3 -m pytest
-```
-
-You may need to regenerate some cached test fixtures occasionaly (particularly
-if getting cryptic errors when running the test suite). To do this, run
-
-```
-rm -fR tests/data/cache/
-```
-
-and rerun tests as above.
-
-### Debug utilities
-
-The tree sequence files output during primary inference have a lot
-of debugging metadata, and there are some developer tools for inspecting
-this in the ``sc2ts.debug`` package. In particular, the ``ArgInfo``
-class has a lot of useful utilities designed to be used in a Jupyter
-notebook. Note that ``matplotlib`` is required for these. Use it like:
-
-```python
-import sc2ts.debug as sd
-import tskit
-
-ts = tskit.load("path_to_daily_inference.ts")
-ai = sd.ArgInfo(ts)
-ai # view summary in notebook
-```
-
+Please see the online [documentation](https://tskit.dev/sc2ts/docs) for details
+on the software
+and the [preprint](https://www.biorxiv.org/content/10.1101/2023.06.08.544212v2)
+for information on the method and the inferred ARG.

docs/_toc.yml

Lines changed: 3 additions & 0 deletions
@@ -13,3 +13,6 @@ parts:
 chapters:
 - file: cli
 - file: api
+- caption: Misc
+chapters:
+- file: development

docs/arg_analysis.md

Lines changed: 49 additions & 0 deletions
@@ -1,2 +1,51 @@
 (sec_arg_analysis)=
 # ARG analysis
+
+
+## ARG analysis API
+
+The sc2ts API provides two convenience functions to compute summary
+dataframes for the nodes and mutations in a sc2ts-output ARG.
+
+To see some examples, first download the (31MB) sc2ts inferred ARG
+from [Zenodo](https://zenodo.org/records/17558489/):
+
+```
+curl -O https://zenodo.org/records/17558489/files/sc2ts_viridian_v1.2.trees.tsz
+```
+
+We can then load the ARG and compute the two dataframes like this:
+
+```python
+import sc2ts
+import tszip
+
+ts = tszip.load("sc2ts_viridian_v1.2.trees.tsz")
+
+df_node = sc2ts.node_data(ts)
+df_mutation = sc2ts.mutation_data(ts)
+```
+
+See the [live demo](https://tskit.dev/explore/lab/index.html?path=sc2ts.ipynb)
+for a browser-based interactive demo of using these dataframes for
+real-time pandemic-scale analysis.
+
+## Dataset API
+
+Sc2ts also provides a convenient API for accessing large-scale
+alignments and metadata stored in
+[VCF Zarr](https://doi.org/10.1093/gigascience/giaf049) format.
+
+Resources:
+
+- See this [notebook](https://github.com/jeromekelleher/sc2ts-paper/blob/main/notebooks/example_data_processing.ipynb)
+for an example in which we access the data variant-by-variant and
+which explains the low-level data encoding.
+- See the [VCF Zarr publication](https://doi.org/10.1093/gigascience/giaf049)
+for more details and benchmarks on this dataset.
+
+
+**TODO** Add some references to API documentation
+
+
+
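The download and load steps added to docs/arg_analysis.md can also be done from Python. The following is only a rough sketch and is not part of sc2ts: `ensure_downloaded` and `load_summaries` are hypothetical helper names introduced for illustration. The URL, `tszip.load`, `sc2ts.node_data` and `sc2ts.mutation_data` are taken from the documentation text in this commit.

```python
import urllib.request
from pathlib import Path

# URL copied from the curl command in docs/arg_analysis.md.
ARG_URL = "https://zenodo.org/records/17558489/files/sc2ts_viridian_v1.2.trees.tsz"


def ensure_downloaded(url, directory="."):
    """Hypothetical helper: fetch url into directory unless the file already exists."""
    dest = Path(directory) / url.rsplit("/", 1)[-1]
    if not dest.exists():
        # Roughly equivalent to `curl -O url` run inside `directory`.
        urllib.request.urlretrieve(url, str(dest))
    return dest


def load_summaries(path):
    """Hypothetical helper: load the ARG and compute both summary dataframes."""
    # Imported lazily so the download helper can be used without sc2ts installed.
    import sc2ts
    import tszip

    ts = tszip.load(str(path))
    return sc2ts.node_data(ts), sc2ts.mutation_data(ts)
```

With sc2ts and tszip installed, `df_node, df_mutation = load_summaries(ensure_downloaded(ARG_URL))` would mirror the documented snippet, and the 31MB download is skipped when the file is already present.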

docs/development.md

Lines changed: 42 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,42 @@
1+
2+
# Development
3+
4+
To run the development dependencies use
5+
6+
```
7+
python3 -m pip install .[dev]
8+
```
9+
10+
To run the unit tests, use
11+
12+
```
13+
python3 -m pytest
14+
```
15+
16+
You may need to regenerate some cached test fixtures occasionaly (particularly
17+
if getting cryptic errors when running the test suite). To do this, run
18+
19+
```
20+
rm -fR tests/data/cache/
21+
```
22+
23+
and rerun tests as above.
24+
25+
## Debug utilities
26+
27+
The tree sequence files output during primary inference have a lot
28+
of debugging metadata, and there are some developer tools for inspecting
29+
this in the ``sc2ts.debug`` package. In particular, the ``ArgInfo``
30+
class has a lot of useful utilities designed to be used in a Jupyter
31+
notebook. Note that ``matplotlib`` is required for these. Use it like:
32+
33+
```python
34+
import sc2ts.debug as sd
35+
import tskit
36+
37+
ts = tskit.load("path_to_daily_inference.ts")
38+
ai = sd.ArgInfo(ts)
39+
ai # view summary in notebook
40+
```
41+
42+
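The cache-regeneration step in docs/development.md (delete `tests/data/cache/`, then rerun pytest) could be scripted portably, for example on systems without `rm`. This is a sketch under assumptions, not something the repository provides: `clear_fixture_cache` and `rerun_tests` are hypothetical names, and only the cache path and the `python3 -m pytest` command come from the text above.

```python
import shutil
import subprocess
import sys
from pathlib import Path


def clear_fixture_cache(cache_dir="tests/data/cache"):
    """Hypothetical helper: delete the cached test fixtures.

    Like `rm -fR`, a missing directory is not treated as an error.
    Returns True if the directory existed before the call.
    """
    cache = Path(cache_dir)
    existed = cache.exists()
    shutil.rmtree(cache, ignore_errors=True)
    return existed


def rerun_tests():
    """Hypothetical helper: run pytest with the current interpreter."""
    # Returns the pytest exit code (0 means all tests passed).
    return subprocess.run([sys.executable, "-m", "pytest"]).returncode
```

Calling `clear_fixture_cache()` followed by `rerun_tests()` from the repository root would correspond to the two shell commands in the documentation.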

pyproject.toml

Lines changed: 7 additions & 0 deletions
@@ -21,6 +21,13 @@ dependencies = [
 ]
 dynamic = ["version"]
 
+[project.urls]
+Homepage = "https://tskit.dev/sc2ts"
+Documentation = "https://tskit.dev/sc2ts/docs/"
+"Bug Tracker" = "https://github.com/tskit-dev/sc2ts/issues"
+GitHub = "https://github.com/tskit-dev/sc2ts/"
+
+
 [project.scripts]
 sc2ts = "sc2ts.cli:cli"
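Entries in `[project.urls]` end up in the built package's core metadata as `Project-URL` fields of the form `Label, URL`. As a hedged illustration of how the new URLs could be read back from an installed copy, the sketch below uses the standard-library `importlib.metadata`; the helper names `parse_project_urls` and `installed_project_urls` are hypothetical and are not part of sc2ts.

```python
from importlib import metadata


def parse_project_urls(entries):
    """Hypothetical helper: turn 'Label, URL' strings into a {label: url} dict."""
    urls = {}
    for entry in entries:
        # Split on the first comma only; the URL itself is kept intact.
        label, _, url = entry.partition(",")
        urls[label.strip()] = url.strip()
    return urls


def installed_project_urls(dist_name):
    """Hypothetical helper: read the Project-URL fields of an installed distribution."""
    entries = metadata.metadata(dist_name).get_all("Project-URL") or []
    return parse_project_urls(entries)


if __name__ == "__main__":
    try:
        urls = installed_project_urls("sc2ts")
    except metadata.PackageNotFoundError:
        urls = {}  # sc2ts is not installed in this environment
    for label, url in urls.items():
        print(f"{label}: {url}")
```

Only a build of sc2ts that includes this commit would be expected to report the four URLs listed above.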
