Commit deea236

Fix typos and improve README clarity
1 parent eaf8c5f commit deea236

1 file changed: +9 -9 lines changed

README.md

Lines changed: 9 additions & 9 deletions
@@ -3,13 +3,13 @@
 Sc2ts stands for "SARS-CoV-2 to tree sequence" (pronounced "scoots" optionally)
 and consists of
 
-1. A method fo infer Ancestral Recombination Graphs (ARGs) from SARS-CoV-2
+1. A method to infer Ancestral Recombination Graphs (ARGs) from SARS-CoV-2
 data at pandemic scale
 2. A lightweight wrapper around [tskit Python APIs](https://tskit.dev/tskit/docs/stable/python-api.html) specialised for the output of sc2ts which enables efficient node metadata
 access.
 3. A lightweight wrapper around [Zarr Python](https://zarr.dev) which enables
 convenient and efficient access to the full Viridian dataset (alignments and metadata)
-in a single file using [VCF Zarr specification](https://doi.org/10.1093/gigascience/giaf049).
+in a single file using the [VCF Zarr specification](https://doi.org/10.1093/gigascience/giaf049).
 
 Please see the [preprint](https://www.biorxiv.org/content/10.1101/2023.06.08.544212v2)
 for details.
@@ -130,8 +130,8 @@ python -m sc2ts --help
 Primary inference is performed using the ``infer`` subcommand of the CLI,
 and all parameters are specified using a toml file.
 
-Then inference under the [example config](example_config.toml)
-for little while to see how things work:
+The [example config file](example_config.toml) can be used to perform
+inference over a short period, to demonstrate how sc2ts works:
 
 ```
 python3 -m sc2ts infer example_config.toml --stop=2020-02-02
@@ -161,9 +161,9 @@ example_inference
 └── ex1.matches.db
 ```
 
-Here we've run inference for all dates in January 2020 for which we have data
-and Feb 01. The results of inference for each day is stored in the
-``example_inference/ex1`` directory as a tskit file representing the ARG
+Here we've run inference for all dates in January 2020 for which we have data, plus the 1st Feb.
+The results of inference for each day are stored in the
+``example_inference/ex1`` directory as tskit files representing the ARG
 inferred up to that day. There is a lot of redundancy in keeping all these
 daily files lying around, but it is useful to be able to go back to the
 state of the ARG at a particular date and they don't take up much space.
@@ -200,7 +200,7 @@ into the final ARG.
 
 ### Generating final analysis file
 
-To generate the final analysis ready file (used as input to the analysis
+To generate the final analysis-ready file (used as input to the analysis
 APIs above) we need to run ``minimise-metadata``. This removes all but
 the most necessary metadata from the ARG, and recodes node metadata
 using the [struct codec](https://tskit.dev/tskit/docs/stable/metadata.html#structured-array-metadata)
@@ -266,7 +266,7 @@ and rerun tests as above.
 
 ### Debug utilities
 
-The tree sequences files output during primary inference have a lot
+The tree sequence files output during primary inference have a lot
 of debugging metadata, and there are some developer tools for inspecting
 this in the ``sc2ts.debug`` package. In particular, the ``ArgInfo``
 class has a lot of useful utilities designed to be used in a Jupyter
