Commit ec9e3d4

Merge pull request #588 from hyanwong/doc-shell-clarification
Make clear python vs shell commands
2 parents a9eb309 + d3d4255 commit ec9e3d4

1 file changed: +9 -6 lines changed

docs/tutorial.rst

Lines changed: 9 additions & 6 deletions
@@ -118,8 +118,8 @@ toy example. However, for real data we will not prepare our data and infer the t
 in one go; rather, we will usually split the process into at least two distinct steps.

 The first step in any inference is to prepare your data and import it into a :ref:`sample data
-<sec_file_formats_samples>` file. For simplicity here we'll simulate some data under the
-coalescent with recombination using `msprime
+<sec_file_formats_samples>` file. For simplicity here we'll use Python to simulate some
+data under the coalescent with recombination, using `msprime
 <https://msprime.readthedocs.io/en/stable/api.html#msprime.simulate>`_:

 .. code-block:: python
@@ -180,7 +180,7 @@ import the data for this simulation into ``tsinfer``'s sample data format.
 tsinfer.SampleData.from_tree_sequence(
 ts, path="simulation.samples", num_flush_threads=2, use_times=False)

-Examining the files, we then see the following::
+Examining the files on the command line, we then see the following::

 $ ls -lh simulation*
 -rw-r--r-- 1 jk jk 22M May 12 11:06 simulation.samples
@@ -224,7 +224,9 @@ actual data) requires about 390MB uncompressed. The ``tsinfer`` sample data form
 achieving a roughly 20X compression in this case. In practise this means we can keep such files
 lying around without taking up too much space.

-Once we have our ``.samples`` file created, running the inference is straightforward::
+Once we have our ``.samples`` file created, running the inference is straightforward.
+We can do so within Python (as we did in the toy example above), or use ``tsinfer`` on
+the command-line, which is useful when inference is expected to take a long time::

 $ tsinfer infer simulation.samples -p -t 4
 ga-add (1/6): 100%|███████████████████████| 35.2K/35.2K [00:02, 15.3Kit/s]
@@ -252,7 +254,8 @@ Looking at our output files, we see::

 Therefore our output tree sequence file that we have just inferred in less than five minutes is
 *even smaller* than the original ``msprime`` simulated tree sequence! Because the output file is
-also a :class:`tskit.TreeSequence`, we can use the same API to work with both.
+also a :class:`tskit.TreeSequence`, we can use the same API to work with both, for example,
+within Python we can do:

 .. code-block:: python

@@ -273,7 +276,7 @@ also a :class:`tskit.TreeSequence`, we can use the same API to work with both.
 print("Inferred tree: interval=", tree.interval)
 print(tree.draw(format="unicode"))

-Here we first load up our source and inferred tree sequences from their corresponding
+This first loads up our source and inferred tree sequences from their corresponding
 ``.trees`` files. Each of the trees in these tree sequences has 10 thousand samples
 which is much too large to easily visualise. Therefore, to make things simple here
 we subset both tree sequences down to their minimal representations for six
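
For readers comparing the two routes this change distinguishes, the following is a minimal sketch (not part of the commit) of running the same inference from Python and of building the shell command shown in the diff. The helper names ``infer_in_python`` and ``shell_command`` are hypothetical, as is the ``trees_path`` argument. The Python route assumes ``tsinfer`` is installed and that ``tsinfer.load``, ``tsinfer.infer`` with ``num_threads``, and ``TreeSequence.dump`` behave as in the library's documented API; no progress bar is requested there.

```python
import shlex


def infer_in_python(samples_path, trees_path, num_threads=4):
    """Hypothetical helper: run inference from within Python."""
    # Imported lazily so this sketch can be loaded without tsinfer installed.
    import tsinfer

    sample_data = tsinfer.load(samples_path)
    ts = tsinfer.infer(sample_data, num_threads=num_threads)
    ts.dump(trees_path)  # write the inferred tree sequence to disk
    return ts


def shell_command(samples_path, num_threads=4, progress=True):
    """Hypothetical helper: build the command line used in the tutorial."""
    args = ["tsinfer", "infer", samples_path]
    if progress:
        args.append("-p")  # progress bars, as in the tutorial's invocation
    args += ["-t", str(num_threads)]  # number of worker threads
    return shlex.join(args)


print(shell_command("simulation.samples"))
```

The printed string is meant to be pasted at a shell prompt (the ``$`` lines in the tutorial), whereas ``infer_in_python`` would be called from a Python session or script.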
