This repository was archived by the owner on Dec 12, 2024. It is now read-only.
response-to-reviewers-2.tex (88 additions, 27 deletions)
@@ -57,11 +57,23 @@ \section*{Response to the editor}
57
57
\section*{Associate Editor's comments}
58
58
59
59
\begin{point}
60
-
My remaining broad concern is that the paper is still in places somewhat narrow about the goals of future ARG development. I certainly see the practical utility of dropping inference down to some minimum "knowable" structure that can be reconstructed using deterministic algorithms for very large datasets. However, probabilistic reconstructions of some form of ARG with more explicit events is also a reasonable goal moving forwards (e.g. for some applications we may want a subset of the recombination events explicitly included). There are a few places where the paper still comes across as overly dogmatic about the minimum "knowable" ARG being the only goal (although the discussion casts a broader view).
60
+
My remaining broad concern is that the paper is still in places somewhat narrow
61
+
about the goals of future ARG development. I certainly see the practical
62
+
utility of dropping inference down to some minimum ``knowable'' structure that
63
+
can be reconstructed using deterministic algorithms for very large datasets.
64
+
However, probabilistic reconstructions of some form of ARG with more explicit
65
+
events is also a reasonable goal moving forwards (e.g. for some applications we
66
+
may want a subset of the recombination events explicitly included). There are a
67
+
few places where the paper still comes across as overly dogmatic about the
68
+
minimum ``knowable'' ARG being the only goal (although the discussion casts a
69
+
broader view).
61
70
\end{point}
62
71
\begin{reply}
63
-
We have gone through the article and, in addition to the suggestions made below, have rephrased
64
-
parts to make it clear that a gARG can be used to encode a \emph{variety} of ARG structures, whether events are or are not explicitly inferred by the reconstruction method. We specifically state at the end of \emph{A diversity of structures} that
72
+
We have gone through the article and, in addition to the suggestions made
73
+
below, have rephrased parts to make it clear that a gARG can be used to encode
74
+
a \emph{variety} of ARG structures, whether events are or are not explicitly
75
+
inferred by the reconstruction method. We specifically state at the end of
76
+
\emph{A diversity of structures} that
65
77
\begin{quote}
66
78
A gARG can encode a diversity of ARG structures, including
67
79
those where events \emph{are} recorded explicitly, and those where
Abstract: "This approach is out of step with modern developments, which do not represent genetic inheritance in terms of these events or explicitly infer them." So this is on the places where I feel like the authors state things too strongly. The authors, and some others, approaches have taken this path, but folks can agree that the gARG is a good idea and yet think that explicitly inferring details of recombination events is a `modern' goal.
85
+
Abstract: ``This approach is out of step with modern developments, which do not
86
+
represent genetic inheritance in terms of these events or explicitly infer
87
+
them.'' So this is on the places where I feel like the authors state things too
88
+
strongly. The authors, and some others, approaches have taken this path, but
89
+
folks can agree that the gARG is a good idea and yet think that explicitly
90
+
inferring details of recombination events is a `modern' goal.
74
91
\end{point}
75
92
\begin{reply}
76
-
We have changed this to "This approach is out of step with many modern developments, however,..."
93
+
We have changed this to ``This approach is out of step with some modern developments,
94
+
however,...''
77
95
\end{reply}
78
96
79
97
\begin{point}
80
-
"Broadly speaking, an ARG describes the different paths of genetic inheritance caused by recombination, encapsulating the resulting complex web of genetic ancestry " - add "of a set of samples". Also I'd say "genetic ancestors", as ancestry is tied up with genetic ancestry groups in peoples' minds.
98
+
``Broadly speaking, an ARG describes the different paths of genetic inheritance
99
+
caused by recombination, encapsulating the resulting complex web of genetic
100
+
ancestry'' - add ``of a set of samples''. Also I'd say ``genetic ancestors'', as
101
+
ancestry is tied up with genetic ancestry groups in peoples' minds.
81
102
\end{point}
82
103
\begin{reply}
83
104
Amended as suggested.
84
105
\end{reply}
85
106
86
107
\begin{point}
87
-
"We define a genome as the complete set of genetic material that a child inherits from one parent. A diploid individual therefore carries two genomes, one inherited from each parent (we assume diploids here for clarity, but the definitions apply to organisms of arbitrary ploidy). " -Excludes Y, mtDNA, and X as written, please revise, e.g. talk about autosomal genome.
108
+
``We define a genome as the complete set of genetic material that a child
109
+
inherits from one parent. A diploid individual therefore carries two genomes,
110
+
one inherited from each parent (we assume diploids here for clarity, but the
111
+
definitions apply to organisms of arbitrary ploidy). '' -Excludes Y, mtDNA, and
112
+
X as written, please revise, e.g. talk about autosomal genome.
88
113
\end{point}
89
114
\begin{reply}
90
115
Amended as suggested.
91
116
\end{reply}
92
117
93
118
\begin{point}
94
-
"The topology of a gARG specifies that genetic inheritance occurred between particular ancestors and descendants, " -struggle slightly with word "particular" here as the identity of the ancestors is not known. Deleting "particular" is likely sufficient.
119
+
``The topology of a gARG specifies that genetic inheritance occurred between
120
+
particular ancestors and descendants, '' -struggle slightly with word
121
+
``particular'' here as the identity of the ancestors is not known. Deleting
122
+
``particular'' is likely sufficient.
95
123
\end{point}
96
124
\begin{reply}
97
125
Amended as suggested.
98
126
\end{reply}
99
127
100
128
\begin{point}
101
-
"This is sufficient to describe the effects of inheritance under any form of homologous recombination (such as multiple crossovers,..." -do you mean multiple crossovers during a single round of meiosis.
129
+
``This is sufficient to describe the effects of inheritance under any form of
130
+
homologous recombination (such as multiple crossovers,...'' -do you mean
131
+
multiple crossovers during a single round of meiosis.
102
132
\end{point}
103
133
\begin{reply}
104
134
Yes - amended to clarify this.
105
135
\end{reply}
106
136
107
137
\begin{point}
108
-
"In this encoding there are two types of internal node in the graph, representing the common ancestor and recombination events in the history of a sample. " stipulate that these are most recent common ancestor events.
138
+
``In this encoding there are two types of internal node in the graph,
139
+
representing the common ancestor and recombination events in the history of a
140
+
sample.'' stipulate that these are most recent common ancestor events.
109
141
\end{point}
110
142
\begin{reply}
111
143
Amended as suggested.
112
144
\end{reply}
113
145
114
146
\begin{point}
115
-
"This approach assumes all events are knowable, and does not provide an obvious mechanism for either aggregating multiple events or expressing uncertainty about them. While this is not a problem when describing the results of simulations". -Maybe one way to flip this around would be to say that because it arose from tracking a particular stochastic process it has these properties. Also I don't think it assumes that all events are knowable, eg we could construct some parsimonious ARG or probabilistic ARG. If we wish to express uncertainty about events we usually give draws from the posterior etc. I agree that might be computational prohibitive with large samples etc, but it seems like place to take a broad view. This seems like a place to acknowledge that for some applications we might want to explicitly reconstruct the events.
147
+
``This approach assumes all events are knowable, and does not provide an obvious
148
+
mechanism for either aggregating multiple events or expressing uncertainty
149
+
about them. While this is not a problem when describing the results of
150
+
simulations''. -Maybe one way to flip this around would be to say that because
151
+
it arose from tracking a particular stochastic process it has these properties.
152
+
Also I don't think it assumes that all events are knowable, eg we could
153
+
construct some parsimonious ARG or probabilistic ARG. If we wish to express
154
+
uncertainty about events we usually give draws from the posterior etc. I agree
155
+
that might be computational prohibitive with large samples etc, but it seems
156
+
like place to take a broad view. This seems like a place to acknowledge that
157
+
for some applications we might want to explicitly reconstruct the events.
116
158
\end{point}
117
159
\begin{reply}
118
160
We have rephrased this part to read
119
161
\begin{quote}
120
-
This approach necessitates that all events are recorded explicitly, and does not
121
-
provide an obvious mechanism for either aggregating multiple events
122
-
or expressing uncertainty about them. While this is not a
123
-
problem when describing the results of simulations, for instance (where all details
124
-
are perfectly known), it is an issue when we wish to
125
-
formally describe the output of inference methods which do not
126
-
necessarily attempt to infer events that are not \emph{knowable} from the data,
127
-
particularly as datasets approach the population scale...
128
162
\end{quote}
163
+
This approach
164
+
requires all events to be recorded explicitly, and does not
165
+
provide an obvious mechanism for aggregating multiple, potentially
166
+
unresolvable, events.
167
+
As datasets approach the population scale [citations]
168
+
representing such uncertainty
169
+
directly through the data structure is a useful alternative to
170
+
classical methods based on probabilistic sampling.
129
171
\end{reply}
130
172
131
173
\begin{point}
132
-
"A key feature of the gARG encoding is that it enables these varying levels of precision to be represented, and brings these nuanced features to light." -the word nuanced feels strange here.
174
+
``A key feature of the gARG encoding is that it enables these varying levels of
175
+
precision to be represented, and brings these nuanced features to light.'' -the
176
+
word nuanced feels strange here.
133
177
\end{point}
134
178
\begin{reply}
135
179
We have deleted the second part of this sentence.
136
180
\end{reply}
137
181
138
182
\begin{point}
139
-
"Simpler representations can be formed by removing "unknowable" nodes (Fig. 5B)" -unknowable is vague here, do you mean bubbles along a single lineage?
183
+
``Simpler representations can be formed by removing `unknowable' nodes (Fig.
184
+
5B)'' -unknowable is vague here, do you mean bubbles along a single lineage?
140
185
\end{point}
141
186
\begin{reply}
142
-
We've added a clarification that this refers to nodes such as those in singly-connected graph components.
187
+
We've added a clarification that this refers to nodes such as those
188
+
in singly-connected graph components.
143
189
\end{reply}
144
190
145
191
\begin{point}
146
-
"The gARG encoding leads to highly efficient storage and processing of ARG data, "-As gARG has various levels of precision, perhaps this needs to state that the "gARG encoding can lead to..." or be more precise that this is a reduced precision level.
192
+
``The gARG encoding leads to highly efficient storage and processing of ARG
193
+
data,'' -As gARG has various levels of precision, perhaps this needs to state
194
+
that the ``gARG encoding can lead to...'' or be more precise that this is a
195
+
reduced precision level.
147
196
\end{point}
148
197
\begin{reply}
149
-
Amended as suggested to add "can lead to".
198
+
Amended as suggested to add ``can lead to''.
150
199
\end{reply}
151
200
152
201
\begin{point}
153
-
"The succinct tree sequence data structure (usually known as a "tree sequence" for brevity) is a practical gARG implementation focused on efficiency." - If the tree sequence is focused at a particular level of gARG simplification be precise about this.
202
+
``The succinct tree sequence data structure (usually known as a `tree sequence'
203
+
for brevity) is a practical gARG implementation focused on efficiency.'' - If
204
+
the tree sequence is focused at a particular level of gARG simplification be
205
+
precise about this.
154
206
\end{point}
155
207
\begin{reply}
156
-
We have left this sentence as is, since the tree sequence structure can record gARGs at various levels of simplification.
208
+
We have left this sentence as is, since the tree sequence structure
209
+
can record gARGs at various levels of simplification.
157
210
\end{reply}
158
211
159
212
\begin{point}
160
-
"Methods targeting large-scale datasets tend to simplify the inference problem by making a single, deterministic best-guess " --I think this is the best guess of the topology, and the uncertainty in times given the ARG is downstream of this. If so please clarify. Also I'd perhaps explicitly acknowledge Deng et al (SINGER), e.g. "deterministic best-guess of the topology (see Deng et al for parallel developments addressing uncertainty with somewhat small sample sizes)" or something like that. While these deterministic approaches are a strong way forward for human biobank scale data, it's good to be highlight parallel developments that might be key to other applications.
213
+
``Methods targeting large-scale datasets tend to simplify the inference problem
214
+
by making a single, deterministic best-guess'' --I think this is the best guess
215
+
of the topology, and the uncertainty in times given the ARG is downstream of
216
+
this. If so please clarify. Also I'd perhaps explicitly acknowledge Deng et al
217
+
(SINGER), e.g. ``deterministic best-guess of the topology (see Deng et al for
218
+
parallel developments addressing uncertainty with somewhat small sample sizes)''
219
+
or something like that. While these deterministic approaches are a strong way
220
+
forward for human biobank scale data, it's good to be highlight parallel
221
+
developments that might be key to other applications.