Commit bb410ec

Add files via upload
JOSS paper somewhat redrafted.
1 parent 02994b0 commit bb410ec

paper/paper.md

Lines changed: 65 additions & 65 deletions
@@ -23,53 +23,52 @@ bibliography: paper.bib
 ## Scope
 
 This software is relevant to a situation in which:
-* the value of some dependent variable depends on the values of some
-collection of zero or more independent variables;
+
+* the value of some dependent variable depends on the values of zero
+or more independent variables;
 * there exists a data set of empirical measurements of the value of
 the dependent variable, each accompanied by empirical measurements
-of the corresponding values of all the independent variables;
+of corresponding values of all the independent variables;
 * there exists at least one theoretical model for predicting the value
 of the dependent variable from the values of the independent
 variables;
 * each model contains zero or more adjustable parameters,
-i.e. quantities asserted in the model to be constant, and which
-affect the mapping from independent variable values to dependent
-variable value, but whose exact values are not known a priori; and
+i.e. constant quantities which affect the mapping from independent
+variable values to dependent variable value, whose exact values are
+not known a priori; and
 * one wishes to infer from the data the values of the parameters
 in each model, and, if there is more than one model, which model is
 most probably true.
 
 ## Action
 
-In Bayesian parameter estimation, beliefs about the values of
-parameters, in light of data, can [@Jeffreys:1931:SI;
-@Jeffreys:1932:TEL] be summarized in a posterior expectation value and
-standard error. If the model has more than one parameter, the
-posterior standard error comes [@Jeffreys:1931:SI; @Jeffreys:1932:TEL]
-in two versions: a conditional standard error, based on the assumption
-that the other parameters take their posterior expectation values; and
-a marginal standard error, which is a probability-weighted average
-over all values of the other parameters. In Bayesian model
-comparison, the goodness of fit of a model to data is
-[@Jeffreys:1935:STS] measured by a quantity known as the marginal
-likelihood, which automatically embodies Occam's razor and is
-therefore suitable for direct comparison with other models.
-
-However, in software procurement terms, it is much easier to obtain
-software packages and libraries which, rather than Bayesian parameter
-estimation and model comparison, instead perform least squares fitting
-[qv. @Legendre:1805:NMD], which defines a badness of fit measure
-called chi-squared for a model _with specific values of its
-parameters_. Software typically outputs the minimum value of
-chi-squared with respect to the parameters, estimators of the
-parameters computed as the point in parameter space that achieves that
-minimum chi-squared, and some or all of the elements of the Hessian of
-chi-squared with respect to the parameters at that point in parameter
-space, intended for use in estimating the standard errors of the
-parameters by heuristic methods. Software documentation also often
-suggests heuristic ways of attempting model comparison with these
-outputs. This is convenient, but lacks clear epistemological
-underpinning.
+In Bayesian parameter estimation, beliefs about parameter values, in
+light of data, can [@Jeffreys:1931:SI; @Jeffreys:1932:TEL] be
+summarized in a posterior expectation value and standard error. If
+the model has more than one parameter, the posterior standard error
+has [@Jeffreys:1931:SI; @Jeffreys:1932:TEL] two variants: a
+conditional standard error, based on the assumption that the other
+parameters take their posterior expectation values; and a marginal
+standard error, which involves a probability-weighted average over all
+values of the other parameters. In Bayesian model comparison, the
+goodness of fit of a model to data is [@Jeffreys:1935:STS] measured by
+a quantity known as the marginal likelihood, which automatically
+embodies [@MacKay:1992:BI] Occam's razor and is therefore suitable for
+direct comparison with other models.
+
+However, it is much easier to obtain software packages and libraries
+which, rather than Bayesian parameter estimation and model comparison,
+instead perform least squares fitting [qv. @Legendre:1805:NMD], which
+defines a badness of fit measure called chi-squared for a model _with
+specific values of its parameters_. Software typically outputs the
+minimum value of chi-squared with respect to the parameters; the
+parameter values that achieve that minimum chi-squared; and some or
+all of the elements of the Hessian of chi-squared with respect to the
+parameters at that point in parameter space, intended for use in
+estimating the standard errors of the parameters by heuristic methods.
+Software documentation also often suggests heuristic ways of
+attempting model comparison with these outputs. This is convenient,
+but lacks clear epistemological underpinning.
 
 By a happy coincidence, however, these typical outputs of
 least-squares fitting software contain enough information to obtain,
@@ -79,11 +78,12 @@ versions) of the parameters, and to the marginal likelihood. This
 calculation is not entirely straightforward, and that is where the
 present package, `leastsqtobayes`, comes in. It is written in a
 combination of three languages: Gnuplot [@Williams:2015:GRM], which
-has powerful built-in capabilities for least-squares fitting; Octave
+has powerful (indeed uniquely suitable in its handling of measurement
+uncertainties) built-in capabilities for least-squares fitting; Octave
 [@Eaton:2012:GOR], which has the matrix and scalar arithmetic
-capabilities for the conversion; and Perl [@Wall:2022:P], with the
-text-processing capabilities to create bespoke sections Octave code
-containing results produced by Gnuplot and vice versa.
+capabilities for the calculations; and Perl [@Wall:2022:P], with the
+text-processing capabilities to facilitate intercommunication between
+Gnuplot and Octave code.
 
 `leastsqtobayes` takes as input an empirical data set; a formula
 representing a model; and specifications of the prior
@@ -96,21 +96,22 @@ parameters; and the marginal likelihood.
 ## Parameter estimation step
 
 Several recent workers [e.g. @Fenton:2022:UVA; @Albert:2022:BAE;
-@Gerster:2022:EBC] have, in attempting to infer the posterior
-expectations and standard errors of parameters in a model from data,
-particularly in cases where the distinction between conditional
-standard errors and marginal standard errors matters, found themselves
-unable to obtain those values with "off the peg" least-squares fitting
-processes, and have had to resort to bespoke computational approaches.
-Indeed, the present author has [@Hatton:2003:SPE; @Sammonds:2017:MSI]
-found himself in the same position.
+@Gerster:2022:EBC], in common with the present author
+[@Hatton:2003:SPE; @Sammonds:2017:MSI], have, in attempting to infer
+posterior expectations and standard errors of parameters in a model
+from data, found themselves unable to obtain those values with "off
+the peg" least-squares fitting processes, and have had to resort to
+bespoke computational approaches, with all the risks and duplication
+of effort in software quality control this implies. This has been
+particularly prevalent in cases where the distinction between
+conditional and marginal standard errors matters.
 
 ## Model comparison step
 
 @Dunstan:2022:ECB argue that all users of least-squares fitting should
 supply the value of the marginal likelihood for each model they fit,
 and note the current ubiety of applications of least-squares fitting
-in which such a value is not supplied. They further point out that a
+in which no such value is supplied. They further point out that a
 reason for general non-reporting of marginal likelihood values is the
 perceived computational complexity of obtaining those values.

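The conditional/marginal distinction the paper leans on can be made concrete with a toy calculation. Under the usual Gaussian approximation the likelihood is proportional to exp(-chi2/2), so the posterior covariance of the parameters is twice the inverse of the Hessian H of chi-squared at its minimum: the marginal standard error of a parameter is the square root of the corresponding diagonal element of 2*inv(H), while its conditional standard error is sqrt(2/H_ii). A minimal sketch, in Python rather than the package's own Octave, with an invented two-parameter Hessian:

```python
from math import sqrt

# Hypothetical Hessian of chi-squared with respect to two parameters,
# evaluated at the chi-squared minimum (values invented for illustration;
# real values would come from the least-squares fitting software).
H = [[8.0, 2.0],
     [2.0, 4.0]]

det_H = H[0][0] * H[1][1] - H[0][1] * H[1][0]

# Inverse of a 2x2 matrix, written out explicitly.
inv_H = [[ H[1][1] / det_H, -H[0][1] / det_H],
         [-H[1][0] / det_H,  H[0][0] / det_H]]

for i in range(2):
    conditional_se = sqrt(2.0 / H[i][i])      # other parameter held at its expectation
    marginal_se = sqrt(2.0 * inv_H[i][i])     # other parameter averaged over
    print(i, conditional_se, marginal_se)
```

For any positive-definite H the marginal standard error is at least as large as the conditional one, and the gap grows with the correlation between parameters, which is exactly when the distinction noted above matters most.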
@@ -123,20 +124,19 @@ formulae having been in the open literature for decades,
 @Dunstan:2022:ECB attribute the perceived complexity of computing the
 marginal likelihood, which they believe leads to the absence of
 marginal likelihood computations in most applications of least-squares
-fitting, to a failure to use these formulae.
-
-However, @Dunstan:2022:ECB leave as an exercise for the reader the
-implementation issues of how to extract, from the outputs of standard
-least-squares fitting software, the information required as input to
-the @Lindley:1980:ABM and @Kass:1995:BF formulae, and how to perform
-the actual computation. The primary challenge in that computation is
-that it involves the determinant of the Hessian of the chi-squared
-statistic with respect to the parameters, at the location in parameter
-space that minimizes chi-squared. That is where the present software,
-`leastsqtobayes`, comes in: it resolves these issues by having the
-Gnuplot least-squares fitting system output its results in a format
-suitable for direct import to the Octave scientific programming
-language, with its inbuilt determinant-finding capability, then have
-Octave compute the determinant, by way of using Perl's text-processing
-capabilities to generate bespoke Gnuplot and Octave code for any given
-inference problem.
+fitting, to a widespread failure to use these formulae.
+
+# From the need to the software
+
+However, @Dunstan:2022:ECB leave as an exercise for the reader finding
+how to extract, from the outputs of standard least-squares fitting
+software, the information required as input to the @Lindley:1980:ABM
+and @Kass:1995:BF formulae, and how to perform the actual computation.
+Both are somewhat challenging, the former because Gnuplot outputs its
+results in a format that is not very readily interoperable with other
+systems, and the latter because of the need to compute the determinant
+of the Hessian. That is where the present software, `leastsqtobayes`,
+comes in: it resolves the former issue using the text-processing
+capabilities of Perl, and the latter using the matrix algebra
+capabilities of Octave. The final output to the user includes all the
+quantities for which an inferential need is identified above.
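To illustrate why the determinant of the Hessian is the crux of the computation, here is a sketch of the Laplace-approximation route to the log marginal likelihood, the route the @Lindley:1980:ABM and @Kass:1995:BF formulae formalize. It is written in Python rather than the package's Octave, all input numbers are invented, and constants common to every model (such as the likelihood normalization from the measurement errors) are dropped:

```python
from math import log, pi

# Invented illustrative inputs; in practice they come from the fit.
chi2_min = 12.3                      # minimum chi-squared
H = [[8.0, 2.0],                     # Hessian of chi-squared at the minimum
     [2.0, 4.0]]
k = 2                                # number of adjustable parameters
log_prior_at_min = log(1.0 / 100.0)  # hypothetical flat prior density at the best fit

det_H = H[0][0] * H[1][1] - H[0][1] * H[1][0]

# Laplace approximation: the posterior bump is treated as Gaussian with
# precision matrix H/2 (since the likelihood is proportional to
# exp(-chi2/2)), so its volume is (2*pi)^(k/2) / sqrt(det(H/2)),
# and det(H/2) = det(H) / 2^k.
log_marginal_likelihood = (
    -chi2_min / 2.0                  # log of the maximized likelihood
    + log_prior_at_min               # prior density at the best fit
    + (k / 2.0) * log(2.0 * pi)
    - 0.5 * log(det_H / 2.0 ** k)
)
print(log_marginal_likelihood)
```

Comparing models then amounts to comparing these numbers; the determinant term, together with the prior density, is the Occam factor that penalizes extra parameters which buy little reduction in chi-squared.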
