Commit 4c1410b

Nick and Neil's comments
1 parent 97ae1bf commit 4c1410b

2 files changed: +31 -10 lines changed

paper/paper.bib

Lines changed: 23 additions & 2 deletions
@@ -67,6 +67,27 @@ @inproceedings{guderlei2007smt
  year = {2007}
  }

+ @book{hernan2020causal,
+ address = {Boca Raton, FL},
+ author = {Hern{\'a}n, Miguel A and Robins, James M},
+ publisher = {Chapman \& Hall/CRC},
+ title = {Causal {I}nference: {What} if},
+ year = {2020}
+ }
+
+ @book{pearl2009causality,
+ address = {Cambridge},
+ author = {Judea Pearl},
+ day = {14},
+ isbn = {9780521895606},
+ month = {09},
+ pagecount = {464},
+ publisher = {Cambridge university press},
+ subtitle = {Models, Reasoning, and Infernce},
+ title = {Causality},
+ year = {2009}
+ }
+
  @misc{sharma2020dowhy,
  archiveprefix = {arXiv},
  author = {Amit Sharma and Emre Kiciman},
@@ -78,17 +99,17 @@ @misc{sharma2020dowhy
  }

  @article{somers2024configuration,
- doi = {10.2139/ssrn.4732706},
  author = {Somers, Richard and Walkinshaw, Neil and Hierons, Robert and Elliott, Jackie and Iqbal, Ahmed and Walkinshaw, Emma},
+ doi = {10.2139/ssrn.4732706},
  publisher = {Elsevier BV},
  title = {Configuration Testing of an Artificial Pancreas System Using a Digital Twin},
  year = {2024}
  }

  @article{textor2017dagitty,
+ author = {Textor, Johannes and van der Zander, Benito and Gilthorpe, Mark S. and Liśkiewicz, Maciej and Ellison, George T.H.},
  doi = {10.1093/ije/dyw341},
  issn = {1464-3685},
- author = {Textor, Johannes and van der Zander, Benito and Gilthorpe, Mark S. and Liśkiewicz, Maciej and Ellison, George T.H.},
  journal = {International Journal of Epidemiology},
  month = {jan},
  pages = {dyw341},

paper/paper.md

Lines changed: 8 additions & 8 deletions
@@ -27,7 +27,7 @@ authors:
  - name: Richard Somers
  orcid: 0009-0009-1195-1497
  affiliation: 1
- - name: Nicholas Lattimer
+ - name: Nicholas Latimer
  orcid: 0000-0001-5304-5585
  affiliation: 1
  - name: Neil Walkinshaw
@@ -45,7 +45,7 @@ bibliography: paper.bib

  # Summary
  Scientific models possess several properties that make them notoriously difficult to test, including a complex input space, long execution times, and non-determinism, rendering existing testing techniques impractical.
- In fields such as epidemiology, where researchers seek answers to challenging causal questions, a statistical methodology known as Causal Inference (CI) has addressed similar problems, enabling the inference of causal conclusions from noisy, biased, and sparse observational data instead of costly randomised trials.
+ In fields such as epidemiology, where researchers seek answers to challenging causal questions, a statistical methodology known as Causal Inference (CI) [@pearl2009causality,@hernan2020causal] has addressed similar problems, enabling the inference of causal conclusions from noisy, biased, and sparse observational data instead of costly randomised trials.
  CI works by using domain knowledge to identify and mitigate for biases in the data, enabling them to answer causal questions that concern the effect of changing some feature on the observed outcome.
  The Causal Testing Framework (CTF) is a software testing framework that uses CI techniques to establish causal effects between software variables from pre-existing runtime data rather than having to collect bespoke, highly curated datasets especially for testing.

@@ -56,18 +56,18 @@ Nondeterministic software can be tested using Statistical Metamorphic Testing [@
  However, this requires the software to be executed repeatedly for each set of parameters of interest, so is computationally expensive, and is constrained to testing properties over software inputs that can be directly and precisely controlled.
  Statistical Metamorphic Testing cannot be used to test properties that relate internal variables or outputs to each other, since these cannot be controlled a priori.

- By employing domain knowledge in the form of a causal graph --- a lightweight model specifying the expected relationships between key software variables --- the CTF circumvents both of these problems by enabling models to be tested using pre-existing runtime data.
+ By employing domain knowledge in the form of a causal graph --- a lightweight model specifying the expected relationships between key software variables --- the CTF overcomes the limitations of Statistical Metamorphic Testing by enabling models to be tested using pre-existing runtime data.
  The CTF is written in Python but is language agnostic in terms of the system under test.
  All that is required is a set of properties to be validated, a causal model, and a set of software runtime data.

  # Causal Testing
- Causal Testing [@clark2023testing] has four main steps, outlined in \ref{fig:schematic}.
- Firstly, the user supplies a causal model, which takes the form of a directed acyclic graph (DAG) where an edge $X \to Y$ represents variable $X$ having a direct causal effect on variable $Y$.
+ Causal Testing [@clark2023testing] has four main steps, outlined in Figure \ref{fig:schematic}.
+ Firstly, the user supplies a causal model, which takes the form of a directed acyclic graph (DAG) [@pearl2009causality] where an edge $X \to Y$ represents variable $X$ having a direct causal effect on variable $Y$.
  Secondly, the user supplies a set of causal properties to be tested.
  Such properties can be generated from the causal DAG [@clark2023metamorphic]: for each $X \to Y$ edge, a test to validate the presence of a causal effect is generated, and for each missing edge, a test to validate independence is generated.
  The user may also refine tests to validate the nature of a particular relationship.
  Next, the user supplies a set of runtime data in the form of a table with each column representing a variable and rows containing the value of each variable for a particular run of the software.
- Finally, the CTF automatically validates the causal properties by using the causal DAG and data to calculate a causal effect estimate, and validating this against the expected causal relationship.
+ Finally, the CTF automatically validates the causal properties by using the causal DAG to identify a statistical estimand [@pearl2009causality] (essentially a set of features in the data which must be controlled for), calculate a causal effect estimate from the supplied data, and validating this against the expected causal relationship.

  ![Causal Testing workflow.\label{fig:schematic}](../images/schematic.png)

@@ -78,12 +78,12 @@ The CTF instead evaluates the adequacy of a particular dataset by calculating a
  ## Missing Variables
  Causal Testing works by using the causal DAG to identify the variables that need to be statistically controlled for to remove their biassing effect on the causal estimate.
  This typically means we need to know their values.
- However, where such biassing variables are not recorded in the data, the Causal Testing Framework can still sometimes estimate unbiased causal effects by using Instrumental Variables, an advanced Causal Inference technique.
+ However, where such biassing variables are not recorded in the data, the Causal Testing Framework can still sometimes estimate unbiased causal effects by using Instrumental Variables [@hernan2020causal], an advanced Causal Inference technique.

  ## Feedback Over Time
  Many scientific models involve iterating several interacting processes over time.
  These processes often feed into each other, and can create feedback cycles.
- Traditional CI cannot handle this, however the CTF uses a family of advanced CI techniques, called g-methods, to enable the estimation of causal effects even when there are feedback cycles between variables.
+ Traditional CI cannot handle this, however the CTF uses a family of advanced CI techniques, called g-methods [@hernan2020causal], to enable the estimation of causal effects even when there are feedback cycles between variables.

  # Related Work
  The Dagitty tool [@textor2017dagitty] is a browser-based environment for creating, editing, and analysing causal graphs.
