- Framework Overview
+ Background
=====================================

- CTF Components
- --------------
+ The Causal Testing Framework consists of 3 main components: 1) Causal Specification, 2) Causal Test Case and 3) Data Collection.

#.
- :doc:`Causal Specification <../modules/causal_specification>`\ : Before we can test software, we need to obtain an
- understanding of how it should behave in a particular use-case scenario. In addition, to apply graphical CI
+ :doc:`Causal Specification <../modules/causal_specification>`\ : To apply graphical CI
techniques for testing, we need a causal DAG which depicts causal relationships amongst inputs and outputs. To
collect this information, users must create a *causal specification*. This comprises a set of scenarios which place
constraints over input variables that capture the use-case of interest, a causal DAG corresponding to this scenario,
and a series of high-level functional requirements that the user wishes to test. In causal testing, these
requirements should describe how the model should respond to interventions (changes made to the input configuration).

+
+
#.
:doc:`Causal Tests <../modules/causal_tests>`\ : With a causal specification in hand, we can now go about designing
a series of test cases that interrogate the causal relationships of interest in the scenario-under-test. Informally,
- a causal test case is a 4-tuple (M, X, Delta, Y), where M is the modelling scenario, X is an input configuration,
- Delta is an intervention which should be applied to X, and Y is the expected *causal effect* of that intervention on
- some output of interest. Therefore, a causal test case states the expected causal effect (Y) of a particular
- intervention (Delta) made to an input configuration (X). For each scenario, the user should create a suite of causal
+ a causal test case is a 4-tuple ``(M, X, Delta, Y)``, where ``M`` is the modelling scenario, ``X`` is an input configuration,
+ ``Delta`` is an intervention which should be applied to ``X``, and ``Y`` is the expected *causal effect* of that intervention on
+ some output of interest. Therefore, a causal test case states the expected causal effect (``Y``) of a particular
+ intervention (``Delta``) made to an input configuration (``X``). For each scenario, the user should create a suite of causal
tests. Once a causal test case has been defined, it is executed as follows:

-
- #. Using the causal DAG, identify an estimand for the effect of the intervention on the output of interest. That is,
+ a. Using the causal DAG, identify an estimand for the effect of the intervention on the output of interest. That is,
a statistical procedure capable of estimating the causal effect of the intervention on the output.
#. Collect the data to which the statistical procedure will be applied (see Data collection below).
#. Apply a statistical model (e.g. linear regression or causal forest) to the data to obtain a point estimate for
@@ -35,6 +34,8 @@ CTF Components
test should pass or fail based on the results. In the simplest case, this takes the form of an assertion which
compares the point estimate to the expected causal effect specified in the causal test case.

+
+
#.
:doc:`Data Collection <../modules/data_collector>`\ : Data for the system-under-test can be collected in two
ways: experimentally or observationally. The former involves executing the system-under-test under controlled
@@ -48,22 +49,3 @@ CTF Components
provide instructions for an execution that will fill the gap (future work).

For more information on each of these steps, follow the link to their respective documentation.
-
- Causal Inference Terminology
- ----------------------------
-
- Here are some explanations for the causal inference terminology used above.
-
-
- * Causal inference (CI) is a family of statistical techniques designed to quantify and establish **causal**
- relationships in data. In contrast to purely statistical techniques that are driven by associations in data, CI
- incorporates knowledge about the data-generating mechanisms behind relationships in data to derive causal conclusions.
- * One of the key advantages of CI is that it is possible to answer causal questions using **observational data**. That
- is, data which has been passively observed rather than collected from an experiment and, therefore, may contain all
- kinds of bias. In a testing context, we would like to leverage this advantage to test causal relationships in software
- without having to run costly experiments.
- * There are many forms of CI techniques with slightly different aims, but in this framework we focus on graphical CI
- techniques that use directed acyclic graphs to obtain causal estimates. These approaches use a causal DAG to explain
- the causal relationships that exist in data and, based on the structure of this graph, design statistical experiments
- capable of estimating the causal effect of a particular intervention or action, such as taking a drug or changing the
- value of an input variable.
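
The execution steps described in this diff (identify an estimand from the causal DAG, collect data, fit a statistical model, compare the point estimate with the expected causal effect) can be illustrated with a minimal, library-free sketch. It does not use the Causal Testing Framework API. Every name in it (``generate_observational_data``, ``adjusted_effect``, the variables ``z``, ``x``, ``y``) is hypothetical, and the data are synthetic. The assumed causal DAG is ``Z -> X``, ``Z -> Y`` and ``X -> Y``, so adjusting for ``Z`` is assumed to be sufficient to estimate the effect of a unit change in ``X`` on ``Y``.

```python
import random


def simple_fit(xs, ys):
    """Least-squares fit of ys ~ xs; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def residuals(xs, ys):
    """Residuals of ys after regressing on xs (with intercept)."""
    slope, intercept = simple_fit(xs, ys)
    return [y - (slope * x + intercept) for x, y in zip(xs, ys)]


def adjusted_effect(treatment, outcome, confounder):
    """Coefficient of `treatment` in outcome ~ treatment + confounder.

    Uses the Frisch-Waugh-Lovell result: regress both variables on the
    confounder, then regress the outcome residuals on the treatment residuals.
    """
    slope, _ = simple_fit(residuals(confounder, treatment),
                          residuals(confounder, outcome))
    return slope


def generate_observational_data(n, seed=0):
    """Synthetic data for the ASSUMED DAG Z -> X, Z -> Y, X -> Y.

    The true effect of a unit change in X on Y is 3.0 by construction.
    """
    rng = random.Random(seed)
    z = [rng.gauss(0, 1) for _ in range(n)]
    x = [2.0 * zi + rng.gauss(0, 1) for zi in z]
    y = [3.0 * xi + 5.0 * zi + rng.gauss(0, 1) for xi, zi in zip(x, z)]
    return z, x, y


# Step 1 (estimand): from the assumed DAG, adjust for Z.
# Step 2 (data): passively observed, synthetic samples.
z, x, y = generate_observational_data(5000)

# Step 3 (estimation): linear regression with and without adjustment.
naive, _ = simple_fit(x, y)          # biased by the confounder Z
estimate = adjusted_effect(x, y, z)  # adjusted point estimate

# Step 4 (test oracle): compare the estimate with the expected effect.
expected_effect = 3.0
test_passes = abs(estimate - expected_effect) < 0.1

print("naive estimate:", round(naive, 2))
print("adjusted estimate:", round(estimate, 2))
print("causal test passes:", test_passes)
```

The tolerance of ``0.1`` in the final comparison is an arbitrary choice for this sketch. A real test oracle would more plausibly use a confidence interval around the point estimate.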