system-under-test that is expected to cause a change to some output(s).

![Causal Testing Workflow](images/workflow.png)
The causal testing framework has three core components:

1. [Causal specification](causal_testing/specification/README.md): Before we can test software, we need to obtain an
   understanding of how it should behave in a particular use-case scenario. In addition, to apply graphical CI
   techniques for testing, we need a causal DAG which depicts the causal relationships amongst inputs and outputs (a
   minimal example DAG is sketched after this list). To collect this information, users must create a
   _causal specification_. This comprises a set of scenarios which place constraints over input variables to capture
   the use-case of interest, a causal DAG corresponding to this scenario, and a series of high-level functional
   requirements that the user wishes to test. In causal testing, these requirements should describe how the model
   should respond to interventions (changes made to the input configuration).

2. [Causal tests](causal_testing/testing/README.md): With a causal specification in hand, we can now go about designing
   a series of test cases that interrogate the causal relationships of interest in the scenario-under-test. Informally,
   a causal test case is a 4-tuple (M, X, Delta, Y), where M is the modelling scenario, X is an input configuration,
   Delta is an intervention which should be applied to X, and Y is the expected _causal effect_ of that intervention on
   some output of interest. Therefore, a causal test case states the expected causal effect (Y) of a particular
   intervention (Delta) made to an input configuration (X). For each scenario, the user should create a suite of causal
   tests. Once a causal test case has been defined, it is executed as follows:
   1. Using the causal DAG, identify an estimand for the effect of the intervention on the output of interest. That is,
      a statistical procedure capable of estimating the causal effect of the intervention on the output.
   2. Collect the data to which the statistical procedure will be applied (see Data collection below).
   3. Apply a statistical model (e.g. linear regression or causal forest) to the data to obtain a point estimate for
      the causal effect. Depending on the estimator used, confidence intervals may also be obtained at a specified
      significance level, e.g. 0.05 corresponds to 95% confidence intervals (optional).
   4. Return the causal test result including a point estimate and 95% confidence intervals, usually quantifying the
      average treatment effect (ATE).
   5. Implement and apply a test oracle to the causal test result, that is, a procedure that determines whether the
      test should pass or fail based on the results. In the simplest case, this takes the form of an assertion which
      compares the point estimate to the expected causal effect specified in the causal test case.

3. [Data collection](causal_testing/data_collection/README.md): Data for the system-under-test can be collected in two
   ways: experimentally or observationally. The former involves executing the system-under-test under controlled
   conditions which, by design, isolate the causal effect of interest (accurate but expensive), while the latter
   involves collecting suitable previous execution data and utilising our causal knowledge to draw causal inferences
   (potentially less accurate but efficient). To collect experimental data, the user must implement a single method
   which runs the system-under-test with a given input configuration. On the other hand, when dealing with
   observational data, we automatically check whether the data is suitable for the identified estimand in two steps.
   First, we confirm that the data contains a column for each variable in the causal DAG. Second, we check for
   [positivity violations](https://www.youtube.com/watch?v=4xc8VkrF98w). If there are positivity violations, we can
   provide instructions for an execution that will fill the gap (future work).

For more information on each of these steps, follow the links to their respective documentation.
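
To make the causal DAG concrete, here is the minimal example referenced in component 1 above. The DOT file content is
illustrative, and loading it with the framework's `CausalDAG` class is an assumption based on its documentation:

``` {python}
from pathlib import Path
from causal_testing.specification.causal_dag import CausalDAG

# A toy DAG: input x affects output y, and z confounds them by affecting both
Path("dag.dot").write_text("digraph G { z -> x; x -> y; z -> y; }")
causal_dag = CausalDAG("dag.dot")  # assumed to parse the DOT file into a causal DAG
```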

## Causal Inference Terminology

Here are some explanations for the causal inference terminology used above.

- Causal inference (CI) is a family of statistical techniques designed to quantify and establish **causal**
  relationships in data. In contrast to purely statistical techniques that are driven by associations in data, CI
  incorporates knowledge about the data-generating mechanisms behind relationships in data to derive causal
  conclusions.
- One of the key advantages of CI is that it is possible to answer causal questions using **observational data**. That
  is, data which has been passively observed rather than collected from an experiment and which, therefore, may
  contain all kinds of bias. In a testing context, we would like to leverage this advantage to test causal
  relationships in software without having to run costly experiments.
- There are many forms of CI techniques with slightly different aims, but in this framework we focus on graphical CI
  techniques that use directed acyclic graphs to obtain causal estimates. These approaches use a causal DAG to explain
  the causal relationships that exist in data and, based on the structure of this graph, design statistical
  experiments capable of estimating the causal effect of a particular intervention or action, such as taking a drug or
  changing the value of an input variable (the short standalone sketch below illustrates this).
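
The following sketch uses plain `numpy` and `statsmodels` (not framework code) to show why adjustment matters: a
confounder z drives both x and y, so the naive association between x and y differs from the true causal effect, which
adjusting for z recovers.

``` {python}
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
z = rng.normal(size=10_000)               # confounder
x = z + rng.normal(size=10_000)           # x is partly caused by z
y = x + 2 * z + rng.normal(size=10_000)   # true causal effect of x on y is 1.0

naive = sm.OLS(y, sm.add_constant(x)).fit()                           # association only
adjusted = sm.OLS(y, sm.add_constant(np.column_stack([x, z]))).fit()  # adjust for z
print(naive.params[1], adjusted.params[1])  # ~2.0 (biased) vs ~1.0 (causal)
```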
73
15
74
16
## Installation
75
17
76
18
See the readthedocs site for [ installation
77
19
instructions] ( https://causal-testing-framework.readthedocs.io/en/latest/installation.html ) .

## Usage

There are currently two ways to use the Causal Testing Framework: through
the [JSON Front End](https://causal-testing-framework.readthedocs.io/en/latest/json_front_end.html) or directly, as
described below.

The causal testing framework is made up of three main components: Specification, Testing, and Data Collection. The
first step is to specify the (part of the) system under test as a modelling `Scenario`. Modelling scenarios specify
the observable variables and any constraints which exist between them. We currently support three types of variable:

- `Input` variables are input parameters to the system.
- `Output` variables are outputs from the system.
- `Meta` variables are not directly observable but are relevant to system testing, e.g. a model may take a `location`
  parameter and expand this out into `average_age` and `household_size` variables "under the hood". These parameters
  can be made explicit by instantiating them as metavariables.

To instantiate a scenario, simply provide a set of variables and an optional set of constraints, e.g.

``` {python}
from causal_testing.specification.variable import Input, Output, Meta
from causal_testing.specification.scenario import Scenario

x = Input("x", int)     # Define an input with name "x" of type int
y = Output("y", float)  # Define an output with name "y" of type float
z = Meta("z", int)      # Define a meta with name "z" of type int

# Define a scenario over the three variables with two constraints
modelling_scenario = Scenario({x, y, z}, {x > z, z < 3})
```

Note that scenario constraints are primarily intended to help specify the region of the input space under test in a
manner consistent with the Category Partition Method. They are not intended to serve as a test oracle. Use constraints
sparingly and with caution to avoid introducing data selection bias. We use Z3 to handle constraints. For help with
this, check out [their documentation](https://ericpony.github.io/z3py-tutorial/guide-examples.htm).
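
For intuition, here is how Z3 itself treats the two constraints from the scenario above. This is a standalone
`z3-solver` sketch, not the framework's internal constraint handling:

``` {python}
from z3 import Int, Solver, sat

x, z = Int("x"), Int("z")
s = Solver()
s.add(x > z, z < 3)      # the scenario's two constraints
assert s.check() == sat  # the constrained input region is non-empty
print(s.model())         # one concrete assignment within the region under test
```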

Having fully specified the modelling scenario, we are now ready to test. Causal tests are
essentially [metamorphic tests](https://en.wikipedia.org/wiki/Metamorphic_testing) which are executed using statistical
causal inference. A causal test expresses the change in a given output that we expect to see when we change a
particular input in a particular way. A causal test case is built from a base test case, which specifies the
relationship between the given output and input and the desired effect. This information is the minimum required to
perform identification.

``` {python}
from causal_testing.testing.base_test_case import BaseTestCase
from causal_testing.testing.causal_test_case import CausalTestCase
from causal_testing.testing.causal_test_outcome import Positive

base_test_case = BaseTestCase(
    treatment_variable=x,  # Set the treatment (input) variable to x
    outcome_variable=y,    # Set the outcome (output) variable to y
    effect="direct")       # Effect type; currently accepted types are "direct" and "total"

causal_test_case = CausalTestCase(
    base_test_case=base_test_case,
    expected_causal_effect=Positive(),  # We expect to see a positive change as a result of this intervention
    control_value=0,    # Set the unmodified (control) value for x to 0
    treatment_value=1,  # Set the modified (treatment) value for x to 1
    estimate_type="ate")
```

Before we can run our test case, we first need data. There are two ways to acquire this: (1) run the model with the
specific input configurations we're interested in, or (2) use data from previous model runs. For a small number of
specific tests where accuracy is critical, the first approach will yield the best results. To do this, you need to
instantiate the `ExperimentalDataCollector` class.
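
A hedged sketch of the experimental route is below. The hook method's name and signature are assumptions based on the
description above (a single method that runs the system-under-test with a given input configuration); check
the [data collection README](causal_testing/data_collection/README.md) for the exact interface, and note that the
class may require further setup such as control and treatment input configurations.

``` {python}
import pandas as pd
from causal_testing.data_collection.data_collector import ExperimentalDataCollector

def my_model(x: int) -> float:  # hypothetical system-under-test
    return 2.5 * x

class MyDataCollector(ExperimentalDataCollector):
    # Assumed hook: execute one run and return it as a row of inputs and outputs
    def run_system_with_input_configuration(self, input_configuration: dict) -> pd.DataFrame:
        output = my_model(**input_configuration)
        return pd.DataFrame([{**input_configuration, "y": output}])
```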

Where there are many test cases, using pre-existing data is likely to be faster. If the program's behaviour can be
estimated statistically, the results should still be reliable as long as there is enough data for the estimator to
work as intended. This will vary depending on the program and the estimator. To use this method, simply instantiate
the `ObservationalDataCollector` class with the modelling scenario and a path to the CSV file containing the runtime
data, e.g.

``` {python}
from causal_testing.data_collection.data_collector import ObservationalDataCollector

data_csv_path = 'results/data.csv'
data_collector = ObservationalDataCollector(modelling_scenario, data_csv_path)
```
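
The test engine below also needs a `causal_specification`, which pairs the modelling scenario with the causal DAG
loaded in the earlier sketch. A minimal assembly, assuming the `CausalSpecification` constructor takes these two
arguments, might look like:

``` {python}
from causal_testing.specification.causal_specification import CausalSpecification

# Assumed constructor: combine the scenario with the causal DAG from the earlier sketch
causal_specification = CausalSpecification(scenario=modelling_scenario, causal_dag=causal_dag)
```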

The actual running of the tests is done using the `CausalTestEngine` class. The setup of the test engine is as follows:

``` {python}
from causal_testing.testing.causal_test_engine import CausalTestEngine

# Instantiate the causal test engine with the causal specification and data collector
causal_test_engine = CausalTestEngine(causal_specification, data_collector)
```

Whether using fresh or pre-existing data, a key aspect of causal inference is estimation. To actually execute a test,
we need an estimator. We currently support two estimators: linear regression and causal forest. The estimators require
the minimal adjustment set from the causal DAG. This and the estimator can be instantiated as per
the [documentation](https://causal-testing-framework.readthedocs.io/en/latest/autoapi/causal_testing/testing/estimators/index.html).

``` {python}
from causal_testing.testing.estimators import LinearRegressionEstimator

# Identify the minimal adjustment set for the base test case from the causal DAG
minimal_adjustment_set = causal_dag.identification(base_test_case)
estimation_model = LinearRegressionEstimator(("x",), 0, 1, minimal_adjustment_set, ("y",),
                                             causal_test_engine.scenario_execution_data_df)
```

We can now execute the test using the estimation model. This returns a causal test result, from which we can extract
various information. Here, we simply assert that the observed result is (on average) what we expect to see.

``` {python}
causal_test_result = causal_test_engine.execute_test(
    estimator=estimation_model,
    causal_test_case=causal_test_case,
    estimate_type="ate")

test_passes = causal_test_case.expected_causal_effect.apply(causal_test_result)
assert test_passes, "Expected to see a positive change in y."
```

Multiple tests can be executed at once using the test
engine's [test_suite](https://causal-testing-framework.readthedocs.io/en/test_suite.html) feature.