How should we deal with heterogeneous treatment effects? #26

AndrewC19 (Maintainer) · 2021-10-28T16:01:00Z

Background:
Currently, CauseCumber is designed to test how computational models respond to specific interventions in modelling scenarios. We use this approach to ask and answer causal questions about the model-under-test.

For example, we might want to answer the question: Does the Pfizer vaccine reduce the cumulative number of infections in Covasim? To answer this question, we design a scenario in which we test the causal effect of the Pfizer vaccine. This involves executing the model with and without the Pfizer vaccine as an intervention and observing the change in the cumulative number of infections. Since the model is non-deterministic, we should repeat both executions multiple times and compare the averages.
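As a rough illustration of "run with and without the intervention, repeat, and average", here is a minimal sketch. It does not call Covasim: `run_model` is a hypothetical stand-in for one stochastic model execution, and the baseline of 10,000 infections and the reduction of 3,000 are made-up numbers chosen only so the example runs.

```python
import random
from statistics import mean


def run_model(vaccinate: bool, rng: random.Random) -> float:
    """Hypothetical stand-in for one stochastic model execution.

    Returns the cumulative number of infections for that run.
    The numbers are invented for illustration only.
    """
    baseline = rng.gauss(10_000, 500)      # run-to-run noise
    reduction = 3_000 if vaccinate else 0  # assumed vaccine effect
    return max(0.0, baseline - reduction)


def estimate_ate(n_runs: int, seed: int = 0) -> float:
    """Average outcome with the intervention minus average outcome without it."""
    rng = random.Random(seed)
    treated = [run_model(True, rng) for _ in range(n_runs)]
    control = [run_model(False, rng) for _ in range(n_runs)]
    return mean(treated) - mean(control)


ate = estimate_ate(n_runs=200)
print(f"Estimated ATE on cumulative infections: {ate:.0f}")
```

A negative estimate here means the intervention reduced cumulative infections on average; how many repetitions are enough depends on the model's variance.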

In this example, our question is not very specific. We could simulate this scenario in millions of different ways, each of which would have its own valid outcome. For example, we might change the location, the population size, and so on. Some of these inputs are effect modifiers, which means we expect the causal effect of the vaccine to differ across the strata of these variables (e.g. the vaccine might be more effective in Japan than in the UK). This results in heterogeneous treatment effects.

Heterogeneity of treatment effect (HTE) is the nonrandom, explainable variability in the direction and magnitude of treatment effects for individuals within a population.

Problem:
Since we can have heterogeneous treatment effects, we need to be careful to specify what quantity we are actually interested in. Are we interested in the effect of the vaccine in Japan or the UK? Or do we not care about the location?

In some cases, we might want to know the treatment effect of the vaccine in each stratum separately. In this case, we should report the conditional average treatment effect (CATE), which is the ATE within each stratum of the specified effect modifiers (e.g. the ATE in Japan and the ATE in the UK, reported separately).
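To make the per-stratum idea concrete, here is a small sketch that estimates a CATE for each location. Everything in it is an assumption for illustration: `run_model` is a hypothetical toy model rather than Covasim, and the per-location effects in `EFFECT` are invented.

```python
import random
from statistics import mean

# Invented per-location vaccine effects on cumulative infections.
EFFECT = {"Japan": -4000.0, "UK": -2000.0}


def run_model(location: str, vaccinate: bool, rng: random.Random) -> float:
    """Hypothetical toy model: location modifies the vaccine's effect."""
    baseline = rng.gauss(10_000, 500)
    return baseline + (EFFECT[location] if vaccinate else 0.0)


def estimate_cate(locations, n_runs: int, seed: int = 0) -> dict:
    """Estimate one ATE per stratum of the effect modifier (location)."""
    rng = random.Random(seed)
    cate = {}
    for loc in locations:
        treated = mean(run_model(loc, True, rng) for _ in range(n_runs))
        control = mean(run_model(loc, False, rng) for _ in range(n_runs))
        cate[loc] = treated - control
    return cate


cate = estimate_cate(["Japan", "UK"], n_runs=200)
for loc, effect in cate.items():
    print(f"CATE in {loc}: {effect:.0f}")
```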

In other cases, we might not be interested in the CATE and only care about the ATE of a particular population instead. For example, we might care about the effect of the vaccine in countries in Europe or the effect of the vaccine in densely populated areas. This requires us to define the population of model executions that we care about.

Currently, we specify the input configuration for a particular feature file in the Background template. This assigns concrete values to certain parameters that are important to the feature we are testing. However, this is too restrictive as it says each instance of this feature must have this exact set of values. Therefore, if we wanted to use observational data to infer the outcome of a test case, it would have to contain exactly these values. Also, if we wanted to look at the ATE of the vaccine in European countries, for example, we would need a way to say that our Scenarios can use any country in Europe rather than a specific one.

Solution:
I think we should find a way to use the Background template to describe, in a more open way, the population of model executions that a feature concerns, i.e. to map each input to a probability distribution, a range, or a concrete value.
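One possible shape for such a mapping is sketched below. None of these names exist in CauseCumber: `Fixed`, `Range`, `OneOf`, `sample`, and `admits` are hypothetical, and sampling a `Range` uniformly is an assumption (any distribution could be substituted). The same specification can be used both to draw concrete configurations for execution and to check whether an existing configuration belongs to the population.

```python
import random
from dataclasses import dataclass
from typing import Any, Dict, Tuple, Union


@dataclass(frozen=True)
class Fixed:
    """A single concrete value (what the Background supports today)."""
    value: Any


@dataclass(frozen=True)
class Range:
    """A closed numeric interval; sampled uniformly in this sketch."""
    low: float
    high: float


@dataclass(frozen=True)
class OneOf:
    """A finite set of admissible values, e.g. countries in Europe."""
    options: Tuple[Any, ...]


Spec = Union[Fixed, Range, OneOf]


def sample(spec: Spec, rng: random.Random) -> Any:
    """Draw one concrete value from an input specification."""
    if isinstance(spec, Fixed):
        return spec.value
    if isinstance(spec, Range):
        return rng.uniform(spec.low, spec.high)
    return rng.choice(spec.options)


def admits(spec: Spec, value: Any) -> bool:
    """Check whether a concrete value lies inside the specification."""
    if isinstance(spec, Fixed):
        return value == spec.value
    if isinstance(spec, Range):
        return spec.low <= value <= spec.high
    return value in spec.options


# Hypothetical population of model executions for one feature.
population: Dict[str, Spec] = {
    "location": OneOf(("UK", "France", "Germany")),
    "pop_size": Range(10_000, 100_000),
    "n_days": Fixed(180),
}

rng = random.Random(0)
config = {name: sample(spec, rng) for name, spec in population.items()}
in_population = all(admits(population[k], v) for k, v in config.items())
print(config, in_population)
```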

Also, to allow users to compute the ATE and the CATE, we should allow them to list effect modifiers. If they list effect modifiers, we can report the ATE within each stratum of the effect modifier (e.g. the ATE for each country in Europe) as well as the overall ATE (the ATE across Europe). This will require further thought, though.

jmafoster1 (Maintainer) · 2021-11-09T08:42:30Z

I've already been experimenting with effect modifiers and I think I've got something working. The problem is that you need to implement custom binning because dowhy, despite claiming to, is completely unable to do this automatically in a meaningful way. As for specifying distributions, ranges, etc., I think this should be relatively straightforward. As we discussed last week, we can then use this specification as a filter for the observational data, potentially removing the need to bin the data according to effect modifiers before calculating the ATE. I'm not sure whether it's better to bin and calculate the CATE or to filter and calculate the ATE. I guess the results should be the same...
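A small sketch comparing the two options on synthetic data may help here. It uses plain Python rather than dowhy, and the data, effect sizes, and the set `europe` are all invented. Under one specific assumption, namely that every stratum contains the same number of treated and control runs, filtering and taking a single difference in means gives the same number as binning, computing a CATE per stratum, and combining the CATEs weighted by stratum size. With observational data where the treated fraction differs between strata, the pooled difference in means would generally not match the weighted CATEs, so the equivalence should not be taken for granted.

```python
import random
from collections import defaultdict
from statistics import mean

rng = random.Random(0)

# Invented per-country effects and (balanced) run counts per arm.
effects = {"UK": -2000.0, "France": -2500.0, "Japan": -4000.0}
runs_per_arm = {"UK": 50, "France": 30, "Japan": 40}

# Each record is (country, treated, outcome).
records = []
for country, n in runs_per_arm.items():
    for treated in (True, False):
        for _ in range(n):
            outcome = rng.gauss(10_000, 500)
            if treated:
                outcome += effects[country]
            records.append((country, treated, outcome))


def diff_in_means(rows) -> float:
    """Mean treated outcome minus mean control outcome."""
    treated = [y for _, t, y in rows if t]
    control = [y for _, t, y in rows if not t]
    return mean(treated) - mean(control)


europe = {"UK", "France"}  # the population of interest in this sketch

# Option A: filter to the population, then compute a single ATE.
filtered = [r for r in records if r[0] in europe]
ate_filtered = diff_in_means(filtered)

# Option B: bin by effect modifier, compute a CATE per stratum,
# then combine the CATEs weighted by each stratum's share of the data.
strata = defaultdict(list)
for r in filtered:
    strata[r[0]].append(r)
cate = {c: diff_in_means(rows) for c, rows in strata.items()}
ate_from_cate = sum(len(rows) / len(filtered) * cate[c] for c, rows in strata.items())

print(f"filter then ATE:    {ate_filtered:.2f}")
print(f"bin, CATE, combine: {ate_from_cate:.2f}")
```

The binning route has the side benefit of exposing the per-stratum effects, which the filtered ATE hides.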
