Commit d7fd1e3

add intro + need for JOSS
1 parent 9bfdd98 commit d7fd1e3

1 file changed: +53 -30 lines changed

paper/paper.md

Lines changed: 53 additions & 30 deletions
@@ -23,26 +23,49 @@ affiliations:
 index: 2
 - name: Department of Biostatistics and Bioinformatics, Rollins School of Public Health, Emory University
 index: 3
-date: 28 September 2020
+date: 06 October 2020
 bibliography: refs.bib
 ---
 
 # Summary
 
-The `txshift` `R` package aims to provide researchers in (bio)statistics,
-epidemiology, health policy, econometrics, and related disciplines with access
-to state-of-the-art statistical methodology for evaluating the causal effects of
-stochastic shift interventions on _continuous-valued_ exposures. `txshift`
-estimates the causal effects of modified treatment policies (or "feasible
-interventions"), which take into account the natural value of an exposure in
-assigning an intervention level. To accommodate use in study designs
-incorporating two-phase sampling (e.g., case-control), the package provides two
-types of modern corrections, both rooted in semiparametric theory, for
-constructing unbiased and efficient estimates, despite the significant
-limitations induced by such designs. Thus, `txshift` makes possible the
-estimation of the causal effects of stochastic interventions in experimental and
-observational study settings subject to real-world design limitations that
-commonly arise in modern scientific practice.
+Statistical causal inference has traditionally focused on effects defined by
+inflexible static interventions, applicable only to binary or categorical
+exposures. The evaluation of such interventions is often plagued by many
+problems, both theoretical (e.g., non-identification) and practical (e.g.,
+positivity violations); however, stochastic interventions provide a promising
+solution to these fundamental issues [@diaz2018stochastic]. The `txshift` `R`
+package provides researchers in (bio)statistics, epidemiology, health policy,
+economics, and related disciplines with access to state-of-the-art statistical
+methodology for evaluating the causal effects of stochastic shift interventions
+on _continuous-valued_ exposures. `txshift` estimates the causal effects of
+modified treatment policies (or "feasible interventions"), which take into
+account the natural value of an exposure in assigning an intervention level. To
+accommodate use in study designs incorporating outcome-dependent two-phase
+sampling (e.g., case-control), the package provides two types of modern
+corrections, both rooted in semiparametric theory, for constructing unbiased and
+efficient estimates, despite the significant limitations induced by such
+designs. Thus, `txshift` makes possible the estimation of the causal effects of
+stochastic interventions in experimental and observational study settings
+subject to real-world design limitations that commonly arise in modern
+scientific practice.
+
+# Statement of Need
+
+Researchers seeking to build upon or apply cutting-edge statistical approaches
+for causal inference often face significant obstacles: such methods are usually
+not accompanied by robust, well-tested, and well-documented software packages.
+Yet coding such methods from scratch is often impractical for the applied
+researcher, as understanding the theoretical underpinnings of these methods
+requires advanced training, severely complicating the assessment and testing of
+bespoke causal inference software. What's more, even when such software tools
+exist, they are usually minimal implementations, providing support only for
+deploying the statistical method in problem settings untouched by the
+complexities of real-world data. The `txshift` `R` package solves this problem
+by providing an open source tool for evaluating the causal effects of flexible,
+stochastic interventions, applicable to categorical or continuous-valued
+exposures, while providing corrections for appropriately handling data generated
+by commonly used but complex two-phase sampling designs.
 
 # Background
 
@@ -74,7 +97,7 @@ causal effects under general two-phase sampling designs.
 Building on these prior works, @hejazi2020efficient outlined a novel approach
 for use in such settings: augmented targeted minimum loss (TML) and one-step
 estimators for the causal effects of stochastic interventions, with guarantees
-of consistency, efficiency, and multiple robustness even in the presence of
+of consistency, efficiency, and multiple robustness despite the presence of
 two-phase sampling. These authors further outlined a technique that summarizes
 the effect of shifting an exposure variable on the outcome of interest via
 a nonparametric working marginal structural model, analogous to a dose-response
@@ -86,20 +109,20 @@ estimators of the causal effects of modified treatment policies that shift the
 observed exposure value up (or down) by an arbitrary scalar $\delta$, which may
 possibly take into account the natural value of the exposure (and, in future
 versions, the covariates). The `R` package includes tools for deploying these
-efficient estimators under two-phase sampling designs, with two types of
-corrections: (1) a reweighting procedure that introduces inverse probability of
-censoring weights directly into relevant loss functions, as discussed in
-@rose2011targeted2sd; as well as (2) an augmented efficient influence function
-estimating equation, studied more thoroughly by @hejazi2020efficient. `txshift`
-integrates with the [`sl3` package](https://github.com/tlverse/sl3)
-[@coyle2020sl3] to allow for ensemble machine learning to be leveraged in the
-estimation of nuisance parameters. What's more, the `txshift` package draws on
-both the `hal9001` [@coyle2020hal9001; @hejazi2020hal9001] and `haldensify`
-[@hejazi2020haldensify] `R` packages to allow each of the efficient estimators
-to be constructed in a manner consistent with the methodological and theoretical
-advances of @hejazi2020efficient, which require fast convergence rates of
-nuisance parameters to their true counterparts for efficiency of the resultant
-estimator.
+efficient estimators under outcome-dependent two-phase sampling designs, with
+two types of corrections: (1) a reweighting procedure that introduces inverse
+probability of censoring weights directly into relevant loss functions, as
+discussed in @rose2011targeted2sd; as well as (2) an augmented efficient
+influence function estimating equation, studied more thoroughly by
+@hejazi2020efficient. `txshift` integrates with the [`sl3`
+package](https://github.com/tlverse/sl3) [@coyle2020sl3] to allow for ensemble
+machine learning to be leveraged in the estimation of nuisance parameters.
+What's more, the `txshift` package draws on both the `hal9001`
+[@coyle2020hal9001; @hejazi2020hal9001] and `haldensify` [@hejazi2020haldensify]
+`R` packages to allow each of the efficient estimators to be constructed in
+a manner consistent with the methodological and theoretical advances of
+@hejazi2020efficient, which require fast convergence rates of nuisance
+parameters to their true counterparts for efficiency of the resultant estimator.
 
 # Availability
 
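For readers of this diff who want a concrete picture of the two ideas the revised text leans on (an additive shift of the exposure by $\delta$, and inverse probability of censoring weights for two-phase sampling), the following is a minimal Python sketch. It is not the `txshift` API and not the TML or one-step estimators described in the paper: it is a plain weighted plug-in average. The function names (`ipcw_weights`, `weighted_shift_mean`), the toy data, and the assumption that the outcome regression and the second-phase sampling probabilities are known are all hypothetical and chosen only for illustration.

```python
from typing import Callable, List, Optional, Sequence


def ipcw_weights(sampled: Sequence[int], sampling_prob: Sequence[float]) -> List[float]:
    """Inverse probability of censoring weights, C / P(C = 1 | V).

    `sampled` holds the second-phase indicator C (1 = exposure measured);
    `sampling_prob` holds the (assumed known) probability of being sampled.
    """
    return [c / p for c, p in zip(sampled, sampling_prob)]


def weighted_shift_mean(
    exposure: Sequence[Optional[float]],
    covariate: Sequence[float],
    weights: Sequence[float],
    outcome_regression: Callable[[float, float], float],
    delta: float,
) -> float:
    """Weighted plug-in mean of the outcome under the shift a -> a + delta.

    `outcome_regression(a, w)` stands in for E[Y | A = a, W = w] and is
    assumed known here; in practice it would have to be estimated.
    Units with weight 0 (not in the second-phase sample) are skipped,
    so their exposure may be missing (None).
    """
    numerator = 0.0
    denominator = 0.0
    for a, w, weight in zip(exposure, covariate, weights):
        denominator += weight
        if weight == 0:
            continue
        # Evaluate the outcome regression at the shifted exposure value.
        numerator += weight * outcome_regression(a + delta, w)
    return numerator / denominator


# Toy data (hypothetical): the second unit was not selected in phase two.
toy_weights = ipcw_weights([1, 0, 1, 1], [0.5, 0.5, 1.0, 0.5])
toy_estimate = weighted_shift_mean(
    exposure=[1.0, None, 2.0, 3.0],
    covariate=[0.0, 1.0, 1.0, 0.0],
    weights=toy_weights,
    outcome_regression=lambda a, w: 2.0 * a + w,
    delta=0.5,
)
print(toy_estimate)
```

The sketch normalizes by the sum of the weights so that unsampled units still count toward the target population. The estimators described in the paper instead bring the weights into the relevant loss functions or use an augmented efficient influence function, which this example does not attempt to reproduce.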