@@ -23,26 +23,49 @@ affiliations:
2323 index: 2
2424 - name: Department of Biostatistics and Bioinformatics, Rollins School of Public Health, Emory University
2525 index: 3
26- date: 28 September 2020
26+ date: 06 October 2020
2727bibliography: refs.bib
2828---
2929
3030# Summary
3131
32- The `txshift` `R` package aims to provide researchers in (bio)statistics,
33- epidemiology, health policy, econometrics, and related disciplines with access
34- to state-of-the-art statistical methodology for evaluating the causal effects of
35- stochastic shift interventions on _continuous-valued_ exposures. `txshift`
36- estimates the causal effects of modified treatment policies (or "feasible
37- interventions"), which take into account the natural value of an exposure in
38- assigning an intervention level. To accommodate use in study designs
39- incorporating two-phase sampling (e.g., case-control), the package provides two
40- types of modern corrections, both rooted in semiparametric theory, for
41- constructing unbiased and efficient estimates, despite the significant
42- limitations induced by such designs. Thus, `txshift` makes possible the
43- estimation of the causal effects of stochastic interventions in experimental and
44- observational study settings subject to real-world design limitations that
45- commonly arise in modern scientific practice.
32+ Statistical causal inference has traditionally focused on effects defined by
33+ inflexible static interventions, applicable only to binary or categorical
34+ exposures. The evaluation of such interventions is often plagued by many
35+ problems, both theoretical (e.g., non-identification) and practical (e.g.,
36+ positivity violations); however, stochastic interventions provide a promising
37+ solution to these fundamental issues [@diaz2018stochastic]. The `txshift` `R`
38+ package provides researchers in (bio)statistics, epidemiology, health policy,
39+ economics, and related disciplines with access to state-of-the-art statistical
40+ methodology for evaluating the causal effects of stochastic shift interventions
41+ on _continuous-valued_ exposures. `txshift` estimates the causal effects of
42+ modified treatment policies (or "feasible interventions"), which take into
43+ account the natural value of an exposure in assigning an intervention level. To
44+ accommodate use in study designs incorporating outcome-dependent two-phase
45+ sampling (e.g., case-control), the package provides two types of modern
46+ corrections, both rooted in semiparametric theory, for constructing unbiased and
47+ efficient estimates, despite the significant limitations induced by such
48+ designs. Thus, `txshift` makes possible the estimation of the causal effects of
49+ stochastic interventions in experimental and observational study settings
50+ subject to real-world design limitations that commonly arise in modern
51+ scientific practice.
52+
53+ # Statement of Need
54+
55+ Researchers seeking to build upon or apply cutting-edge statistical approaches
56+ for causal inference often face significant obstacles: such methods are usually
57+ not accompanied by robust, well-tested, and well-documented software packages.
58+ Yet coding such methods from scratch is often impractical for the applied
59+ researcher, as understanding the theoretical underpinnings of these methods
60+ requires advanced training, severely complicating the assessment and testing of
61+ bespoke causal inference software. What's more, even when such software tools
62+ exist, they are usually minimal implementations, providing support only for
63+ deploying the statistical method in problem settings untouched by the
64+ complexities of real-world data. The `txshift` `R` package solves this problem
65+ by providing an open source tool for evaluating the causal effects of flexible,
66+ stochastic interventions, applicable to categorical or continuous-valued
67+ exposures, while providing corrections for appropriately handling data generated
68+ by commonly used but complex two-phase sampling designs.
4669
4770# Background
4871
@@ -74,7 +97,7 @@ causal effects under general two-phase sampling designs.
7497Building on these prior works, @hejazi2020efficient outlined a novel approach
7598for use in such settings: augmented targeted minimum loss (TML) and one-step
7699estimators for the causal effects of stochastic interventions, with guarantees
77- of consistency, efficiency, and multiple robustness even in the presence of
100+ of consistency, efficiency, and multiple robustness despite the presence of
78101two-phase sampling. These authors further outlined a technique that summarizes
79102the effect of shifting an exposure variable on the outcome of interest via
80103a nonparametric working marginal structural model, analogous to a dose-response
@@ -86,20 +109,20 @@ estimators of the causal effects of modified treatment policies that shift the
86109observed exposure value up (or down) by an arbitrary scalar $\delta$, which may
87110possibly take into account the natural value of the exposure (and, in future
88111versions, the covariates). The `R` package includes tools for deploying these
89- efficient estimators under two-phase sampling designs, with two types of
90- corrections: (1) a reweighting procedure that introduces inverse probability of
91- censoring weights directly into relevant loss functions, as discussed in
92- @rose2011targeted2sd; as well as (2) an augmented efficient influence function
93- estimating equation, studied more thoroughly by @hejazi2020efficient. `txshift`
94- integrates with the [`sl3` package](https://github.com/tlverse/sl3)
95- [@coyle2020sl3] to allow for ensemble machine learning to be leveraged in the
96- estimation of nuisance parameters. What's more, the `txshift` package draws on
97- both the `hal9001` [@coyle2020hal9001; @hejazi2020hal9001] and `haldensify`
98- [@hejazi2020haldensify] `R` packages to allow each of the efficient estimators
99- to be constructed in a manner consistent with the methodological and theoretical
100- advances of @hejazi2020efficient, which require fast convergence rates of
101- nuisance parameters to their true counterparts for efficiency of the resultant
102- estimator.
112+ efficient estimators under outcome-dependent two-phase sampling designs, with
113+ two types of corrections: (1) a reweighting procedure that introduces inverse
114+ probability of censoring weights directly into relevant loss functions, as
115+ discussed in @rose2011targeted2sd; as well as (2) an augmented efficient
116+ influence function estimating equation, studied more thoroughly by
117+ @hejazi2020efficient. `txshift` integrates with the [`sl3`
118+ package](https://github.com/tlverse/sl3) [@coyle2020sl3] to allow for ensemble
119+ machine learning to be leveraged in the estimation of nuisance parameters.
120+ What's more, the `txshift` package draws on both the `hal9001`
121+ [@coyle2020hal9001; @hejazi2020hal9001] and `haldensify` [@hejazi2020haldensify]
122+ `R` packages to allow each of the efficient estimators to be constructed in
123+ a manner consistent with the methodological and theoretical advances of
124+ @hejazi2020efficient, which require fast convergence rates of nuisance
125+ parameters to their true counterparts for efficiency of the resultant estimator.
103126
104127# Availability
105128