Commit a863252

fix typo add full stops
Signed-off-by: Nathaniel <[email protected]>
1 parent 9fc57a4 commit a863252

1 file changed: +3 -3 lines changed
docs/source/notebooks/inv_prop_pymc.ipynb

Lines changed: 3 additions & 3 deletions
@@ -22,9 +22,9 @@
 "\n",
 "In this notebook we will briefly demonstrate how to use propensity score weighting schemes to recover treatment effects in the analysis of observational data. We will first showcase the method with a simulated data example drawn from Lucy D’Agostino McGowan's [excellent blog](https://livefreeordichotomize.com/posts/2019-01-17-understanding-propensity-score-weighting/) on inverse propensity score weighting. Then we shall apply the same techniques to NHEFS data set discussed in Miguel Hernan and Robins' _Causal Inference: What if_ [book](https://www.hsph.harvard.edu/miguel-hernan/causal-inference-book/). This data set measures the effect of quitting smoking between the period of 1971 and 1982. At each of these two points in time the participant's weight was recorded, and we seek to estimate the effect of quitting in the intervening years on the weight recorded in 1982.\n",
 "\n",
-"We will use inverse propensity score weighting techniques to estimate the average treatment effect. There are a range of weighting techniques available: we have implemented `raw`, `robust`, `doubly robust` and `overlap` weighting schemes all of which aim to estimate the average treatment effect. The idea of a propensity score (very broadly) is to derive a one-number summary of individual's probability of adopting a particular treatment. This score is typically calculated by fitting a predictive logit model on all an individual's observed attributes predicting whether or not the those attributes drive the indivudal towards the treatment status. In the case of the NHEFS data we want a model to measure the propensity for each individual to quit smoking. \n",
+"We will use inverse propensity score weighting techniques to estimate the average treatment effect. There are a range of weighting techniques available: we have implemented `raw`, `robust`, `doubly robust` and `overlap` weighting schemes all of which aim to estimate the average treatment effect. The idea of a propensity score (very broadly) is to derive a one-number summary of individual's probability of adopting a particular treatment. This score is typically calculated by fitting a predictive logit model on all an individual's observed attributes predicting whether or not the those attributes drive the indivdual towards the treatment status. In the case of the NHEFS data we want a model to measure the propensity for each individual to quit smoking. \n",
 "\n",
-"The reason we want this propensity score is because with observed data we often have a kind of imbalance in our covariate profiles across treatment groups. Meaning our data might be unrepresentative in some crucial aspect. This prevents us cleanly reading off treatment effects by looking at simple group differences. These \"imbalances\" can be driven by selection effects into the treatment status so that if we want to estimate the average treatment effect in the population as a whole we need to be wary that our sample might not give us generalisable insight into the treatment differences. Using propensity scores as a measure of the prevalance to adopt the treatment status in the population, we can cleverly weight the observed data to privilege observations of \"rare\" occurence in each group. For example, if smoking is the treatment status and regular running is generally not common among the group of smokers, then on the occasion we see a smoker marathon runner we should heavily weight their outcome measure to overcome their low prevalence in the treated group but real presence in the unmeasured population. Inverse propensity weighting tries to define weighting schemes are inversely proportional to an individual's propensity score so as to better recover an estimate which mitigates (somewhat) the risk of selection effect bias. For more details and illustration of these themes see the PyMC examples [write up](https://www.pymc.io/projects/examples/en/latest/causal_inference/bayesian_nonparametric_causal.html) on Non-Parametric Bayesian methods {cite:p}`forde2024nonparam`\n"
+"The reason we want this propensity score is because with observed data we often have a kind of imbalance in our covariate profiles across treatment groups. Meaning our data might be unrepresentative in some crucial aspect. This prevents us cleanly reading off treatment effects by looking at simple group differences. These \"imbalances\" can be driven by selection effects into the treatment status so that if we want to estimate the average treatment effect in the population as a whole we need to be wary that our sample might not give us generalisable insight into the treatment differences. Using propensity scores as a measure of the prevalance to adopt the treatment status in the population, we can cleverly weight the observed data to privilege observations of \"rare\" occurence in each group. For example, if smoking is the treatment status and regular running is generally not common among the group of smokers, then on the occasion we see a smoker marathon runner we should heavily weight their outcome measure to overcome their low prevalence in the treated group but real presence in the unmeasured population. Inverse propensity weighting tries to define weighting schemes are inversely proportional to an individual's propensity score so as to better recover an estimate which mitigates (somewhat) the risk of selection effect bias. For more details and illustration of these themes see the PyMC examples [write up](https://www.pymc.io/projects/examples/en/latest/causal_inference/bayesian_nonparametric_causal.html) on Non-Parametric Bayesian methods. {cite:p}`forde2024nonparam`\n"
 ]
 },
 {
@@ -705,7 +705,7 @@
 "source": [
 "## Evaluating Balance\n",
 "\n",
-"One of the main criteria for success with the estimation of propensity scores is to check how balanced the covariate profiles of our data are across the treatment status under different re-weighting schemes. A good balance of the covariate values across the treatment status is suggestive of the requirement that assignment to a treatment status should be as good as random when conditional on the covariate profile $X$. That is to say, the condition of __strong ignorability__ holds when the treatment status $T$ is independent of the propensity $p(X)$ conditional on $X$\n",
+"One of the main criteria for success with the estimation of propensity scores is to check how balanced the covariate profiles of our data are across the treatment status under different re-weighting schemes. A good balance of the covariate values across the treatment status is suggestive of the requirement that assignment to a treatment status should be as good as random when conditional on the covariate profile $X$. That is to say, the condition of __strong ignorability__ holds when the treatment status $T$ is independent of the propensity $p(X)$ conditional on $X$.\n",
 "\n",
 "One visual way to analyse this balance is to look at the empirical cumulative distribution function for each covariate conditional on the different realisation of propensity scores under the different weighting schemes. We can contrast the difference in the shapes of the ECDFs using the following function."
 ]
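
Reviewer note: the weighting idea described in the changed notebook text can be illustrated with a minimal sketch. This is not the notebook's implementation. The function name `ipw_ate`, the simulated data, and the choice of a normalised weighting are all assumptions made for illustration, and the true propensity is used directly instead of being estimated with a logit model.

```python
import numpy as np


def ipw_ate(y, t, p):
    """Normalised inverse propensity weighted estimate of the average treatment effect.

    y: outcomes, t: binary treatment indicator, p: propensity scores P(T=1 | X).
    Hypothetical helper for illustration only.
    """
    y, t, p = map(np.asarray, (y, t, p))
    w_treated = t / p              # treated units weighted by 1 / p
    w_control = (1 - t) / (1 - p)  # control units weighted by 1 / (1 - p)
    mean_treated = np.sum(w_treated * y) / np.sum(w_treated)
    mean_control = np.sum(w_control * y) / np.sum(w_control)
    return mean_treated - mean_control


# Simulated data (assumption): one confounder x drives both treatment and outcome.
rng = np.random.default_rng(0)
n = 20_000
x = rng.normal(size=n)
p_true = 1 / (1 + np.exp(-x))        # true propensity, assumed known here
t = rng.binomial(1, p_true)
y = 2.0 * t + 3.0 * x + rng.normal(size=n)  # true treatment effect is 2.0

naive = y[t == 1].mean() - y[t == 0].mean()  # biased by selection on x
weighted = ipw_ate(y, t, p_true)             # reweighted to mitigate that bias

print(f"naive difference: {naive:.2f}, weighted estimate: {weighted:.2f}")
```

In this simulation the simple group difference overstates the effect because units with high `x` are more likely to be treated, while the weighted estimate lands much closer to the simulated effect of 2.0.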
