Skip to content

Commit ac897b0

Browse files
committed
intmira
1 parent c2b7230 commit ac897b0

14 files changed

+2371
-0
lines changed

Data/missingdata/Flegal2016.RData

158 KB
Binary file not shown.
Lines changed: 19 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,19 @@
1+
{
2+
"hash": "3ba318c24c7cac1879fc47ccec89e8cf",
3+
"result": {
4+
"engine": "knitr",
5+
"markdown": "## Testing Interactions {.unnumbered}\n\nWe will use the article. We will use the following article by [Flegal et al. (2016)](https://jamanetwork.com/journals/jama/article-abstract/2526639). We will use the same data as [before](surveydataEsolution.html).\n\n\n**Objective**: To statistically test whether an interaction term contributes significantly to a model when using complex survey data (NHANES) that has been imputed using `mice`.\n\nThe Scenario:\n\nOutcome ($Y$): BMI (`BMXBMI`) \nPredictor ($X$): Age (`RIDAGEYR`) \nModerator ($Z$): Smoking Status (`SMQ020`)\n\n**Question**: Does the effect of Age on BMI depend on whether the person smokes?\n\n\n::: {.cell}\n\n```{.r .cell-code}\nload(\"Data/missingdata/Flegal2016.RData\")\nls()\n#> [1] \"dat.full\"\nstr(dat.full)\n#> 'data.frame':\t10175 obs. of 12 variables:\n#> $ SEQN : 'labelled' int 73557 73558 73559 73560 73561 73562 73563 73564 73565 73566 ...\n#> ..- attr(*, \"label\")= chr \"Respondent sequence number\"\n#> $ RIDAGEYR: 'labelled' int 69 54 72 9 73 56 0 61 42 56 ...\n#> ..- attr(*, \"label\")= chr \"Age in years at screening\"\n#> $ RIAGENDR: Factor w/ 2 levels \"Male\",\"Female\": 1 1 1 1 2 1 1 2 1 2 ...\n#> $ DMDEDUC2: Factor w/ 7 levels \"Less than 9th grade\",..: 3 3 4 NA 5 4 NA 5 3 3 ...\n#> $ RIDRETH3: Factor w/ 6 levels \"Mexican American\",..: 4 3 3 3 3 1 3 3 2 3 ...\n#> $ RIDEXPRG: Factor w/ 3 levels \"Yes, positive lab pregnancy test\",..: NA NA NA NA NA NA NA NA NA NA ...\n#> $ WTINT2YR: 'labelled' num 13281 23682 57215 55201 63710 ...\n#> ..- attr(*, \"label\")= chr \"Full sample 2 year interview weight\"\n#> $ SDMVPSU : 'labelled' int 1 1 1 2 2 1 1 1 2 1 ...\n#> ..- attr(*, \"label\")= chr \"Masked variance pseudo-PSU\"\n#> $ SDMVSTRA: 'labelled' int 112 108 109 109 116 111 105 114 106 112 ...\n#> ..- attr(*, \"label\")= chr \"Masked variance pseudo-stratum\"\n#> $ BMXBMI : 'labelled' num 26.7 28.6 28.9 17.1 19.7 41.7 NA 35.7 NA 26.5 ...\n#> ..- attr(*, \"label\")= chr \"Body Mass Index (kg/m**2)\"\n#> $ SMQ020 : Factor w/ 3 levels \"Yes\",\"No\",\"Don't know\": 1 1 1 NA 2 1 NA 2 1 1 ...\n#> $ SMQ040 : Factor w/ 3 levels \"Every day\",\"Some days\",..: 3 2 3 NA NA 3 NA NA 3 1 ...\n```\n:::\n\n\n# Step 1: Imputation\n\nBefore modeling, we must handle missing data to avoid bias.\n\n\n::: {.cell}\n\n```{.r .cell-code}\nlibrary(mice) \n#> \n#> Attaching package: 'mice'\n#> The following object is masked from 'package:stats':\n#> \n#> filter\n#> The following objects are masked from 'package:base':\n#> \n#> cbind, rbind\nlibrary(survey)\n#> Loading required package: grid\n#> Loading required package: Matrix\n#> Loading required package: survival\n#> \n#> Attaching package: 'survey'\n#> The following object is masked from 'package:graphics':\n#> \n#> dotchart\n# 1. Impute the data\n# Recommendation: Set m = 20 for interaction tests to stabilize variance.\nimp <- mice(dat.full, m = 20, maxit = 5)\n#> \n#> iter imp variable\n#> 1 1 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 1 2 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 1 3 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 1 4 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 1 5 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 1 6 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 1 7 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 1 8 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 1 9 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 1 10 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 1 11 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 1 12 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 1 13 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 1 14 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 1 15 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 1 16 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 1 17 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 1 18 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 1 19 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 1 20 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 2 1 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 2 2 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 2 3 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 2 4 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 2 5 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 2 6 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 2 7 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 2 8 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 2 9 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 2 10 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 2 11 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 2 12 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 2 13 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 2 14 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 2 15 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 2 16 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 2 17 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 2 18 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 2 19 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 2 20 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 3 1 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 3 2 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 3 3 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 3 4 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 3 5 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 3 6 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 3 7 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 3 8 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 3 9 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 3 10 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 3 11 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 3 12 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 3 13 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 3 14 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 3 15 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 3 16 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 3 17 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 3 18 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 3 19 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 3 20 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 4 1 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 4 2 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 4 3 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 4 4 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 4 5 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 4 6 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 4 7 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 4 8 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 4 9 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 4 10 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 4 11 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 4 12 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 4 13 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 4 14 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 4 15 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 4 16 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 4 17 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 4 18 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 4 19 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 4 20 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 5 1 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 5 2 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 5 3 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 5 4 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 5 5 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 5 6 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 5 7 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 5 8 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 5 9 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 5 10 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 5 11 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 5 12 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 5 13 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 5 14 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 5 15 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 5 16 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 5 17 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 5 18 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 5 19 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> 5 20 DMDEDUC2 RIDEXPRG BMXBMI SMQ020 SMQ040\n#> Warning: Number of logged events: 200\n\n# 2. Extract the data into a list\n# We cannot use 'with()' easily with svydesign, so we create a list of\n# completed dataframes to loop over manually.\nimp_list <- complete(imp, \"all\")\n```\n:::\n\n\n# Step 2: Fitting the Nested Models\n\nTo test an interaction, we must compare two models:\n\nThe Full Model: Includes the interaction term (Age * Smoking). \nThe Reduced Model: Includes only main effects (Age + Smoking).\n\nWe use lapply to fit these models across all 20 imputed datasets.\n\n\n::: {.cell}\n\n```{.r .cell-code}\n# --- Helper: Calculate Design Degrees of Freedom (dfcom) ---\n# We need this to prevent NaN errors in the final test.\n# It is usually (Number of PSUs - Number of Strata).\n# We calculate it once using the first imputed dataset.\ntemp_des <- svydesign(id = ~SDMVPSU, \n strata = ~SDMVSTRA, \n weights = ~WTINT2YR, \n data = imp_list[[1]], \n nest = TRUE)\ndf_degrees_freedom <- degf(temp_des)\n# May need to do subsetting if the original data was reduced.\n```\n:::\n\n\n\nModel A: The Full Model (With Interaction) \n\n\n::: {.cell}\n\n```{.r .cell-code}\nfit_full_list <- lapply(imp_list, function(df) {\n # 1. Define survey design for this iteration\n des <- svydesign(id = ~SDMVPSU, \n strata = ~SDMVSTRA, \n weights = ~WTINT2YR, \n data = df, \n nest = TRUE)\n \n # 2. Fit the GLM\n svyglm(BMXBMI ~ RIDAGEYR * SMQ020, design = des)\n})\n```\n:::\n\n\nModel B: The Reduced Model (No Interaction)\n\n\n::: {.cell}\n\n```{.r .cell-code}\nfit_reduced_list <- lapply(imp_list, function(df) {\n des <- svydesign(id = ~SDMVPSU, \n strata = ~SDMVSTRA, \n weights = ~WTINT2YR, \n data = df, nest = TRUE)\n \n svyglm(BMXBMI ~ RIDAGEYR + SMQ020, design = des)\n})\n```\n:::\n\n\n# Step 3: Pooling and Testing\n\nWe convert our lists of models into mira objects (Multiply Imputed Repeated Analyses) so the mice package recognizes them. Then, we use the Wald Test ($D_1$).\n\n\n::: {.cell}\n\n```{.r .cell-code}\n# 1. Convert to 'mira' class\nfit_full <- list(analyses = fit_full_list)\nclass(fit_full) <- \"mira\"\n\nfit_reduced <- list(analyses = fit_reduced_list)\nclass(fit_reduced) <- \"mira\"\n\n# 2. Run the D1 Statistic (Multivariate Wald Test)\n# We pass the dfcom we calculated earlier to ensure stability\ntest_result <- D1(fit_full, \n fit_reduced, \n dfcom = df_degrees_freedom)\n```\n:::\n\n\n\n# Step 4: Detailed Breakdown of the Output\n\nWhen you run `print(test_result)`, you get a row of statistics. \n\n\n::: {.cell}\n\n```{.r .cell-code}\nnames(test_result)\n#> [1] \"call\" \"result\" \"formulas\" \"m\" \"method\" \"use\" \"dfcom\"\ntest_result$dfcom\n#> [1] 15\nlibrary(kableExtra)\nkable(test_result$result)\n```\n\n::: {.cell-output-display}\n\n\n| F.value| df1| df2| P(>F)| RIV|\n|---------:|---:|--------:|---------:|--------:|\n| 0.5023733| 2| 2.625538| 0.6535277| 2.959959|\n\n\n:::\n:::\n\n\nThe Definitions\n\nMetric | What it is | How to Interpret \n------ | ----------- | ----------------- \ntest statistic | The F-value of the Wald test. | High values (> 4) suggest the models are significantly different. Values near 1 suggest the interaction adds nothing. \ndf1 | Numerator Degrees of Freedom. | Number of parameters added. \ndfcom | Complete Data Degrees of Freedom. | From survey design. Low dfcom (< 30) means limited power. \nriv | Relative Increase in Variance. | Low (< 0.5) means stable imputations. High (> 1) means large uncertainty. \ndf2 | Denominator Degrees of Freedom. | Effective sample size after MI and design. \np.value | Significance Level. | < 0.05 indicates significant interaction. \n\n# Step 5: How to Report This\n\n## Scenario A: The Interaction Significant (p < 0.05)\n\n\"To test whether the association between age and BMI was modified by smoking status, we compared nested survey-weighted regression models using the multiparameter Wald test ($D_1$) on 20 imputed datasets. We found a significant interaction between age and smoking ($F(2, 350.5) = 4.52, p = 0.01$). Consequently, we present stratified results for smokers and non-smokers.\"\n\n## Scenario B: The Interaction NOT Significant (p > 0.05)\n\n\"We tested for effect modification by smoking status using the $D_1$ Wald statistic. The interaction term was not statistically significant ($F(2, 105.2) = 0.88, p = 0.48$). Therefore, the interaction term was removed, and we proceeded with the main effects model adjusted for smoking status.\"\n",
6+
"supporting": [],
7+
"filters": [
8+
"rmarkdown/pagebreak.lua"
9+
],
10+
"includes": {
11+
"include-in-header": [
12+
"<link href=\"site_libs/pagedtable-1.1/css/pagedtable.css\" rel=\"stylesheet\" />\n<script src=\"site_libs/pagedtable-1.1/js/pagedtable.js\"></script>\n"
13+
]
14+
},
15+
"engineDependencies": {},
16+
"preserve": {},
17+
"postProcess": true
18+
}
19+
}

_quarto.yml

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -146,6 +146,7 @@ book:
146146
- missingdata2.qmd
147147
- missingdata3.qmd
148148
- missingdata4.qmd
149+
- missingdata4i.qmd
149150
- missingdata5.qmd
150151
- missingdata6.qmd
151152
- missingdata7.qmd

0 commit comments

Comments
 (0)