Commit 1c28cee

version 0.4.0
1 parent d4f2b8e commit 1c28cee

9 files changed: +73 -52 lines changed

DESCRIPTION

Lines changed: 7 additions & 3 deletions
@@ -1,8 +1,12 @@
 Package: svyTable1
 Title: Create Survey-Weighted Descriptive Statistics Tables
-Version: 0.3.0
-Author: Ehsan Karim <[email protected]>
-Maintainer: Ehsan Karim <[email protected]>
+Version: 0.4.0
+Authors@R: c(person("Ehsan", "Karim",
+                    email = "[email protected]",
+                    role = c("aut", "cre")),
+             person("Esteban", "Valencia",
+                    comment = "Provided feedback on generalizing the svydiag function, tested installation issues and fixed a bug regarding effective sample size calculation.",
+                    role = "ctb"))
 Description: A simple tool to create 'Table 1' summaries from complex
     survey data, with options for weighted, unweighted, and mixed displays.
 License: MIT + file LICENSE

NAMESPACE

Lines changed: 4 additions & 4 deletions
@@ -1,20 +1,20 @@
 # Generated by roxygen2: do not edit by hand
 
 export("%>%")
-export(svyglmdiag)
+export(svydiag)
 export(svytable1)
 import(stats)
-importFrom(dplyr,bind_cols)
 importFrom(dplyr,mutate)
 importFrom(dplyr,select)
 importFrom(magrittr,"%>%")
+importFrom(stats,coef)
 importFrom(stats,confint)
-importFrom(stats,setNames)
+importFrom(stats,vcov)
 importFrom(survey,SE)
 importFrom(survey,degf)
 importFrom(survey,svyby)
 importFrom(survey,svyciprop)
 importFrom(survey,svymean)
 importFrom(survey,svytable)
 importFrom(survey,svyvar)
-importFrom(tibble,as_tibble)
+importFrom(tibble,tibble)

NEWS.md

Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
+# svyTable1 0.4.0
+
+## MAJOR IMPROVEMENTS
+
+* `svyglmdiag()` has been renamed to the more general `svydiag()` and now
+  supports additional models like `svycoxph`; a general installation issue
+  and a bug regarding effective sample size calculation is
+  fixed. (Thanks to Esteban Valencia)

R/svyglmdiag.R renamed to R/svydiag.R

Lines changed: 28 additions & 22 deletions
@@ -1,9 +1,9 @@
 #' Perform Reliability Diagnostics on Survey Regression Models
 #'
 #' @description
-#' This function takes a fitted survey regression model object (e.g., from `svyglm`)
-#' and produces a tibble with key reliability and diagnostic metrics for each
-#' coefficient.
+#' This function takes a fitted survey regression model object (e.g., from `svyglm`
+#' or `svycoxph`) and produces a tibble with key reliability and diagnostic
+#' metrics for each coefficient.
 #'
 #' @details
 #' The output provides a comprehensive overview to help assess the stability and
@@ -21,7 +21,7 @@
 #' even if precisely estimated. It is better to rely on the standard error,
 #' p-value, and confidence interval width for reliability assessment.
 #'
-#' @param fit A fitted model object, typically of class `svyglm`.
+#' @param fit A fitted model object from the `survey` package, such as `svyglm` or `svycoxph`.
 #' @param p_threshold A numeric value (between 0 and 1) for the significance threshold. Defaults to `0.05`.
 #' @param rse_threshold A numeric value for flagging high Relative Standard Error (RSE). Defaults to `30`.
 #'
@@ -40,9 +40,9 @@
 #' \item \code{is_rse_high}: A logical flag, `TRUE` if `RSE_percent` is greater than or equal to `rse_threshold`.
 #' }
 #'
-#' @importFrom dplyr mutate bind_cols select
-#' @importFrom tibble as_tibble
-#' @importFrom stats confint setNames
+#' @importFrom dplyr mutate select
+#' @importFrom tibble tibble
+#' @importFrom stats confint coef vcov
 #'
 #' @export
 #'
@@ -82,8 +82,8 @@
 #' family = quasibinomial()
 #' )
 #'
-#' # 3. Get the reliability diagnostics table
-#' diagnostics_table <- svyglmdiag(fit)
+#' # 3. Get the reliability diagnostics table using the new function
+#' diagnostics_table <- svydiag(fit)
 #'
 #' # Print the resulting table
 #' print(diagnostics_table)
@@ -96,25 +96,31 @@
 #' }
 #' }
 
-svyglmdiag <- function(fit, p_threshold = 0.05, rse_threshold = 30) {
+svydiag <- function(fit, p_threshold = 0.05, rse_threshold = 30) {
 
-  # --- Input validation ---
-  if (!inherits(fit, "svyglm")) {
-    warning("This function is designed for 'svyglm' objects. Results may be unexpected.")
-  }
+  # 1. Robustly extract key model components using accessor functions
+  s_fit <- summary(fit)
+  estimates <- stats::coef(fit)
+  se <- sqrt(diag(stats::vcov(fit)))
+  conf_int <- stats::confint(fit)
 
-  # 1. Get the standard model summary and confidence intervals
-  summary_fit <- summary(fit)
-  conf_int_fit <- stats::confint(fit)
+  # P-values are most reliably extracted from the summary coefficient table.
+  # This assumes the p-value is the last column, which is standard for most
+  # survey models (svyglm, svycoxph, etc.).
+  p_vals <- s_fit$coefficients[, ncol(s_fit$coefficients)]
 
   # 2. Combine these into a single, informative table
-  reliability_df <- tibble::as_tibble(summary_fit$coefficients, rownames = "Term")
-  names(reliability_df) <- c("Term", "Estimate", "SE", "t.value", "p.value")
+  reliability_df <- tibble::tibble(
+    Term = names(estimates),
+    Estimate = estimates,
+    SE = se,
+    p.value = p_vals,
+    CI_Lower = conf_int[, 1],
+    CI_Upper = conf_int[, 2]
+  )
 
-  # 3. Add CIs, calculate metrics, and add flags
+  # 3. Calculate derived metrics, add flags, and finalize the output
   reliability_df <- reliability_df %>%
-    dplyr::bind_cols(tibble::as_tibble(conf_int_fit) %>%
-                       stats::setNames(c("CI_Lower", "CI_Upper"))) %>%
     dplyr::mutate(
       RSE_percent = (SE / abs(Estimate)) * 100,
       CI_Width = CI_Upper - CI_Lower,
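
Note on the change above: the rewritten body relies only on generic accessors (`coef()`, `vcov()`, `confint()`, and the last column of `summary(fit)$coefficients`), which is presumably what lets it accept more than `svyglm` objects. The sketch below illustrates the same extraction pattern on an ordinary `lm` fit so that it runs with base R alone. `extract_terms()` and `rse_percent()` are hypothetical helpers written for this note, not part of svyTable1, and the result is a plain data frame rather than a tibble.

```r
# Hypothetical helpers for illustration only; not part of svyTable1.
rse_percent <- function(estimate, se) {
  (se / abs(estimate)) * 100
}

extract_terms <- function(fit) {
  s_fit <- summary(fit)
  estimates <- stats::coef(fit)
  se <- sqrt(diag(stats::vcov(fit)))
  conf_int <- stats::confint(fit)

  # Assumption carried over from the diff: the p-value is the last column
  # of the summary coefficient table.
  coef_table <- s_fit$coefficients
  p_vals <- coef_table[, ncol(coef_table)]

  out <- data.frame(
    Term = names(estimates),
    Estimate = unname(estimates),
    SE = unname(se),
    p.value = unname(p_vals),
    CI_Lower = unname(conf_int[, 1]),
    CI_Upper = unname(conf_int[, 2]),
    stringsAsFactors = FALSE
  )
  out$RSE_percent <- rse_percent(out$Estimate, out$SE)
  out$CI_Width <- out$CI_Upper - out$CI_Lower
  out
}

# An unweighted lm on a built-in dataset stands in for a survey model here.
fit <- lm(mpg ~ wt + hp, data = mtcars)
diag_tbl <- extract_terms(fit)
print(diag_tbl)

# Same standard error, different estimates: the RSE grows sharply as the
# estimate approaches zero.
rse_percent(2, 0.05)
rse_percent(0.01, 0.05)
```

The last two calls illustrate why the roxygen text advises against using RSE as the primary check for regression coefficients: an estimate close to zero can have a very large RSE even when its standard error is small.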

R/svytable1.R

Lines changed: 2 additions & 2 deletions
@@ -7,7 +7,7 @@
 #' Data Presentation Standards.
 #'
 #' @param design A survey design object created by the survey package.
-#' @param strata_var A string with the name of the stratification variable.
+#' @param strata_var A string with the name of the stratification variable. If this variable contains NA values, they will be automatically grouped into a separate 'Missing' stratum in the output table.
 #' @param table_vars A character vector of variable names to summarize.
 #' @param mode A string specifying the output type: "mixed" (default), "weighted", or "unweighted".
 #' @param commas Logical; if TRUE (default), large numbers in counts are formatted with commas.
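
Note on the `strata_var` documentation above: the diff documents the behavior but does not show how `svytable1()` implements it. The following is only a minimal sketch of what grouping `NA` values into a 'Missing' stratum can look like in base R. `label_missing_strata()` and its `missing_label` argument are hypothetical names introduced for this note.

```r
# Hypothetical helper for illustration; svytable1() may implement this differently.
label_missing_strata <- function(x, missing_label = "Missing") {
  values <- as.character(x)
  observed <- sort(unique(values[!is.na(values)]))
  if (anyNA(values)) {
    # Recode NA to an explicit label and place that level last.
    values[is.na(values)] <- missing_label
    observed <- c(observed, missing_label)
  }
  factor(values, levels = observed)
}

strata <- c("Male", "Female", NA, "Female", NA)
strata_labelled <- label_missing_strata(strata)
table(strata_labelled)
```

When the input has no `NA` values, the helper adds no extra level, so a 'Missing' column would only appear when it is needed.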
@@ -229,7 +229,7 @@ svytable1 <- function(design, strata_var, table_vars,
 ci_low <- metrics$ci_low[level_index]; ci_high <- metrics$ci_high[level_index]
 pct_val <- metrics$prop[level_index]; se <- metrics$se[level_index]
 
-effective_n <- if(!is.na(deff) && deff > 0) n / deff else 0
+effective_n <- if(!is.na(deff)) n / max(1, deff) else 0
 ciw <- ci_high - ci_low
 rciw <- if(!is.na(pct_val) && pct_val > 0) (ciw / pct_val) * 100 else Inf
 rse <- if(!is.na(pct_val) && pct_val > 0) (se / pct_val) * 100 else Inf
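
Note on the `effective_n` change above: the two expressions differ only when the design effect is below 1 or not positive. The sketch below wraps the old and new one-liners from the diff in functions so they can be compared side by side. The names `effective_n_old()` and `effective_n_new()` and the sample size of 200 are made up for this note.

```r
# The two expressions from the diff, wrapped as hypothetical functions.
effective_n_old <- function(n, deff) {
  if (!is.na(deff) && deff > 0) n / deff else 0
}

effective_n_new <- function(n, deff) {
  if (!is.na(deff)) n / max(1, deff) else 0
}

n <- 200

effective_n_old(n, 2)    # 100
effective_n_new(n, 2)    # 100: unchanged when deff >= 1

effective_n_old(n, 0.5)  # 400: exceeds the nominal sample size
effective_n_new(n, 0.5)  # 200: capped at the nominal sample size

effective_n_old(n, 0)    # 0
effective_n_new(n, 0)    # 200: a zero deff no longer yields 0

effective_n_new(n, NA)   # 0: a missing deff still yields 0
```

In other words, the new expression never lets the effective sample size exceed `n`. It also changes the result for a design effect of zero or below from 0 to `n`.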

README.md

Lines changed: 6 additions & 5 deletions
@@ -15,7 +15,7 @@ The package was developed to simplify a common task in epidemiology and public h
 - **Built-in Reliability Checks:** Automatically apply NCHS Data Presentation Standards for Proportions to flag or suppress unreliable estimates.
 - **Flexible Output Modes:** Easily switch between `"mixed"`, `"weighted"`, and `"unweighted"` summaries.
 - **Readability:** Option to format large numbers with commas for improved readability.
-- **Regression Diagnostics**: Includes the `svyglmdiag()` helper function to assess the reliability of coefficients from `svyglm()` models.
+- **Regression Diagnostics**: Includes the `svydiag()` helper function to assess the reliability of coefficients from `svyglm()` models.
 
 ---
 
@@ -25,7 +25,8 @@ You can install the development version of **svyTable1** from GitHub with:
 
 ```r
 # install.packages("devtools")
-devtools::install_github("ehsanx/svyTable1", build_vignettes = TRUE)
+# In README.md
+devtools::install_github("ehsanx/svyTable1", build_vignettes = TRUE, dependencies = TRUE)
 ```
 
 ---
@@ -163,7 +164,7 @@ Example `svytable1` output table from Example C with the reliability checks appl
 
 #### Example D: Reliability Checks for Regression Models
 
-Beyond descriptive tables, the package provides `svyglmdiag()` to assess the reliability of coefficients from a survey-weighted regression model. It calculates key metrics like p-values, standard errors, and confidence interval widths.
+Beyond descriptive tables, the package provides `svydiag()` to assess the reliability of coefficients from a survey-weighted regression model. It calculates key metrics like p-values, standard errors, and confidence interval widths.
 
 ```r
 # 1. Fit a logistic regression model using the complete-case design
@@ -174,7 +175,7 @@ fit_obesity <- svyglm(
 )
 
 # 2. Get the reliability diagnostics table for the model
-diagnostics_table <- svyglmdiag(fit_obesity)
+diagnostics_table <- svydiag(fit_obesity)
 
 # 3. Display the diagnostics table
 knitr::kable(
@@ -187,7 +188,7 @@ knitr::kable(
 
 ## 📊 Example Output 2
 
-Example output table for Example D, which demonstrates the `svyglmdiag()` function.
+Example output table for Example D, which demonstrates the `svydiag()` function.
 
 |Term | Estimate| SE| p.value|is_significant | CI_Lower| CI_Upper| CI_Width| RSE_percent|is_rse_high |
 |:---|---:|---:|---:|:---|---:|---:|---:|---:|:---|

man/svyglmdiag.Rd renamed to man/svydiag.Rd

Lines changed: 10 additions & 10 deletions

man/svytable1.Rd

Lines changed: 1 addition & 1 deletion

vignettes/using-svyTable1.Rmd

Lines changed: 7 additions & 5 deletions
@@ -91,6 +91,8 @@ knitr::kable(
 )
 ```
 
+It's important to note that `svytable1()` automatically detects missing (`NA`) values in the stratification variable. It treats these observations as a distinct group, creating a separate 'Missing' column in the table to ensure all data is accounted for.
+
 #### Example B: Summarizing Complete Data
 
 ```{r}
@@ -231,13 +233,13 @@ An RSE of **30%** has historically been a common cutoff for determining if an es
 
 ## Extending Reliability Checks to Regression Models
 
-While `svytable1()` focuses on descriptive statistics, a common next step in analysis is fitting a regression model. Assessing the reliability of regression coefficients is just as important as checking descriptive estimates. To support this workflow, the `svyTable1` package now includes `svyglmdiag()`, a helper function for diagnosing the stability of coefficients from `svyglm()` models.
+While `svytable1()` focuses on descriptive statistics, a common next step in analysis is fitting a regression model. Assessing the reliability of regression coefficients is just as important as checking descriptive estimates. To support this workflow, the `svyTable1` package now includes `svydiag()`, a helper function for diagnosing the stability of coefficients from `svyglm()` models.
 
 The function provides key metrics recommended for this purpose, such as the **p-value**, **standard error**, and **confidence interval width**. It also includes the Relative Standard Error (RSE) for comparison, though it is not the recommended primary check for regression coefficients due to its tendency to be misleading for estimates near zero.
 
 ### Example: Running Diagnostics on a Survey-Weighted Model
 
-Let's fit a logistic regression model to predict obesity (`ObeseStatus`) using the complete-case NHANES data we prepared earlier. We'll then use `svyglmdiag()` to assess the reliability of our model's coefficients.
+Let's fit a logistic regression model to predict obesity (`ObeseStatus`) using the complete-case NHANES data we prepared earlier. We'll then use `svydiag()` to assess the reliability of our model's coefficients.
 
 ```{r reg-diagnostics}
 # 1. Fit a survey-weighted logistic regression model
@@ -249,7 +251,7 @@ fit_obesity <- svyglm(
 )
 
 # 2. Run the diagnostics function on the fitted model
-diagnostics_table <- svyglmdiag(fit_obesity)
+diagnostics_table <- svydiag(fit_obesity)
 
 # 3. Display the diagnostics table
 knitr::kable(
@@ -261,7 +263,7 @@ knitr::kable(
 
 ## Interpreting the Regression Diagnostics Table
 
-The output from `svyglmdiag()` provides a clear, term-by-term *report card* for your regression model. It helps you evaluate the reliability and interpretability of each coefficient.
+The output from `svydiag()` provides a clear, term-by-term *report card* for your regression model. It helps you evaluate the reliability and interpretability of each coefficient.
 
 ### Key Columns
 
@@ -283,7 +285,7 @@ The output from `svyglmdiag()` provides a clear, term-by-term *report card* for
 
 ---
 
-Overall, `svyglmdiag()` complements traditional regression output by making it easier to identify which predictors are both statistically meaningful and statistically stable under complex survey design.
+Overall, `svydiag()` complements traditional regression output by making it easier to identify which predictors are both statistically meaningful and statistically stable under complex survey design.
 
 
 ## References
