Skip to content

Commit 3f9d347

Browse files
authored
Merge branch 'master' into ridgeline-size
2 parents 06ec51d + 036f7c5 commit 3f9d347

27 files changed

+444
-42
lines changed

DESCRIPTION

Lines changed: 4 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -7,7 +7,8 @@ Authors@R: c(person("Jonah", "Gabry", role = c("aut", "cre"), email = "jsg2201@c
77
person("Tristan", "Mahr", role = "aut"),
88
person("Paul-Christian", "Bürkner", role = "ctb"),
99
person("Martin", "Modrák", role = "ctb"),
10-
person("Malcolm", "Barrett", role = "ctb"))
10+
person("Malcolm", "Barrett", role = "ctb"),
11+
person("Frank", "Weber", role = "ctb"))
1112
Maintainer: Jonah Gabry <[email protected]>
1213
Description: Plotting functions for posterior analysis, MCMC diagnostics,
1314
prior and posterior predictive checks, and other visualizations
@@ -36,6 +37,7 @@ Imports:
3637
tidyselect,
3738
utils
3839
Suggests:
40+
ggfortify,
3941
gridExtra (>= 2.2.1),
4042
hexbin,
4143
knitr (>= 1.16),
@@ -47,6 +49,7 @@ Suggests:
4749
rstantools (>= 1.5.0),
4850
scales,
4951
shinystan (>= 2.3.0),
52+
survival,
5053
testthat (>= 2.0.0),
5154
vdiffr
5255
RoxygenNote: 7.1.1

NAMESPACE

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -117,6 +117,7 @@ export(ppc_hist)
117117
export(ppc_intervals)
118118
export(ppc_intervals_data)
119119
export(ppc_intervals_grouped)
120+
export(ppc_km_overlay)
120121
export(ppc_loo_intervals)
121122
export(ppc_loo_pit)
122123
export(ppc_loo_pit_overlay)

NEWS.md

Lines changed: 9 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -8,14 +8,22 @@
88
* Items for next release go here
99
-->
1010

11+
* On the y axis, `ppc_loo_pit_qq(..., compare = "normal")` now plots standard
12+
normal quantiles calculated from the PIT values (instead of the standardized
13+
PIT values). (#240, #243, @fweber144)
14+
15+
* New plotting function `ppc_km_overlay()` for outcome variables that are
16+
right-censored. Empirical CCDF estimates of `yrep` are compared with the
17+
Kaplan-Meier estimate of `y`. (#233, #234, @fweber144)
18+
1119
* CmdStanMCMC objects (from CmdStanR) can now be used with extractor
1220
functions `nuts_params()`, `log_posterior()`, `rhat()`, and
1321
`neff_ratio()`. (#227)
1422

1523
* Size of points and interval lines can set in
1624
`mcmc_intervals(..., outer_size, inner_size, point_size)`. (#215, #228, #229)
1725

18-
* Size of ridgelines can be set in `mcmc_areas_ridges(..., size)`. (#224)
26+
* Size of ridgelines can be set in `mcmc_areas(..., size)` and `mcmc_areas_ridges(..., size)`. (#224)
1927

2028
* `mcmc_areas()` tries to use less blank vertical blank space. (#218, #230)
2129

R/bayesplot-package.R

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -98,6 +98,6 @@ NULL
9898
# release reminders (for devtools)
9999
release_questions <- function() { # nocov start
100100
c(
101-
"Have you reduced the size of the vignettes for CRAN?",
101+
"Have you reduced the size of the vignettes for CRAN?"
102102
)
103103
} # nocov end

R/helpers-ppc.R

Lines changed: 2 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -70,7 +70,8 @@ validate_yrep <- function(yrep, y) {
7070
#' Checks that grouping variable has same length as `y` and is either a vector or
7171
#' factor variable.
7272
#'
73-
#' @param group,y The user's `group` object and the `y` object returned by `validate_y()`.
73+
#' @param group,y The user's `group` object and the `y` object returned by
74+
#' `validate_y()`.
7475
#' @return Either throws an error or returns `group` (coerced to a factor).
7576
#' @noRd
7677
validate_group <- function(group, y) {
@@ -88,10 +89,6 @@ validate_group <- function(group, y) {
8889
abort("length(group) must be equal to length(y).")
8990
}
9091

91-
if (length(unique(group)) == 1) {
92-
abort("'group' must have more than one unique value.")
93-
}
94-
9592
unname(group)
9693
}
9794

R/ppc-censoring.R

Lines changed: 136 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,136 @@
1+
#' PPC censoring
2+
#'
3+
#' @description Compare the empirical distribution of censored data `y` to the
4+
#' distributions of simulated/replicated data `yrep` from the posterior
5+
#' predictive distribution. See the **Plot Descriptions** section, below, for
6+
#' details.
7+
#'
8+
#' Although some of the other plots can be used with censored data,
9+
#' `ppc_km_overlay()` is currently the only plotting function designed
10+
#' *specifically* for censored data. We encourage you to suggest or contribute
11+
#' additional plots at [https://github.com/stan-dev/bayesplot](github.com/stan-dev/bayesplot).
12+
#'
13+
#'
14+
#'
15+
#' @name PPC-censoring
16+
#' @family PPCs
17+
#'
18+
#' @template args-y-yrep
19+
#' @param size,alpha Passed to the appropriate geom to control the appearance of
20+
#' the `yrep` distributions.
21+
#' @param ... Currently unused.
22+
#'
23+
#' @template return-ggplot
24+
#'
25+
#' @section Plot Descriptions:
26+
#' \describe{
27+
#' \item{`ppc_km_overlay()`}{
28+
#' Empirical CCDF estimates of each dataset (row) in `yrep` are overlaid,
29+
#' with the Kaplan-Meier estimate (Kaplan and Meier, 1958) for `y` itself
30+
#' on top (and in a darker shade). This is a PPC suitable for
31+
#' right-censored `y`. Note that the replicated data from `yrep` is assumed
32+
#' to be uncensored.
33+
#' }
34+
#' }
35+
#'
36+
#' @templateVar bdaRef (Ch. 6)
37+
#' @template reference-bda
38+
#' @template reference-km
39+
#'
40+
#' @examples
41+
#' color_scheme_set("brightblue")
42+
#' y <- example_y_data()
43+
#' # For illustrative purposes, (right-)censor values y > 110:
44+
#' status_y <- as.numeric(y <= 110)
45+
#' y <- pmin(y, 110)
46+
#' # In reality, the replicated data (yrep) would be obtained from a
47+
#' # model which takes the censoring of y properly into account. Here,
48+
#' # for illustrative purposes, we simply use example_yrep_draws():
49+
#' yrep <- example_yrep_draws()
50+
#' dim(yrep)
51+
#' \donttest{
52+
#' ppc_km_overlay(y, yrep[1:25, ], status_y = status_y)
53+
#' }
54+
NULL
55+
56+
#' @export
57+
#' @rdname PPC-censoring
58+
#' @param status_y The status indicator for the observations from `y`. This must
59+
#' be a numeric vector of the same length as `y` with values in \{0, 1\} (0 =
60+
#' right censored, 1 = event).
61+
ppc_km_overlay <-
62+
function(y,
63+
yrep,
64+
...,
65+
status_y,
66+
size = 0.25,
67+
alpha = 0.7) {
68+
check_ignored_arguments(...)
69+
70+
if(!requireNamespace("survival", quietly = TRUE)){
71+
abort("Package 'survival' required.")
72+
}
73+
if(!requireNamespace("ggfortify", quietly = TRUE)){
74+
abort("Package 'ggfortify' required.")
75+
}
76+
77+
# Checks for 'status_y':
78+
stopifnot(is.numeric(status_y))
79+
stopifnot(all(status_y %in% c(0, 1)))
80+
81+
# Create basic PPC dataset:
82+
data <- ppc_data(y, yrep, group = status_y)
83+
84+
# Modify the status indicator:
85+
# * For the observed data ("y"), convert the status indicator back to
86+
# a numeric.
87+
# * For the replicated data ("yrep"), set the status indicator
88+
# to 1 ("event"). This way, the Kaplan-Meier estimator reduces
89+
# to "1 - ECDF" with ECDF denoting the ordinary empirical cumulative
90+
# distribution function.
91+
data <- data %>%
92+
dplyr::mutate(group = ifelse(.data$is_y,
93+
as.numeric(as.character(.data$group)),
94+
1))
95+
96+
# Create 'survfit' object and 'fortify' it
97+
sf <- survival::survfit(
98+
survival::Surv(value, group) ~ rep_label,
99+
data = data
100+
)
101+
fsf <- fortify(sf)
102+
103+
# Add variables specifying color, size, and alpha:
104+
fsf$is_y_color <- as.factor(sub("\\[rep\\] \\(.*$", "rep", sub("^italic\\(y\\)", "y", fsf$strata)))
105+
fsf$is_y_size <- ifelse(fsf$is_y_color == "yrep", size, 1)
106+
fsf$is_y_alpha <- ifelse(fsf$is_y_color == "yrep", alpha, 1)
107+
108+
# Ensure that the observed data gets plotted last by reordering the
109+
# levels of the factor "strata":
110+
fsf$strata <- factor(fsf$strata, levels = rev(levels(fsf$strata)))
111+
112+
# Plot:
113+
ggplot(data = fsf,
114+
mapping = aes_(x = ~ time,
115+
y = ~ surv,
116+
color = ~ is_y_color,
117+
group = ~ strata,
118+
size = ~ is_y_size,
119+
alpha = ~ is_y_alpha)) +
120+
geom_step() +
121+
hline_at(
122+
c(0, 0.5, 1),
123+
size = c(0.2, 0.1, 0.2),
124+
linetype = 2,
125+
color = get_color("dh")
126+
) +
127+
scale_size_identity() +
128+
scale_alpha_identity() +
129+
scale_color_ppc_dist() +
130+
scale_y_continuous(breaks = c(0, 0.5, 1)) +
131+
xlab(y_label()) +
132+
yaxis_title(FALSE) +
133+
xaxis_title(FALSE) +
134+
yaxis_ticks(FALSE) +
135+
bayesplot_theme_get()
136+
}

R/ppc-loo.R

Lines changed: 11 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -42,11 +42,12 @@
4242
#' PITs to the standard uniform distribution. Comparing to the uniform is not
4343
#' good for extreme probabilities close to 0 and 1, so it can sometimes be
4444
#' useful to set the `compare` argument to `"normal"`, which will
45-
#' produce a Q-Q plot comparing standardized PIT values to the standard normal
46-
#' distribution that can help see the (mis)calibration better for the extreme
47-
#' values. However, in most cases we have found that the overlaid density plot
48-
#' (`ppc_loo_pit_overlay()`) function will provided a clearer picture of
49-
#' calibration problems that the Q-Q plot.
45+
#' produce a Q-Q plot comparing standard normal quantiles calculated from the
46+
#' PIT values to the theoretical standard normal quantiles. This can help see
47+
#' the (mis)calibration better for the extreme values. However, in most cases
48+
#' we have found that the overlaid density plot (`ppc_loo_pit_overlay()`)
49+
#' function will provide a clearer picture of calibration problems than the
50+
#' Q-Q plot.
5051
#' }
5152
#' \item{`ppc_loo_intervals()`, `ppc_loo_ribbon()`}{
5253
#' Similar to [ppc_intervals()] and [ppc_ribbon()] but the intervals are for
@@ -113,8 +114,9 @@ NULL
113114
#' @param compare For `ppc_loo_pit_qq()`, a string that can be either
114115
#' `"uniform"` or `"normal"`. If `"uniform"` (the default) the Q-Q plot
115116
#' compares computed PIT values to the standard uniform distribution. If
116-
#' `compare="normal"`, the Q-Q plot compares standardized PIT values to the
117-
#' standard normal distribution.
117+
#' `compare="normal"`, the Q-Q plot compares standard normal quantiles
118+
#' calculated from the PIT values to the theoretical standard normal
119+
#' quantiles.
118120
#' @param trim Passed to [ggplot2::stat_density()].
119121
#' @template args-density-controls
120122
ppc_loo_pit_overlay <- function(y,
@@ -220,10 +222,10 @@ ppc_loo_pit_qq <- function(y,
220222
x_lab <- "Uniform"
221223
y_lab <- "LOO-PIT"
222224
} else {
223-
pit <- as.vector(scale(pit))
225+
pit <- as.vector(stats::qnorm(pit))
224226
theoretical <- stats::qnorm
225227
x_lab <- "Normal"
226-
y_lab <- "LOO-PIT (standardized)"
228+
y_lab <- "LOO-PIT (standard normal quantiles)"
227229
}
228230

229231
ggplot(data.frame(p = pit)) +

R/ppc-overview.R

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -87,6 +87,9 @@
8787
#' multinomial outcomes.
8888
#' * [LOO predictive checks][PPC-loo]: PPC functions for predictive checks
8989
#' based on (approximate) leave-one-out (LOO) cross-validation.
90+
#' * [Censored data][PPC-censoring]: PPC functions comparing the empirical
91+
#' distribution of censored data `y` to the distributions of individual
92+
#' simulated datasets (rows) in `yrep`.
9093
#'
9194
#' @section Providing an interface for predictive checking from another package:
9295
#'

man-roxygen/reference-km.R

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,4 @@
1+
#' @references Kaplan, E. L. and Meier, P. (1958). Nonparametric estimation
2+
#' from incomplete observations.
3+
#' *Journal of the American Statistical Association*. 53(282), 457--481.
4+
#' doi:10.1080/01621459.1958.10501452.

man/PPC-censoring.Rd

Lines changed: 92 additions & 0 deletions
Some generated files are not rendered by default. Learn more about customizing how changed files appear on GitHub.

0 commit comments

Comments
 (0)