Skip to content

Commit cf24265

Browse files
Update bgmCompare function documentation
1 parent 87e7db3 commit cf24265

File tree

1 file changed

+150
-81
lines changed

1 file changed

+150
-81
lines changed

R/bgmCompare.R

Lines changed: 150 additions & 81 deletions
Original file line numberDiff line numberDiff line change
@@ -1,107 +1,172 @@
1-
#' Bayesian Variable Selection or Bayesian Estimation for Differences in Markov Random Fields
1+
#' Bayesian Estimation and Variable Selection for Group Differences in Markov Random Fields
22
#'
33
#' @description
4-
#' The \code{bgmCompare} function estimates the pseudoposterior distribution of
5-
#' the parameters of a Markov Random Field (MRF) model for mixed binary and ordinal
6-
#' variables, as well as differences in pairwise interactions and category thresholds
7-
#' across groups. Groups are assumed to be \code{G} independent samples.
4+
#' The \code{bgmCompare} function estimates group differences in category
5+
#' threshold parameters (main effects) and pairwise interactions (pairwise
6+
#' effects) of a Markov Random Field (MRF) for binary and ordinal variables.
7+
#' Groups can be defined either by supplying two separate datasets (\code{x} and
8+
#' \code{y}) or by a group membership vector. Optionally, Bayesian variable
9+
#' selection can be applied to identify differences across groups.
810
#'
911
#' @details
10-
#' This function models group differences in Markov Random Fields using a Bayesian
11-
#' framework. It supports binary and ordinal variables, and includes options for
12-
#' Bayesian variable selection on the differences in both pairwise interactions
13-
#' and threshold parameters. Key components are described in the sections below.
12+
#' This function extends the ordinal MRF framework
13+
#' \insertCite{MarsmanVandenBerghHaslbeck_2024;textual}{bgms} to multiple
14+
#' groups. The basic idea of modeling, analyzing, and testing group
15+
#' differences in MRFs was introduced in
16+
#' \insertCite{MarsmanWaldorpSekulovskiHaslbeck_2024;textual}{bgms}, where
17+
#' two–group comparisons were conducted using adaptive Metropolis sampling.
18+
#' The present implementation generalizes that approach to more than two
19+
#' groups and supports additional samplers (HMC and NUTS) with staged warmup
20+
#' adaptation.
1421
#'
15-
#' @section Pairwise Interactions:
16-
#' Pairwise interactions between variables \code{i} and \code{j} are modeled as:
17-
#' \deqn{\boldsymbol{\theta}_{ij} = \phi_{ij} + \boldsymbol{\delta}_{ij},}{
18-
#' \boldsymbol{\theta}_{ij} = \phi_{ij} + \boldsymbol{\delta}_{ij},}
19-
#' where:
22+
#' Key components of the model:
2023
#'
21-
#' \itemize{
22-
#' \item \eqn{\boldsymbol{\theta}_{ij}}{\boldsymbol{\theta}_{ij}} is the vector of pairwise interaction parameters of length \code{G}.
23-
#' \item \eqn{\phi_{ij}}{\phi_{ij}} is the overall pairwise interaction (nuisance parameter).
24-
#' \item \eqn{\boldsymbol{\delta}_{ij}}{\boldsymbol{\delta}_{ij}} represents group-specific differences constrained to sum to zero for identification.
25-
#' }
24+
#' @section Pairwise Interactions:
25+
#' For variables \eqn{i} and \eqn{j}, the group-specific interaction is
26+
#' represented as:
27+
#' \deqn{\theta_{ij}^{(g)} = \phi_{ij} + \delta_{ij}^{(g)},}
28+
#' where \eqn{\phi_{ij}} is the baseline effect and
29+
#' \eqn{\delta_{ij}^{(g)}} are group differences constrained to sum to zero.
2630
#'
2731
#' @section Ordinal Variables:
28-
#' The function supports two types of ordinal variables:
32+
#' \strong{Regular ordinal variables}: category thresholds are decomposed into a
33+
#' baseline plus group differences for each category.
2934
#'
30-
#' \strong{Regular ordinal variables}:
31-
#' Introduce a threshold parameter for each category except the lowest, modeled as:
32-
#' \deqn{\boldsymbol{\mu}_{ic} = \tau_{ic} + \boldsymbol{\epsilon}_{ic},}{\boldsymbol{\mu}_{ic} = \tau_{ic} + \boldsymbol{\epsilon}_{ic},}
33-
#' where:
34-
#' \itemize{
35-
#' \item \eqn{\tau_{ic}}{\tau_{ic}} denotes an overall effect (nuisance parameter).
36-
#' \item \eqn{\boldsymbol{\epsilon}_{ic}}{\boldsymbol{\epsilon}_{ic}} represents group-specific differences constrained to sum to zero.
37-
#' }
35+
#' \strong{Blume–Capel variables}: category thresholds are quadratic in the
36+
#' category index, with both the linear and quadratic terms split into a
37+
#' baseline plus group differences.
3838
#'
39-
#' \strong{Blume-Capel ordinal variables}:
40-
#' Assume a specific reference category and score responses based on distance to it:
41-
#' \deqn{\boldsymbol{\mu}_{ic} = (\tau_{i1} + \boldsymbol{\epsilon}_{i1}) \cdot c + (\tau_{i2} + \boldsymbol{\epsilon}_{i2}) \cdot (c - r)^2,}{
42-
#' \boldsymbol{\mu}_{ic} = (\tau_{i1} + \boldsymbol{\epsilon}_{i1}) \cdot c + (\tau_{i2} + \boldsymbol{\epsilon}_{i2}) \cdot (c - r)^2,}
43-
#' where:
39+
#' @section Variable Selection:
40+
#' When \code{difference_selection = TRUE}, spike-and-slab priors are
41+
#' applied to difference parameters:
4442
#' \itemize{
45-
#' \item \code{r} is the reference category.
46-
#' \item \eqn{\tau_{i1}}{\tau_{i1}} and \eqn{\tau_{i2}}{\tau_{i2}} are nuisance parameters.
47-
#' \item \eqn{\boldsymbol{\epsilon}_{i1}}{\boldsymbol{\epsilon}_{i1}} and \eqn{\boldsymbol{\epsilon}_{i2}}{\boldsymbol{\epsilon}_{i2}} represent group-specific differences.
43+
#' \item \strong{Bernoulli}: fixed prior inclusion probability.
44+
#' \item \strong{Beta–Bernoulli}: inclusion probability given a Beta prior.
4845
#' }
4946
#'
50-
#' @section Variable Selection:
51-
#' Bayesian variable selection enables testing of parameter differences or equivalence across groups. Independent spike-and-slab priors are applied to difference parameters:
47+
#' @section Sampling Algorithms and Warmup:
48+
#' Parameters are updated within a Gibbs framework, using the same
49+
#' sampling algorithms and staged warmup scheme described in
50+
#' \code{\link{bgm}}:
5251
#' \itemize{
53-
#' \item \strong{Bernoulli Model}: Assigns a fixed probability to parameter inclusion.
54-
#' \item \strong{Beta-Bernoulli Model}: Incorporates a beta prior to model inclusion probabilities.
52+
#' \item \strong{Adaptive Metropolis–Hastings}: componentwise random–walk
53+
#' proposals with Robbins–Monro adaptation of proposal SDs.
54+
#' \item \strong{Hamiltonian Monte Carlo (HMC)}: joint updates with fixed
55+
#' leapfrog trajectories; step size and optionally the mass matrix are
56+
#' adapted during warmup.
57+
#' \item \strong{No–U–Turn Sampler (NUTS)}: an adaptive HMC variant with
58+
#' dynamic trajectory lengths; warmup uses the same staged adaptation
59+
#' schedule as HMC.
5560
#' }
5661
#'
57-
#' @section Gibbs Sampling:
62+
#' For details on the staged adaptation schedule (fast–slow–fast phases),
63+
#' see \code{\link{bgm}}. In addition, when
64+
#' \code{difference_selection = TRUE}, updates of inclusion indicators are
65+
#' delayed until late warmup. In HMC/NUTS, this appends two extra phases
66+
#' (Stage-3b and Stage-3c), so that the total number of warmup iterations
67+
#' exceeds the user-specified \code{warmup}.
68+
#'
69+
#' After warmup, adaptation is disabled: step size and mass matrix are fixed
70+
#' at their learned values, and proposal SDs remain constant.
5871
#'
59-
#' Parameters are estimated using a Metropolis-within-Gibbs sampling scheme.
60-
#' When \code{difference_selection = TRUE}, the algorithm runs \code{2 * warmup} warmup iterations:
72+
#' @param x A data frame or matrix of binary and ordinal responses for
73+
#' Group 1. Variables should be coded as nonnegative integers starting at
74+
#' 0. For ordinal variables, unused categories are collapsed; for
75+
#' Blume–Capel variables, all categories are retained.
76+
#' @param y Optional data frame or matrix for Group 2 (two-group designs).
77+
#' Must have the same variables (columns) as \code{x}.
78+
#' @param group_indicator Optional integer vector of group memberships for
79+
#' rows of \code{x} (multi-group designs). Ignored if \code{y} is supplied.
80+
#' @param difference_selection Logical. If \code{TRUE}, spike-and-slab priors
81+
#' are applied to difference parameters. Default: \code{TRUE}.
82+
#' @param variable_type Character vector specifying type of each variable:
83+
#' \code{"ordinal"} (default) or \code{"blume-capel"}.
84+
#' @param baseline_category Integer or vector giving the baseline category
85+
#' for Blume–Capel variables.
86+
#' @param difference_scale Double. Scale of the Cauchy prior for difference
87+
#' parameters. Default: \code{1}.
88+
#' @param difference_prior Character. Prior for difference inclusion:
89+
#' \code{"Bernoulli"} or \code{"Beta-Bernoulli"}. Default: \code{"Bernoulli"}.
90+
#' @param difference_probability Numeric. Prior inclusion probability for
91+
#' differences (Bernoulli prior). Default: \code{0.5}.
92+
#' @param beta_bernoulli_alpha,beta_bernoulli_beta Doubles. Shape parameters
93+
#' of the Beta prior for inclusion probabilities in the Beta–Bernoulli
94+
#' model. Defaults: \code{1}.
95+
#' @param pairwise_scale Double. Scale of the Cauchy prior for baseline
96+
#' pairwise interactions. Default: \code{2.5}.
97+
#' @param main_alpha,main_beta Doubles. Shape parameters of the beta-prime
98+
#' prior for baseline threshold parameters. Defaults: \code{0.5}.
99+
#' @param iter Integer. Number of post–warmup iterations per chain.
100+
#' Default: \code{1e3}.
101+
#' @param warmup Integer. Number of warmup iterations before sampling.
102+
#' Default: \code{1e3}.
103+
#' @param na_action Character. How to handle missing data:
104+
#' \code{"listwise"} (drop rows) or \code{"impute"} (impute within Gibbs).
105+
#' Default: \code{"listwise"}.
106+
#' @param display_progress Character. Controls progress reporting:
107+
#' \code{"per-chain"}, \code{"total"}, or \code{"none"}.
108+
#' Default: \code{"per-chain"}.
109+
#' @param update_method Character. Sampling algorithm:
110+
#' \code{"adaptive-metropolis"}, \code{"hamiltonian-mc"}, or \code{"nuts"}.
111+
#' Default: \code{"nuts"}.
112+
#' @param target_accept Numeric between 0 and 1. Target acceptance rate.
113+
#' Defaults: 0.44 (Metropolis), 0.65 (HMC), 0.60 (NUTS).
114+
#' @param hmc_num_leapfrogs Integer. Leapfrog steps for HMC. Default: \code{100}.
115+
#' @param nuts_max_depth Integer. Maximum tree depth for NUTS. Default: \code{10}.
116+
#' @param learn_mass_matrix Logical. If \code{TRUE}, adapt the mass matrix
117+
#' during warmup (HMC/NUTS only). Default: \code{FALSE}.
118+
#' @param chains Integer. Number of parallel chains. Default: \code{4}.
119+
#' @param cores Integer. Number of CPU cores. Default:
120+
#' \code{parallel::detectCores()}.
121+
#' @param seed Optional integer. Random seed for reproducibility.
122+
#'
123+
#' @return
124+
#' A list of class \code{"bgmCompare"} containing posterior summaries,
125+
#' posterior mean matrices, and raw MCMC samples:
61126
#' \itemize{
62-
#' \item First half without difference selection.
63-
#' \item Second half with edge selection enabled.
127+
#' \item \code{posterior_summary_main_baseline},
128+
#' \code{posterior_summary_pairwise_baseline}: summaries of baseline
129+
#' thresholds and pairwise interactions.
130+
#' \item \code{posterior_summary_main_differences},
131+
#' \code{posterior_summary_pairwise_differences}: summaries of group
132+
#' differences in thresholds and pairwise interactions.
133+
#' \item \code{posterior_summary_indicator}: summaries of inclusion
134+
#' indicators (if \code{difference_selection = TRUE}).
135+
#' \item \code{posterior_mean_main_baseline},
136+
#' \code{posterior_mean_pairwise_baseline}: posterior mean matrices
137+
#' (legacy style).
138+
#' \item \code{raw_samples}: list of raw draws per chain for main,
139+
#' pairwise, and indicator parameters.
140+
#' \item \code{arguments}: list of function call arguments and metadata.
64141
#' }
65-
#' This warmup strategy improves stability of adaptive Metropolis-Hastings proposals and starting values.
66142
#'
67-
#' @section Saving Options:
68-
#' Users can store sampled states for parameters (\code{main_effects}, \code{pairwise_effects}, \code{indicator}) during Gibbs sampling. Enabling these options (\code{save_main}, \code{save_pairwise}, \code{save_indicator}) increases output size and memory usage, so use them judiciously.
143+
#' The \code{summary()} method prints formatted summaries,
144+
#' \code{coef()} extracts posterior means, and \code{as_draws()} converts
145+
#' raw samples to a \pkg{posterior} \code{draws_df}.
69146
#'
70-
#' @param x Data frame or matrix with binary and ordinal responses. Regular ordinal variables should be coded as integers starting from 0. Missing categories are collapsed for regular ordinal variables but retained for Blume-Capel variables.
71-
#' @param y A data frame or matrix similar to \code{x}, used for two-group designs. \code{x} contains Group 1 data, and \code{y} contains Group 2 data. Ignored for multi-group designs.
72-
#' @param g Group membership vector for rows in \code{x}. Required for multi-group designs; ignored if \code{y} is provided.
73-
#' @param difference_selection Logical. If \code{TRUE}, the function models the inclusion or exclusion of parameter differences. Default: \code{TRUE}.
74-
#' @param save_main,save_pairwise,save_indicator Logical. Enable saving sampled states for \code{main_effects}, \code{pairwise_effects}, and \code{indicator}, respectively. Default: \code{FALSE}.
75-
#' @param main_difference_model Character. Specifies how to handle threshold differences when categories are unmatched. Options: \code{"Collapse"}, \code{"Free"}. The option "Collapse" collapses categories unobserved in one or more groups. The option "Free" option estimates thresholds separately for each group and does not model their difference. Default: \code{"Free"}.
76-
#' @param variable_type Character or vector. Specifies the type of variables in \code{x} (\code{"ordinal"} or \code{"blume-capel"}). Default: \code{"ordinal"}.
77-
#' @param reference_category Integer or vector. Reference category for Blume-Capel variables. Required if there is at least one Blume-Capel variable.
78-
#' @param pairwise_difference_scale Double. Scale parameter for the Cauchy prior on pairwise differences. Default: \code{1}.
79-
#' @param main_difference_scale Double. Scale parameter for the Cauchy prior on threshold differences. Default: \code{1}.
80-
#' @param pairwise_difference_prior,main_difference_prior Character. Specifies the inclusion probability model (\code{"Bernoulli"} or \code{"Beta-Bernoulli"}). Default: \code{"Bernoulli"}.
81-
#' @param pairwise_difference_probability A numeric value or a \eqn{p \times p} matrix specifying the prior inclusion probability of a pairwise difference in the Bernoulli model. A single value applies the same probability to all pairs, while a matrix allows for edge-specific probabilities. Default: 0.5 for equal prior probability for inclusion and exclusion.
82-
#' @param main_difference_probability A numeric value or a length-\eqn{p} vector specifying the prior inclusion probability of a threshold difference in the Bernoulli model. A single value applies the same probability to all variables, while a vector allows for variable-specific probabilities. Default: 0.5 to indicate no prior preference.
83-
#' @param iter,warmup Integer. Number of Gibbs iterations (\code{iter}) and burn-in iterations (\code{warmup}). Defaults: \code{iter = 1e4}, \code{warmup = 1e3}.
84-
#' @param na_action Character. Specifies handling of missing data. \code{"listwise"} deletes rows with missing values; \code{"impute"} imputes values during Gibbs sampling. Default: \code{"listwise"}.
85-
#' @param display_progress Logical. Show progress bar during computation. Default: \code{TRUE}.
86-
#' @param main_alpha,main_beta Double. Shape parameters for the beta-prime prior on nuisance threshold parameters.
87-
#' @param pairwise_scale Double. Scale of the Cauchy prior for nuisance pairwise interactions. Default: \code{2.5}.
88-
#' @param main_beta_bernoulli_alpha,main_beta_bernoulli_beta Double. Shape parameters for the Beta-Bernoulli prior on threshold differences.
89-
#' @param pairwise_beta_bernoulli_alpha,pairwise_beta_bernoulli_beta Double. Shape parameters for the Beta-Bernoulli prior on pairwise differences.
90-
#' @param save Logical. If true, sampled states for all parameters are returned. Deprecated.
91-
#' @param save_main,save_pairwise,save_indicator Logical. Enable saving sampled states for `main_effects`, `pairwise_effects`, and `indicator`, respectively. Default: `FALSE`.
147+
#' NUTS diagnostics (tree depth, divergences, energy, E-BFMI) are included
148+
#' in \code{fit$nuts_diag} if \code{update_method = "nuts"}.
92149
#'
93-
#' @return A list containing the posterior means and, optionally, sampled states based on the \code{save_*} options. The returned components include:
94-
#' \itemize{
95-
#' \item \code{posterior_mean_main}, \code{posterior_mean_pairwise}, and \code{posterior_mean_indicator} for posterior means.
96-
#' \item If saving options are enabled, the list also includes:
97-
#' \itemize{
98-
#' \item \code{raw_samples_main} – sampled states of main effects.
99-
#' \item \code{raw_samples_pairwise} – sampled states of pairwise effects.
100-
#' \item \code{raw_samples_indicator} – sampled states of inclusion indicators.
101-
#' }
150+
#' @references
151+
#' \insertAllCited{}
152+
#'
153+
#' @examples
154+
#' \dontrun{
155+
#' # Run bgmCompare on subset of the Boredom dataset
156+
#' x = Boredom[Boredom$language == "fr", 2:6]
157+
#' y = Boredom[Boredom$language != "fr", 2:6]
158+
#'
159+
#' fit <- bgmCompare(x, y)
160+
#'
161+
#' # Posterior inclusion probabilities
162+
#' summary(fit)$indicator
163+
#'
164+
#' # Bayesian model averaged main effects for the groups
165+
#' coef(fit)$main_effects_groups
166+
#'
167+
#' # Bayesian model averaged pairwise effects for the groups
168+
#' coef(fit)$pairwise_effects_groups
102169
#' }
103-
#' In addition to the results of the analysis, the output lists some of the
104-
#' arguments of its call. This is useful for post-processing the results.
105170
#'
106171
#' @export
107172
bgmCompare = function(
@@ -377,5 +442,9 @@ bgmCompare = function(
377442
output$nuts_diag = nuts_diag
378443
}
379444

445+
userInterrupt = any(vapply(out, FUN = `[[`, FUN.VALUE = logical(1L), "userInterrupt"))
446+
if (userInterrupt)
447+
warning("Stopped sampling after user interrupt, results are likely uninterpretable.")
448+
380449
return(output)
381450
}

0 commit comments

Comments
 (0)