Skip to content

Commit f3b9fc2

Browse files
committed
explain CAR, ICAR more clearly
1 parent a9f3125 commit f3b9fc2

File tree

1 file changed

+71
-65
lines changed

1 file changed

+71
-65
lines changed

knitr/car-iar-poisson/icar_stan.Rmd

Lines changed: 71 additions & 65 deletions
Original file line numberDiff line numberDiff line change
@@ -48,77 +48,76 @@ All commands should be run from the directory `stan-dev/example-models/knitr/car
4848

4949
## About conditional autoregressive models
5050

51-
CAR models are used for areal data consisting of a single aggregated measure
52-
per areal unit, which may be a binary, count, or continuous value.
51+
CAR and ICAR models are used when areal data consists
52+
of a single aggregated measure per areal unit,
53+
either a binary, count, or continuous value.
5354
Areal units are volumes, more precisely,
5455
areal units partition a multi-dimensional volume D into a finite number
5556
of sub-volumes with well-defined boundaries.
5657
Areal data differs from point data,
5758
which consists of measurements from a known set of geo-spatial points.
5859
For point data, the relationship between points is a
59-
continuous, real-valued distance measure.
60-
For areal units, the relationship between areal units is characterized in
61-
terms of adjacency.
62-
63-
Given a set of observations taken at $n$ different areal units of a region,
60+
continuous, real-valued distance measure which can be calculated
61+
automatically for any two points on the map,
62+
allowing for the addition of new points to the map.
63+
Given a set of areal units, there is no automatic procedure
64+
for adding a new areal unit, thus models for areal data are
65+
non-generative with respect to the areal regions.
66+
67+
For a set of $N$ areal units, the relationship between areal units
68+
is described by an $N \times N$ adjacency matrix,
69+
which is usually written $A$ for adjacency, or $W$ for weights.
70+
For the binary _neighbor_ relationship,
71+
written $i \sim j$ where $i \neq j$, the entries in the adjacency matrix
72+
are $1$ if regions $n_i$ and $n_j$ are neighbors and is otherwise $0$.
73+
For CAR models, the neighbor relationship is symmetric but not reflexive;
74+
if $i \sim j$ then $j \sim i$, but a region is not its own neighbor.
75+
76+
### Conditional Autoregressive (CAR) Models
77+
78+
Given a set of observations taken at $N$ different areal units of a region,
6479
spatial interactions between a pair of units $n_i$ and $n_j$ can be modelled conditionally
6580
as a spatial random variable $\mathbf{\phi}$, which is an $n$-length vector
6681
$\mathbf{\phi} = ({\phi}_1, \ldots, {\phi}_n)^T$.
6782

68-
For CAR models, spatial relationship between the $n$ areal units
69-
are represented as an adjacency matrix $W$ with dimensions $n \times n$
70-
where entries $w_{i,j}$ and $w_{j,i}$ are positive when regions $i$ and $j$ are neighbors
71-
and zero otherwise.
72-
The _neighbor_ relationship $i \sim j$ is defined in terms of this matrix:
73-
the neighbors of region $i$ are those regions who have non-zero entries in row or column $i$.
74-
This encoding defines a lattice structure over the $n$ areal units.
75-
76-
77-
### Conditional Auto-Regressive (CAR) Models
78-
79-
Besag (1974) motivates CAR models for spatial processes using
80-
results from the physics of lattice systems of particles and
81-
the Hammersley-Clifford theorem which provides an equivalence between
82-
a local specification of the conditional distribution of each particle
83-
given its neighboring particles and the global specification
84-
of the joint distribution of all particles.
85-
This specification of the joint distribution via the local specification
86-
of the conditional distributions of the individual variables
87-
is a Markov random field (MRF) specification.
88-
The conditional distribution for each ${\phi}_i$ is specified in terms of a mean
89-
and precision parameter $\tau$ as:
90-
91-
$$ p \left( { \phi }_i \, \vert\, {\phi}_j \, j \neq i, {{\tau}_i}^{-1} \right)
92-
= \mathit{N} \left( \alpha \sum_{i \sim j} {w}_{i,j} {\phi}_j,\tau_i^{-1} \right), i,j = 1, \ldots, n $$
93-
94-
The parameter $\alpha$ controls the strength of the spatial association,
95-
where $\alpha = 0$ corresponds to spatial independence.
96-
97-
The corresponding joint distribution can be uniquely determined from
98-
the set of full conditional distributions by
99-
introducing a fixed point from the support of $p$
100-
and then using Brook’s Lemma to factor the set of conditional distributions
101-
into a joint distribution which is determined up to a proportionality constant
102-
(see Banerjee, Carlin, and Gelfand, 2004, sec. 3.2) as a Gaussian with mean and precision parameters:
103-
104-
$$ \mathbf{\phi} \sim \mathit{N} \left(\mathbf{0}, \left[D_{\tau}(I - \alpha B)\right]^{-1} \right) $$
83+
In the full conditional distribution, each ${\phi}_i$ is conditional
84+
on the sum of the weighted values of its neighbors ($w_{ij}\,\phi_j$)
85+
and has unknown variance
86+
$$\phi_i \mid \phi_j, j \neq i, \sim \mathrm{N} \left( \sum_{j = 1}^n w_{ij} \phi_j, {\sigma}^2 \right).$$
87+
\
88+
Specification of the global, or joint distribution via the local specification
89+
of the conditional distributions of the individual random variables
90+
defines a Gaussian Markov random field (GMRF).
91+
Besag (1974) proved that the
92+
corresponding joint specification of $\phi$ is a multivariate normal random variable
93+
centered at $0$.
94+
The variance of $\phi$ is specified as a precision matrix $Q$ which
95+
is simply the inverse of the covariance matrix $\Sigma$, i.e. $\Sigma = Q^{-1}$
96+
so that
97+
$$\phi \sim \mathrm{N}(0, Q^{-1}).$$
98+
99+
In order for the standard multivariate normal random variable $\phi$ to have a proper joint probability density,
100+
the precision matrix $Q$ must be symmetric and positive definite.
101+
This is accomplished by constructing the precision matrix $Q$ from the adjacency matrix $W$:
102+
103+
$$ Q = [D_{\tau}(I - \alpha B)] $$
105104

106105
where
107106

108107
- $W$ is the $n \times n$ adjacency matrix where entries $\{i,i\}$ are zero and the off-diagonal elements
109108
are $1$ if regions $i$ and $j$ are neighbors and $0$ otherwise.
110-
- $D$ is the $n \times n$ diagonal where entries $\{i,i\}$ are the number of neighbors of region $i$
111-
and the off-diagnoal entries are $0$.
112-
- $D_{\tau} = \tau D$
113-
- $\alpha$ is between 0 and 1
114-
- $B$ is the scaled adjacency matrix $D^{-1}W$
115-
- $I$ is an $n \times n$ identity matrix
116-
117-
Since $D_{\tau} = \tau D$ and $B = D^{-1}W$, $[D_{\tau}(I - \alpha B)]^{-1}$
118-
rewrites to $[{\tau}(D - \alpha W)]^{-1}$.
119-
In the case where $\alpha < 0$, the precision matrix is positive definite,
120-
thus the joint distribution is proper.
121-
However evaluation of the joint distribution requires computing the determinant of this matrix,
109+
- $D$ is the $n \times n$ diagonal matrix where entries $\{i,i\}$ are the number of neighbors of region $i$
110+
and the off-diagonal entries are $0$.
111+
- $D_{\tau} = \tau\, D$.
112+
- $\alpha$ a parameter which controls the amount of spatial correlation, where
113+
where $\alpha = 0$ implies spatial independence and where $\alpha = 1$ implies complete spatial correlation.
114+
- $B$ is the scaled adjacency matrix $D^{-1}W$.
115+
- $I$ is an $n \times n$ identity matrix.
116+
117+
When $\alpha$ is in the interval (0,1), the precision matrix $Q$ is positive definite,
118+
thus the joint distribution $\phi$ is proper.
119+
120+
Evaluation of $\phi$ requires computing the determinant of the precision matrix $Q$,
122121
which is computationally expensive.
123122
See the Stan case study
124123
[Exact sparse CAR models in Stan](http://mc-stan.org/documentation/case-studies/mbjoseph-CARStan.html),
@@ -127,11 +126,24 @@ for discussion of how to speed up computation.
127126
### Intrinsic Conditional Auto-Regressive (ICAR) models
128127

129128
An Intrinsic Conditional Auto-Regressive (ICAR) model is a CAR model where $\alpha = 1$,
130-
so that the joint distribution simplifies to,
129+
that is, it assumes complete spatial correlation between regions.
130+
(Spoiler alert: this assumption is problematic, resulting in the
131+
the BYM model and successors).
132+
The joint distribution of the ICAR model is derived from the joint distribution
133+
for the CAR model as follows:
134+
135+
- since $D_{\tau} = \tau D$ and $B = D^{-1}W$, the expression $[D_{\tau}(I - \alpha B)]$
136+
simplifies to $[{\tau}(D - \alpha W)]$.
137+
- since $\alpha = 1$, it is omitted.
138+
139+
The resulting matrix $[\tau \, (D - W)]$ is singular, thus the ICAR variate $\phi$
140+
is an improper prior distribution, with joint distribution:
131141

132142
$$\phi \sim N(0, [\tau \, (D - W)]^{-1}).$$
133143

134-
resulting in a singular precision matrix and an improper prior distribution.
144+
While this ICAR model is non-generating in that it cannot be used as a model for the data,
145+
it can be used as a prior as part of a hierarchical model, which is the role it plays in
146+
the BYM model.
135147

136148
The corresponding conditional distribution specification is:
137149

@@ -152,16 +164,10 @@ where $NC$ is the number of components in the graph over all areal subregions de
152164
$NC == 1$ when the areal graph is fully connected, i.e., every subregion can be reached from every other subregion
153165
via a sequence of neighbors.
154166

155-
The above conditions for the ICAR model produce an improper distribution
156-
because setting $\alpha = 1$ creates a singular matrix $(D - W)$, see Besag and Kooperberg 1995.
157-
Furthermore, the joint distribution is non-identifiable;
167+
From the pairwise difference formulation, we see that the joint distribution is non-identifiable;
158168
adding any constant to all of the elements of $\phi$ leaves the joint distribution unchanged.
159169
Adding the constraint $\sum_{i} {\phi}_i = 0$ resolves this problem.
160170

161-
While this ICAR model is non-generating in that it cannot be used as a model for the data,
162-
it can be used as a prior as part of a hierarchical model, which is the role it plays in
163-
the BYM model.
164-
165171
### Derivation of the _Pairwise Difference_ Formula
166172

167173
The jump from the joint distribution to the pairwise difference requires

0 commit comments

Comments
 (0)