various proposals in the literature for estimating CMI, which we summarize here:
estimating :math:`P(y|x)` and :math:`P(y|x,z)`, which can be used as plug-in estimates
to the equation for CMI.

:mod:`pywhy_stats.independence.fisherz` Partial (Pearson) Correlation
---------------------------------------------------------------------
Partial correlation based on the Pearson correlation is equivalent to CMI in the setting
of normally distributed data. Computing partial correlation is fast and efficient and
thus attractive to use. However, this **relies on the assumption that the variables are Gaussian**,
which may be unrealistic in certain datasets.

.. currentmodule:: pywhy_stats.independence
.. autosummary::
   :toctree: generated/

   fisherz

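For intuition, the test can be sketched numerically with NumPy and SciPy. This is an
illustrative helper (``partial_corr_fisherz`` is not part of pywhy_stats): residualize
:math:`X` and :math:`Y` on :math:`Z`, correlate the residuals, and apply the Fisher
z-transform to obtain a p-value.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 2000
z = rng.normal(size=n)
x = z + rng.normal(size=n)   # X depends on Z
y = z + rng.normal(size=n)   # Y depends on Z, but X is independent of Y given Z

def partial_corr_fisherz(x, y, z):
    """Fisher z-test of the partial correlation of x and y given z."""
    Z = np.column_stack([np.ones_like(z), z])
    # Residualize x and y on z via least squares.
    rx = x - Z @ np.linalg.lstsq(Z, x, rcond=None)[0]
    ry = y - Z @ np.linalg.lstsq(Z, y, rcond=None)[0]
    r = np.corrcoef(rx, ry)[0, 1]
    # Fisher z-transform; the extra -1 accounts for one conditioning variable.
    stat = np.sqrt(len(x) - 3 - 1) * np.arctanh(r)
    return r, 2 * stats.norm.sf(abs(stat))

r_partial, pvalue = partial_corr_fisherz(x, y, z)
r_marginal = np.corrcoef(x, y)[0, 1]
```

Here the marginal correlation of ``x`` and ``y`` is large (both are driven by ``z``),
while the partial correlation is near zero, reflecting conditional independence.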
:mod:`pywhy_stats.independence.power_divergence` Discrete, Categorical and Binary Data
--------------------------------------------------------------------------------------
If one has discrete data, then the test to use is based on Chi-square tests. The :math:`G^2`
class of tests will construct a contingency table based on the number of levels across
each discrete variable. An exponential amount of data is needed for increasing levels
for a discrete variable.

.. autosummary::
   :toctree: generated/

   power_divergence

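For intuition, the :math:`G^2` (log-likelihood ratio) statistic for a single contingency
table can be computed via SciPy's power-divergence machinery; this is an illustrative
sketch, not the pywhy_stats implementation (a conditional version would sum such
statistics over the strata of :math:`Z`).

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical 2x2 table of counts for two binary variables.
table = np.array([[30, 10],
                  [10, 30]])

# lambda_="log-likelihood" selects the G^2 statistic from the
# Cressie-Read power-divergence family; correction=False disables
# the Yates continuity correction so the raw statistic is returned.
g2, pvalue, dof, expected = chi2_contingency(
    table, lambda_="log-likelihood", correction=False
)
```

For this strongly dependent table the statistic is large and the p-value is tiny,
rejecting independence.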
:mod:`pywhy_stats.independence.kci` Kernel-Approaches
-----------------------------------------------------
Kernel independence tests are statistical methods used to determine if two random variables are independent or
conditionally independent. One such test is the Hilbert-Schmidt Independence Criterion (HSIC), which examines the
independence between two random variables, X and Y. HSIC employs kernel methods and, more specifically, it computes
Kernel-based tests are attractive for many applications, since they are semi-parametric and use kernel-based ideas
that have been shown to be robust in the machine-learning field. For more information, see :footcite:`Zhang2011`.

.. currentmodule:: pywhy_stats.independence
.. autosummary::
   :toctree: generated/

   kci

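The (biased) empirical HSIC statistic is simple enough to sketch directly with NumPy;
this illustration with RBF kernels is not the pywhy_stats ``kci`` implementation, but it
shows why kernel tests detect nonlinear dependence that correlation misses.

```python
import numpy as np

def rbf_gram(a, sigma=1.0):
    """Gram matrix K_ij = exp(-(a_i - a_j)^2 / (2 sigma^2)) for 1-D data."""
    d2 = (a[:, None] - a[None, :]) ** 2
    return np.exp(-d2 / (2.0 * sigma**2))

def hsic(x, y, sigma=1.0):
    """Biased empirical HSIC: trace(K H L H) / n^2, with H the centering matrix."""
    n = len(x)
    K, L = rbf_gram(x, sigma), rbf_gram(y, sigma)
    H = np.eye(n) - np.ones((n, n)) / n
    return float(np.trace(K @ H @ L @ H)) / n**2

rng = np.random.default_rng(0)
x = rng.normal(size=300)
y_dep = x**2 + 0.1 * rng.normal(size=300)   # nonlinearly dependent on x
y_ind = rng.normal(size=300)                 # independent of x

# HSIC is markedly larger for the dependent pair, even though the Pearson
# correlation of x and x**2 is near zero; in practice a permutation test
# over shuffled y calibrates the statistic into a p-value.
```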
Classifier-based Approaches
---------------------------
Another suite of approaches that rely on permutation testing is the classifier-based approach.
helps maintain dependence between (X, Z) and (Y, Z) (if it exists), but generates a
conditionally independent dataset.

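A minimal sketch of the idea, in its marginal (unconditional) form, using scikit-learn
(assumed available; the conditional variant would instead permute ``x`` within groups of
similar ``z``): train a classifier to distinguish samples of the joint distribution from
samples where ``x`` has been permuted, and treat held-out accuracy above chance as
evidence of dependence.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=n)
y = x + 0.5 * rng.normal(size=n)             # Y depends on X

# Class 1: samples from the joint P(X, Y).  Class 0: X permuted, which
# breaks the dependence and simulates the product P(X)P(Y).
x_perm = rng.permutation(x)
feats = np.column_stack([
    np.concatenate([x, x_perm]),
    np.concatenate([y, y]),
    np.concatenate([x * y, x_perm * y]),      # interaction feature
])
labels = np.concatenate([np.ones(n), np.zeros(n)])

Xtr, Xte, ytr, yte = train_test_split(feats, labels, random_state=0)
acc = LogisticRegression(max_iter=1000).fit(Xtr, ytr).score(Xte, yte)
# Held-out accuracy well above 0.5 indicates dependence; a permutation
# test over class labels calibrates it into a p-value.
```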
=========================================
Conditional Distribution 2-Sample Testing
=========================================

.. currentmodule:: pywhy_stats

indices of the distribution, one can convert the CD test:
:math:`P_{i=j}(y|x) =? P_{i=k}(y|x)` into the CI test :math:`P(y|x,i) = P(y|x)`, which can
be tested with the Chi-square CI tests.

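A concrete sketch of this conversion with NumPy and SciPy (illustrative code, not a
pywhy_stats API): pool the two groups, add the group index :math:`i` as a variable, and
test :math:`Y \perp i \mid X` by summing :math:`G^2` statistics over the strata of
:math:`X`.

```python
import numpy as np
from scipy.stats import chi2, chi2_contingency

rng = np.random.default_rng(1)
n = 2000

def sample(n, flip):
    """Binary X; Y is X flipped with probability `flip` (defines P(y|x))."""
    x = rng.integers(0, 2, size=n)
    y = (x + (rng.random(n) < flip)) % 2
    return x, y

# Two hypothetical groups whose conditionals P(y|x) differ.
x0, y0 = sample(n, flip=0.30)    # group i = 0
x1, y1 = sample(n, flip=0.45)    # group i = 1
x = np.concatenate([x0, x1])
y = np.concatenate([y0, y1])
i = np.repeat([0, 1], n)

# Test Y _||_ i | X by summing G^2 over the strata of X.
g2, dof = 0.0, 0
for xv in (0, 1):
    m = x == xv
    table = np.array([[np.sum((y[m] == a) & (i[m] == b)) for b in (0, 1)]
                      for a in (0, 1)])
    stat, _, d, _ = chi2_contingency(table, lambda_="log-likelihood",
                                     correction=False)
    g2 += stat
    dof += d
pvalue = chi2.sf(g2, dof)   # small p-value: the two conditionals differ
```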
==========
References
==========
.. footbibliography::