Commit 7bfb8b9

Short expansion of conditional mutual information (#387)
* wip...
* Reorganize into separate file for the SECMI test
* tests, docs
* changelog + version
* Slightly reorganize tests
* fix tests
* docs
* no need to store shuffles
* examples
* fix deprecated syntax
* reproducible tests
* Add cross-references
* documentation example for SECMI
* Actually show docstrings
* docstring
* docstring
* add relevant imports
* reproducible tests
* better description
* Fix implementation for mu < 0
* CI badge for main branch only
* Update changelog and version
* Increase sample size to have enough points to get consistent results
* better tests for secmi
* a comment explaining the marginal selection
* Add min/max variables for SECMI
* Use SECMITest in oce tests
1 parent 1054b8f commit 7bfb8b9

22 files changed, +479 -97 lines changed

Project.toml

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@ name = "Associations"
 uuid = "614afb3a-e278-4863-8805-9959372b9ec2"
 authors = ["Kristian Agasøster Haaga <[email protected]>", "Tor Einar Møller <[email protected]>", "George Datseris <[email protected]>"]
 repo = "https://github.com/kahaaga/Associations.jl.git"
-version = "4.3.0"
+version = "4.4.0"
 
 [deps]
 Accessors = "7d9f7c33-5ae7-4f3b-8dc6-eff91059b697"

README.md

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 # Associations
 
-[![CI](https://github.com/juliadynamics/Associations.jl/workflows/CI/badge.svg)](https://github.com/JuliaDynamics/Associations.jl/actions)
+[![CI (main)](https://github.com/juliadynamics/Associations.jl/workflows/CI/badge.svg?branch=main)](https://github.com/JuliaDynamics/Associations.jl/actions)
 [![](https://img.shields.io/badge/docs-latest_tagged-blue.svg)](https://juliadynamics.github.io/Associations.jl/stable/)
 [![](https://img.shields.io/badge/docs-dev_(main)-blue.svg)](https://juliadynamics.github.io/Associations.jl/dev/)
 [![codecov](https://codecov.io/gh/JuliaDynamics/Associations.jl/branch/main/graph/badge.svg?token=0b71n6x6AP)](https://codecov.io/gh/JuliaDynamics/Associations.jl)

changelog.md

Lines changed: 5 additions & 0 deletions
@@ -2,6 +2,11 @@
 
 From version v4.0 onwards, this package has been renamed to Associations.jl.
 
+# 4.4
+
+- New association measure: `SECMI` (`ShortExpansionConditionalMutualInformation`)
+- New independence test: `SECMITest`, which is based on `SECMI`.
+
 # 4.3
 
 - Compatibility with StateSpaceSets.jl v2.X

docs/refs.bib

Lines changed: 10 additions & 0 deletions
@@ -1333,4 +1333,14 @@ @article{Azadkia2021
 pages={3070--3102},
 year={2021},
 publisher={Institute of Mathematical Statistics}
+}
+
+@article{Kubkowski2021,
+title={How to gain on power: novel conditional independence tests based on short expansion of conditional mutual information},
+author={Kubkowski, Mariusz and Mielniczuk, Jan and Teisseyre, Pawe{\l}},
+journal={Journal of Machine Learning Research},
+volume={22},
+number={62},
+pages={1--57},
+year={2021}
 }
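
For orientation, the measure introduced by this commit takes its name from the paper cited in the new `Kubkowski2021` entry. The block below is a sketch of the definition as I understand it from that paper, not a quotation from the package docstring. It assumes a conditioning set ``Z = (Z_1, \ldots, Z_m)``, writes ``I`` for (conditional) mutual information, and writes ``II`` for interaction information; the notation is my own choice.

```latex
\[
\begin{aligned}
\operatorname{SECMI}(X; Y \mid Z)
  &= I(X; Y) + \sum_{k=1}^{m} II(X; Z_k; Y) \\
  &= (1 - m)\, I(X; Y) + \sum_{k=1}^{m} I(X; Y \mid Z_k),
\end{aligned}
\qquad\text{where}\qquad
II(X; Z_k; Y) = I(X; Y \mid Z_k) - I(X; Y).
\]
```

The second line follows by substituting the interaction-information identity into the first. With a single conditioning variable (``m = 1``) the expression reduces to the ordinary conditional mutual information ``I(X; Y \mid Z_1)``.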

docs/src/associations.md

Lines changed: 6 additions & 0 deletions
@@ -107,6 +107,12 @@ EmbeddingTE
 PartialMutualInformation
 ```
 
+### Short expansion of conditional mutual information
+
+```@docs
+ShortExpansionConditionalMutualInformation
+```
+
 ## [Correlation measures](@id correlation_api)
 
 ```@docs

docs/src/examples/examples_associations.md

Lines changed: 18 additions & 0 deletions
@@ -1093,6 +1093,24 @@ est = MIDecomposition(CMIShannon(base = 2), KSG1(k = 10))
 association(est, x, z, y)
 ```
 
+## [`ShortExpansionConditionalMutualInformation`](@ref)
+
+### [[`JointProbabilities`](@ref) with [`CodifyVariables`](@ref) and [`ValueBinning`](@ref)](@id example_ShortExpansionConditionalMutualInformation_JointProbabilities_CodifyVariables_ValueBinning)
+
+```@example
+using Associations
+using Test
+using Random; rng = Xoshiro(1234)
+n = 20
+x = rand(rng, n)
+y = randn(rng, n) .+ x .^ 2
+z = randn(rng, n) .* y
+
+# An estimator for estimating the SECMI measure
+est = JointProbabilities(SECMI(base = 2), CodifyVariables(ValueBinning(3)))
+association(est, x, z, y)
+```
+
 ### [[`EntropyDecomposition`](@ref) + [`Kraskov`](@ref)](@id example_CMIShannon_EntropyDecomposition_Kraskov)
 
 Any [`DifferentialInfoEstimator`](@ref) can also be used to compute conditional
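
To make the quantity estimated in the `SECMI` example of this hunk more concrete, here is a minimal plug-in sketch in plain Julia (Base only). It is not the Associations.jl implementation. The helper names (`entropy_bits`, `joint_entropy`, `mutual_info`, `cond_mutual_info`, `secmi_plugin`) are hypothetical, the inputs are assumed to be already discretized, and the formula `(1 - m) I(X; Y) + Σₖ I(X; Y | Zₖ)` is the short-expansion form I am assuming from Kubkowski2021.

```julia
# Plug-in (frequency-count) estimates for discrete data, using only Base Julia.
# All function names here are illustrative, not part of Associations.jl.

# Shannon entropy (in bits) of the empirical distribution of `symbols`.
function entropy_bits(symbols)
    counts = Dict{eltype(symbols), Int}()
    for s in symbols
        counts[s] = get(counts, s, 0) + 1
    end
    n = length(symbols)
    return -sum(c / n * log2(c / n) for c in values(counts))
end

# Joint entropy of several equally long variables: zip them into tuples.
joint_entropy(vars...) = entropy_bits(collect(zip(vars...)))

# I(X; Y) = H(X) + H(Y) - H(X, Y)
mutual_info(x, y) = entropy_bits(x) + entropy_bits(y) - joint_entropy(x, y)

# I(X; Y | Z) = H(X, Z) + H(Y, Z) - H(X, Y, Z) - H(Z)
cond_mutual_info(x, y, z) =
    joint_entropy(x, z) + joint_entropy(y, z) - joint_entropy(x, y, z) - entropy_bits(z)

# Assumed short expansion: (1 - m) I(X; Y) + Σₖ I(X; Y | Zₖ),
# where `zs` holds the m conditioning variables, one vector each.
function secmi_plugin(x, y, zs::AbstractVector)
    m = length(zs)
    return (1 - m) * mutual_info(x, y) +
           sum(z -> cond_mutual_info(x, y, z), zs; init = 0.0)
end

# Toy data: y copies x, and z is balanced against x in this sample.
x = [0, 0, 1, 1, 0, 0, 1, 1]
y = copy(x)
z = [0, 1, 0, 1, 0, 1, 0, 1]

secmi_xy_given_z = secmi_plugin(x, y, [z])   # m = 1, so this equals I(X; Y | Z)
```

With one conditioning variable the sketch returns the same value as `cond_mutual_info(x, y, z)`; the short expansion only differs from the full conditional mutual information when several conditioning variables are passed in `zs`.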

docs/src/examples/examples_independence.md

Lines changed: 51 additions & 1 deletion
@@ -469,4 +469,54 @@ connecting `x` and `z`.)
 independence(test, x, z, y)
 ```
 
-The test verifies our expectation.
+The test verifies our expectation.
+## [[`SECMITest`](@ref)](@id example_SECMITEST)
+
+## [[`JointProbabilities`](@ref) estimation on numeric data](@id example_SECMITEST_JointProbabilities_CodifyVariables_ValueBinning)
+
+```@example example_SECMITEst
+using Associations
+using Test
+using Random; rng = Xoshiro(1234)
+n = 25
+x = rand(rng, n)
+y = randn(rng, n) .+ x .^ 2
+z = randn(rng, n) .* y
+
+# An estimator for estimating the SECMI measure
+est = JointProbabilities(SECMI(base = 2), CodifyVariables(ValueBinning(3)))
+test = SECMITest(est; nshuffles = 19)
+```
+
+When analyzing ``SECMI(x, y | z)``, the expectation is to reject the null hypothesis (independence), since `x` and `y` are connected, regardless of the effect of `z`.
+
+```@example example_SECMITEst
+independence(test, x, y, z)
+```
+
+We can detect this association, even for `n = 25`! When analyzing ``SECMI(x, z | y)``, we
+expect that we can't reject the null (independence), precisely since `x` and `z` are *not*
+connected when "conditioning away" `y`.
+
+```@example example_SECMITEst
+independence(test, x, z, y)
+```
+
+## [[`JointProbabilities`](@ref) estimation on categorical data](@id example_SECMITEST_JointProbabilities_CodifyVariables_UniqueElements)
+
+Note that this also works for categorical variables. Just use [`UniqueElements`](@ref) to
+discretize!
+
+```@example example_SECMITest_categorical
+using Associations
+using Test
+using Random; rng = Xoshiro(1234)
+n = 24
+x = rand(rng, ["vegetables", "candy"], n)
+y = [xᵢ == "candy" && rand(rng) > 0.3 ? "yummy" : "yuck" for xᵢ in x]
+z = [yᵢ == "yummy" && rand(rng) > 0.6 ? "grown-up" : "child" for yᵢ in y]
+d = CodifyVariables(UniqueElements())
+est = JointProbabilities(SECMI(base = 2), d)
+
+independence(SECMITest(est; nshuffles = 19), x, z, y)
+```
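
The examples in this hunk pass `nshuffles = 19` to `SECMITest`. How the package uses those shuffles internally is not visible in this diff, so the sketch below is deliberately *not* `SECMITest`'s algorithm. It is a generic permutation test in plain Julia, included only to show the general idea of comparing an observed statistic against shuffled surrogates. The names `permutation_pvalue` and `agreement` are hypothetical, and the toy statistic ignores `z`. Note also that shuffling `x` unconditionally breaks its link to both `y` and `z`, which is a stronger null than conditional independence.

```julia
using Random

# Generic permutation p-value (illustrative, not the SECMITest procedure):
# compare the observed statistic with values obtained after shuffling `x`.
function permutation_pvalue(stat, x, y, z; nshuffles = 19, rng = Random.default_rng())
    observed = stat(x, y, z)
    # Number of surrogates whose statistic is at least as large as the observed one.
    n_extreme = count(_ -> stat(shuffle(rng, x), y, z) >= observed, 1:nshuffles)
    # The +1 terms count the observed value itself, so the p-value is never 0.
    return (n_extreme + 1) / (nshuffles + 1)
end

# Toy statistic: number of positions where x and y agree (z is unused here).
agreement(x, y, z) = count(x .== y)

x = collect(1:20)
y = copy(x)            # perfectly associated with x
z = zeros(Int, 20)     # constant conditioning variable

p = permutation_pvalue(agreement, x, y, z; nshuffles = 19, rng = Xoshiro(1234))
```

With 19 surrogates the smallest p-value this scheme can return is `1 / 20 = 0.05`, which is one common reason for choosing that number in permutation-style tests.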

docs/src/independence.md

Lines changed: 7 additions & 0 deletions
@@ -50,3 +50,10 @@ JDDTestResult
 CorrTest
 CorrTestResult
 ```
+
+## [`SECMITest`](@ref)
+
+```@docs
+SECMITest
+SECMITestResult
+```
