
Commit 4c43c06

Merge pull request #71 from MarkDana/cit_document_update
Updated some documents in compatible with the CIT class
2 parents 64e58b1 + e531b5a commit 4c43c06

6 files changed: +153 -38 lines changed
docs/source/independence_tests_index/chisq.rst

Lines changed: 20 additions & 6 deletions
@@ -5,25 +5,39 @@ Chi-Square test
 
 Perform an independence test on discrete variables using Chi-Square test.
 
-(We have updated the independence test class and the usage example hasn't been updated yet. For new class, please refer to `TestCIT.py <https://github.com/cmu-phil/causal-learn/blob/main/tests/TestCIT.py>`_ or `TestCIT_KCI.py <https://github.com/cmu-phil/causal-learn/blob/main/tests/TestCIT_KCI.py>`_.)
-
 Usage
 --------
 .. code-block:: python
 
+    from causallearn.utils.cit import CIT
+    chisq_obj = CIT(data, "chisq") # construct a CIT instance with data and method name
+    pValue = chisq_obj(X, Y, S)
+
+Please be kindly informed that we have refactored the independence tests from functions to classes since the release `v0.1.2.8 <https://github.com/cmu-phil/causal-learn/releases/tag/0.1.2.8>`_. Speed gain and a more flexible parameters specification are enabled.
+
+For users, you may need to adjust your codes accordingly. Specifically, if you are
+
++ running a constraint-based algorithm from end to end: then you don't need to change anything. Old codes are still compatible. For example,
+.. code-block:: python
+
+    from causallearn.search.ConstraintBased.PC import pc
     from causallearn.utils.cit import chisq
-    p = chisq(data, X, Y, conditioning_set)
+    cg = pc(data, 0.05, chisq)
+
++ explicitly calculating the p-value of a test: then you need to declare the :code:`chisq_obj` and then call it as above, instead of using :code:`chisq(data, X, Y, condition_set)` as before. Note that now :code:`causallearn.utils.cit.chisq` is a string :code:`"chisq"`, instead of a function.
+
+Please see `CIT.py <https://github.com/cmu-phil/causal-learn/blob/main/causallearn/utils/cit.py>`_
+for more details on the implementation of the (conditional) independent tests.
 
 
 Parameters
 ----------------
 **data**: numpy.ndarray, shape (n_samples, n_features). Data, where n_samples is the number of samples
 and n_features is the number of features.
 
-**X, Y and condition_set**: column indices of data.
+**method**: string, "chisq".
 
-**G_sq**: True means using G-Square test;
-False means using Chi-Square test.
+**kwargs**: e.g., :code:`cache_path`. See :ref:`Advanced Usages <Advanced Usages>`.
 
 Returns
 -------------
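
The chisq.rst change describes moving from a function called as `chisq(data, X, Y, condition_set)` to an object that is built once from the data and then called as `obj(X, Y, S)`. A minimal sketch of that pattern follows. `ToyCIT` and `fake_test` are hypothetical names made up for illustration; this is not causal-learn's `CIT` class, and the in-memory cache is only an assumption about where a speed gain could come from.

```python
from typing import Callable, Dict, FrozenSet, Sequence, Tuple


class ToyCIT:
    """Illustrative only: bind data and a test once, then answer (X, Y, S) queries."""

    def __init__(self, data, test_fn: Callable[..., float]):
        # test_fn is a hypothetical old-style function: test_fn(data, X, Y, S) -> p-value
        self.data = data
        self.test_fn = test_fn
        self._cache: Dict[Tuple[int, int, FrozenSet[int]], float] = {}

    def __call__(self, X: int, Y: int, S: Sequence[int] = ()) -> float:
        # Independence of X and Y given S does not depend on the order of
        # X and Y or on the order of S, so normalise the key before lookup.
        key = (min(X, Y), max(X, Y), frozenset(S))
        if key not in self._cache:
            self._cache[key] = self.test_fn(
                self.data, key[0], key[1], tuple(sorted(S))
            )
        return self._cache[key]


if __name__ == "__main__":
    def fake_test(data, x, y, s):
        # Placeholder for a real statistical test; always returns 0.5.
        return 0.5

    toy = ToyCIT([[0, 1, 2]], fake_test)
    print(toy(0, 1, [2]), toy(1, 0, [2]))  # second call is served from the cache
```

Binding the data at construction time is what lets repeated queries share state; a bare function would have to recompute or be handed that state on every call.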

docs/source/independence_tests_index/fisherz.rst

Lines changed: 21 additions & 5 deletions
@@ -5,24 +5,40 @@ Fisher-z test
 
 Perform an independence test using Fisher-z's test [1]_. This test is optimal for linear-Gaussian data.
 
-(We have updated the independence test class and the usage example hasn't been updated yet. For new class, please refer to `TestCIT.py <https://github.com/cmu-phil/causal-learn/blob/main/tests/TestCIT.py>`_ or `TestCIT_KCI.py <https://github.com/cmu-phil/causal-learn/blob/main/tests/TestCIT_KCI.py>`_.)
-
 
 Usage
 --------
 .. code-block:: python
 
+    from causallearn.utils.cit import CIT
+    fisherz_obj = CIT(data, "fisherz") # construct a CIT instance with data and method name
+    pValue = fisherz_obj(X, Y, S)
+
+Please be kindly informed that we have refactored the independence tests from functions to classes since the release `v0.1.2.8 <https://github.com/cmu-phil/causal-learn/releases/tag/0.1.2.8>`_. Speed gain and a more flexible parameters specification are enabled.
+
+For users, you may need to adjust your codes accordingly. Specifically,
+
++ If you are running a constraint-based algorithm from end to end: then you don't need to change anything. Old codes are still compatible. For example,
+.. code-block:: python
+
+    from causallearn.search.ConstraintBased.PC import pc
     from causallearn.utils.cit import fisherz
-    p = fisherz(data, X, Y, condition_set, correlation_matrix)
+    cg = pc(data, 0.05, fisherz)
+
++ If you are explicitly calculating the p-value of a test: then you need to declare the :code:`fisherz_obj` and then call it as above, instead of using :code:`fisherz(data, X, Y, condition_set)` as before. Note that now :code:`causallearn.utils.cit.fisherz` is a string :code:`"fisherz"`, instead of a function.
+
+
+Please see `CIT.py <https://github.com/cmu-phil/causal-learn/blob/main/causallearn/utils/cit.py>`_
+for more details on the implementation of the (conditional) independent tests.
 
 Parameters
 ------------
 **data**: numpy.ndarray, shape (n_samples, n_features). Data, where n_samples is the number of samples
 and n_features is the number of features.
 
-**X, Y and condition_set**: column indices of data.
+**method**: string, "fisherz".
 
-**correlation_matrix**: correlation matrix; None means without the parameter of correlation matrix.
+**kwargs**: e.g., :code:`cache_path`. See :ref:`Advanced Usages <Advanced Usages>`.
 
 Returns
 -------------
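
For readers who want to see what a Fisher-z test computes, here is a textbook sketch using only the standard library. It assumes the usual formulation (partial correlation, Fisher's z-transform, two-sided normal p-value); the function names are hypothetical and this is not the code in causal-learn's `cit.py`. Data is passed as a list of columns rather than a NumPy array to keep the example dependency-free.

```python
import math


def pearson(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(cols, x, y, S):
    """Partial correlation of columns x and y given columns S (recursive formula)."""
    if not S:
        return pearson(cols[x], cols[y])
    z, rest = S[0], S[1:]
    r_xy = partial_corr(cols, x, y, rest)
    r_xz = partial_corr(cols, x, z, rest)
    r_yz = partial_corr(cols, y, z, rest)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def fisherz_pvalue(r, n, k):
    """Two-sided p-value for correlation r, n samples, k conditioning variables."""
    r = max(min(r, 0.999999), -0.999999)  # keep the log finite
    z = 0.5 * math.log((1 + r) / (1 - r))
    stat = math.sqrt(n - k - 3) * abs(z)
    # 2 * (1 - Phi(stat)) written with the complementary error function
    return math.erfc(stat / math.sqrt(2))


def fisherz_test(cols, X, Y, S=()):
    """p-value for 'column X independent of column Y given columns S'."""
    r = partial_corr(cols, X, Y, list(S))
    return fisherz_pvalue(r, len(cols[0]), len(S))
```

On data generated from a chain x -> z -> y, the marginal test of x and y should give a very small p-value, while conditioning on z should give a much larger one.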

docs/source/independence_tests_index/gsq.rst

Lines changed: 20 additions & 6 deletions
@@ -5,24 +5,38 @@ G-Square test
 
 Perform an independence test using G-Square test [1]_. This test is based on the log likelihood ratio test.
 
-(We have updated the independence test class and the usage example hasn't been updated yet. For new class, please refer to `TestCIT.py <https://github.com/cmu-phil/causal-learn/blob/main/tests/TestCIT.py>`_ or `TestCIT_KCI.py <https://github.com/cmu-phil/causal-learn/blob/main/tests/TestCIT_KCI.py>`_.)
-
-
 Usage
 --------
 .. code-block:: python
 
+    from causallearn.utils.cit import CIT
+    gsq_obj = CIT(data, "gsq") # construct a CIT instance with data and method name
+    pValue = gsq_obj(X, Y, S)
+
+Please be kindly informed that we have refactored the independence tests from functions to classes since the release `v0.1.2.8 <https://github.com/cmu-phil/causal-learn/releases/tag/0.1.2.8>`_. Speed gain and a more flexible parameters specification are enabled.
+
+For users, you may need to adjust your codes accordingly. Specifically, if you are
+
++ running a constraint-based algorithm from end to end: then you don't need to change anything. Old codes are still compatible. For example,
+.. code-block:: python
+
+    from causallearn.search.ConstraintBased.PC import pc
     from causallearn.utils.cit import gsq
-    p = gsq(data, X, Y, conditioning_set)
+    cg = pc(data, 0.05, gsq)
+
++ explicitly calculating the p-value of a test: then you need to declare the :code:`gsq_obj` and then call it as above, instead of using :code:`gsq(data, X, Y, condition_set)` as before. Note that now :code:`causallearn.utils.cit.gsq` is a string :code:`"gsq"`, instead of a function.
+
+Please see `CIT.py <https://github.com/cmu-phil/causal-learn/blob/main/causallearn/utils/cit.py>`_
+for more details on the implementation of the (conditional) independent tests.
 
 Parameters
 -------------
 **data**: numpy.ndarray, shape (n_samples, n_features). Data, where n_samples is the number of samples
 and n_features is the number of features.
 
-**X, Y and condition_set**: column indices of data.
+**method**: string, "gsq".
 
-**G_sq**: True means using G-Square test; False means using Chi-Square test.
+**kwargs**: e.g., :code:`cache_path`. See :ref:`Advanced Usages <Advanced Usages>`.
 
 Returns
 ---------------
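
The removed `G_sq` flag switched between the Chi-Square and G-Square statistics, which differ only in how observed and expected counts are compared. The sketch below computes both for a single two-way contingency table (the unconditional case), assuming the standard textbook definitions. The function names are hypothetical and this is not causal-learn's implementation; turning a statistic into a p-value would additionally need the chi-square survival function for the given degrees of freedom, which the standard library does not provide.

```python
import math


def expected_counts(table):
    """Expected cell counts under independence: row_total * col_total / n."""
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    n = sum(row_tot)
    return [[rt * ct / n for ct in col_tot] for rt in row_tot]


def chisq_stat(table):
    """Pearson chi-square statistic: sum of (O - E)^2 / E."""
    exp = expected_counts(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, exp)
        for o, e in zip(obs_row, exp_row)
        if e > 0
    )


def gsq_stat(table):
    """G statistic (log-likelihood ratio): 2 * sum of O * ln(O / E)."""
    exp = expected_counts(table)
    return 2.0 * sum(
        o * math.log(o / e)
        for obs_row, exp_row in zip(table, exp)
        for o, e in zip(obs_row, exp_row)
        if o > 0  # cells with zero observations contribute nothing
    )


def dof(table):
    """Degrees of freedom of a two-way table: (rows - 1) * (cols - 1)."""
    return (len(table) - 1) * (len(table[0]) - 1)
```

For a table such as `[[30, 10], [10, 30]]` the two statistics are close but not equal, which is the usual behaviour when counts are reasonably large.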

docs/source/independence_tests_index/kci.rst

Lines changed: 41 additions & 13 deletions
@@ -7,37 +7,65 @@ Kernel-based conditional independence (KCI) test and independence test [1]_.
 To test if x and y are conditionally or unconditionally independent on Z. For unconditional independence tests,
 Z is set to the empty set.
 
-(We have updated the independence test class and the usage example hasn't been updated yet. For new class, please refer to `TestCIT.py <https://github.com/cmu-phil/causal-learn/blob/main/tests/TestCIT.py>`_ or `TestCIT_KCI.py <https://github.com/cmu-phil/causal-learn/blob/main/tests/TestCIT_KCI.py>`_.)
-
-
 Usage
 --------
 .. code-block:: python
 
+    from causallearn.utils.cit import CIT
+    kci_obj = CIT(data, "kci") # construct a CIT instance with data and method name
+    pValue = kci_obj(X, Y, S)
+
+The above code runs KCI with the default parameters. Or instead if you would like to specify some parameters of KCI, you may do it by e.g.,
+
+.. code-block:: python
+
+    kci_obj = CIT(data, "kci", kernelZ='Polynomial', approx=False, est_width='median', ...)
+
+See `KCI.py <https://github.com/cmu-phil/causal-learn/blob/main/causallearn/utils/KCI/KCI.py>`_
+for more details on the parameters options of the KCI tests.
+
+
+Please be kindly informed that we have refactored the independence tests from functions to classes since the release `v0.1.2.8 <https://github.com/cmu-phil/causal-learn/releases/tag/0.1.2.8>`_. Speed gain and a more flexible parameters specification are enabled.
+
+For users, you may need to adjust your codes accordingly. Specifically, if you are
+
++ running a constraint-based algorithm from end to end: then you don't need to change anything. Old codes are still compatible. For example,
+.. code-block:: python
+
+    from causallearn.search.ConstraintBased.PC import pc
     from causallearn.utils.cit import kci
-    p = kci(data, X, Y, condition_set, kernelX, kernelY, kernelZ, est_width, polyd, kwidthx, kwidthy, kwidthz)
+    cg = pc(data, 0.05, kci)
+
++ explicitly calculating the p-value of a test: then you need to declare the :code:`kci_obj` and then call it as above, instead of using :code:`kci(data, X, Y, condition_set)` as before. Note that now :code:`causallearn.utils.cit.kci` is a string :code:`"kci"`, instead of a function.
+
+Please see `CIT.py <https://github.com/cmu-phil/causal-learn/blob/main/causallearn/utils/cit.py>`_
+for more details on the implementation of the (conditional) independent tests.
 
 Parameters
--------------
+------------
 **data**: numpy.ndarray, shape (n_samples, n_features). Data, where n_samples is the number of samples
 and n_features is the number of features.
 
-**X, Y, and condition_set**: column indices of data. condition_set could be None.
+**method**: string, "kci".
+
+**kwargs**:
 
-**KernelX/Y/Z (condition_set)**: ['GaussianKernel', 'LinearKernel', 'PolynomialKernel'].
-(For 'PolynomialKernel', the default degree is 2. Currently, users can change it by setting the 'degree' of 'class PolynomialKernel()'.
++ Either for specifying parameters of KCI, including:
 
-**est_width**: set kernel width for Gaussian kernels.
+    **KernelX/Y/Z (condition_set)**: ['GaussianKernel', 'LinearKernel', 'PolynomialKernel']. (For 'PolynomialKernel', the default degree is 2. Currently, users can change it by setting the 'degree' of 'class PolynomialKernel()'.
+
+    **est_width**: set kernel width for Gaussian kernels.
 - 'empirical': set kernel width using empirical rules (default).
 - 'median': set kernel width using the median trick.
 
-**polyd**: polynomial kernel degrees (default=2).
+    **polyd**: polynomial kernel degrees (default=2).
+
+    **kwidthx/y/z**: kernel width for data x/y/z (standard deviation sigma).
 
-**kwidthx**: kernel width for data x (standard deviation sigma).
+    **and more**: aee `KCI.py <https://github.com/cmu-phil/causal-learn/blob/main/causallearn/utils/KCI/KCI.py>`_ for details.
 
-**kwidthy**: kernel width for data y (standard deviation sigma).
++ Or for advanced usages of CIT, e.g., :code:`cache_path`. See :ref:`Advanced Usages <Advanced Usages>`.
 
-**kwidthz**: kernel width for data z (standard deviation sigma).
 
 
 Returns
 -----------
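
The `est_width='median'` option refers to the median trick for choosing a Gaussian kernel width. The sketch below shows the generic form of that heuristic: take the median of all pairwise distances as sigma and plug it into a Gaussian kernel. It is an assumption that this matches the spirit of the option; the exact scaling used in causal-learn's `KCI.py` may differ, and the function names here are hypothetical.

```python
import math
from itertools import combinations
from statistics import median


def median_width(points):
    """Generic median heuristic: median of all pairwise Euclidean distances."""
    return median(math.dist(a, b) for a, b in combinations(points, 2))


def gaussian_kernel_matrix(points, width):
    """K[i][j] = exp(-||p_i - p_j||^2 / (2 * width^2))."""
    return [
        [math.exp(-math.dist(a, b) ** 2 / (2.0 * width ** 2)) for b in points]
        for a in points
    ]


if __name__ == "__main__":
    pts = [(0.0,), (1.0,), (3.0,)]
    sigma = median_width(pts)  # pairwise distances are 1, 3 and 2
    K = gaussian_kernel_matrix(pts, sigma)
    print(sigma, K[0][1])
```

Tying the width to the data's own distance scale avoids kernels that are nearly constant (width too large) or nearly diagonal (width too small).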

docs/source/independence_tests_index/mvfisherz.rst

Lines changed: 23 additions & 7 deletions
@@ -6,23 +6,39 @@ Missing-value Fisher-z test
 Perform a testwise-deletion Fisher-z independence test to data sets with missing values.
 With testwise-deletion, the test makes use of all data points that do not have missing values for the variables involved in the test.
 
-(We have updated the independence test class and the usage example hasn't been updated yet. For new class, please refer to `TestCIT.py <https://github.com/cmu-phil/causal-learn/blob/main/tests/TestCIT.py>`_ or `TestCIT_KCI.py <https://github.com/cmu-phil/causal-learn/blob/main/tests/TestCIT_KCI.py>`_.)
-
-
 Usage
 --------
 .. code-block:: python
 
+    from causallearn.utils.cit import CIT
+    mv_fisherz_obj = CIT(data_with_missingness, "mv_fisherz") # construct a CIT instance with data and method name
+    pValue = mv_fisherz_obj(X, Y, S)
+
+Please be kindly informed that we have refactored the independence tests from functions to classes since the release `v0.1.2.8 <https://github.com/cmu-phil/causal-learn/releases/tag/0.1.2.8>`_. Speed gain and a more flexible parameters specification are enabled.
+
+For users, you may need to adjust your codes accordingly. Specifically, if you are
+
++ running a constraint-based algorithm from end to end: then you don't need to change anything. Old codes are still compatible. For example,
+.. code-block:: python
+
+    from causallearn.search.ConstraintBased.PC import pc
     from causallearn.utils.cit import mv_fisherz
-    p = mv_fisherz(mvdata, X, Y, condition_set)
+    cg = pc(data_with_missingness, 0.05, mv_fisherz)
+
++ explicitly calculating the p-value of a test: then you need to declare the :code:`mv_fisherz_obj` and then call it as above, instead of using :code:`mv_fisherz(data, X, Y, condition_set)` as before. Note that now :code:`causallearn.utils.cit.mv_fisherz` is a string :code:`"mv_fisherz"`, instead of a function.
+
+Please see `CIT.py <https://github.com/cmu-phil/causal-learn/blob/main/causallearn/utils/cit.py>`_
+for more details on the implementation of the (conditional) independent tests.
 
 
 Parameters
----------------
-**mvdata**: numpy.ndarray, shape (n_samples, n_features). Data with missing value, where n_samples is the number of samples
+------------
+**data**: numpy.ndarray, shape (n_samples, n_features). Data, where n_samples is the number of samples
 and n_features is the number of features.
 
-**X, Y and condition_set**: column indices of data.
+**method**: string, "mv_fisherz".
+
+**kwargs**: e.g., :code:`cache_path`. See :ref:`Advanced Usages <Advanced Usages>`.
 
 Returns
 ----------------

docs/source/search_methods_index/Constraint-based causal discovery methods/PC.rst

Lines changed: 28 additions & 1 deletion
@@ -35,14 +35,41 @@ Usage
 
 Visualization using pydot is recommended. If specific label names are needed, please refer to this `usage example <https://github.com/cmu-phil/causal-learn/blob/main/tests/TestGraphVisualization.py>`_ (e.g., 'cg.draw_pydot_graph(labels=["A", "B", "C"])' or 'GraphUtils.to_pydot(cg.G, labels=["A", "B", "C"])').
 
++++++++++++++++
+Advanced Usages
++++++++++++++++
++ If you would like to specify parameters for the (conditional) independence test (if available), you may directly pass the parameters to the :code:`pc` call. E.g.,
+
+.. code-block:: python
+
+    from causallearn.search.ConstraintBased.PC import pc
+    from causallearn.utils.cit import kci
+    cg = pc(data, 0.05, kci, kernelZ='Polynomial', approx=False, est_width='median', ...)
+
++ If your graph is big and/or your independence test is slow (e.g., KCI), you may want to cache the p-value results to a local checkpoint. Then by reading values from this local checkpoint, no more repeated calculation will be wasted to resume from checkpoint / just finetune some PC parameters. This can be achieved by specifying :code:`cache_path`. E.g.,
+
+.. code-block:: python
+
+    citest_cache_file = "/my/path/to/citest_cache_dataname_kci.json" # .json file
+    cg1 = pc(data, 0.05, kci, cache_path=citest_cache_file) # after the long run
+
+    # just finetune uc_rule. p-values are reused, and thus cg2 is done in almost no time.
+    cg2 = pc(data, 0.05, kci, cache_path=citest_cache_file, uc_rule=1)
+..
+
+If :code:`cache_path` does not exist in your local file system, a new one will be created. Otherwise, the cache will be first loaded from the json file to the CIT class and used during the runtime. Note that 1) data hash and parameters hash will first be checked at loading to ensure consistency, and 2) during runtime, the cache will be saved to the local file every 30 seconds.
+
++ The above advanced usages also apply to other constraint-based methods, e.g., FCI and CDNOD.
+
+
 Parameters
 -------------------
 **data**: numpy.ndarray, shape (n_samples, n_features). Data, where n_samples is the number of samples
 and n_features is the number of features.
 
 **alpha**: desired significance level (float) in (0, 1). Default: 0.05.
 
-**indep_test**: Independence test method function. Default: 'fisherz'.
+**indep_test**: string, name of the independence test method. Default: 'fisherz'.
 - ":ref:`fisherz <Fisher-z test>`": Fisher's Z conditional independence test.
 - ":ref:`chisq <Chi-Square test>`": Chi-squared conditional independence test.
 - ":ref:`gsq <G-Square test>`": G-squared conditional independence test.
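
The `cache_path` behaviour described in the Advanced Usages text (create the file if absent, otherwise load it, and check a data hash before reuse) can be sketched as follows. This is an illustration under stated assumptions, not causal-learn's implementation: `PValueCache`, its JSON layout, and the key format are hypothetical, only the method name stands in for the "parameters hash", and saving is explicit here rather than on a 30-second timer.

```python
import hashlib
import json
import os


class PValueCache:
    """Illustrative JSON-backed p-value cache keyed by (X, Y, S)."""

    def __init__(self, data_rows, method, cache_path):
        self.cache_path = cache_path
        self.method = method
        # Hash of the data, so a cache built for other data is rejected.
        self.data_hash = hashlib.sha256(repr(data_rows).encode()).hexdigest()
        self.pvalues = {}
        if os.path.exists(cache_path):
            with open(cache_path) as f:
                saved = json.load(f)
            if saved["data_hash"] != self.data_hash or saved["method"] != method:
                raise ValueError("cache file does not match this data/method")
            self.pvalues = saved["pvalues"]

    @staticmethod
    def key(X, Y, S):
        """Order-insensitive string key, since JSON object keys must be strings."""
        x, y = sorted((X, Y))
        return f"{x};{y}|{'.'.join(str(s) for s in sorted(S))}"

    def get_or_compute(self, X, Y, S, compute):
        """Return the cached p-value, calling compute(X, Y, S) only on a miss."""
        k = self.key(X, Y, S)
        if k not in self.pvalues:
            self.pvalues[k] = compute(X, Y, S)
        return self.pvalues[k]

    def save(self):
        """Write the cache to disk (explicit here; the docs describe periodic saves)."""
        payload = {
            "data_hash": self.data_hash,
            "method": self.method,
            "pvalues": self.pvalues,
        }
        with open(self.cache_path, "w") as f:
            json.dump(payload, f)
```

A second run that points at the same file and the same data skips every test already recorded, which is why re-running with only a changed orientation rule can finish almost immediately.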
