
Commit 00a2602

Merge branch 'refs/heads/main' into feature/alchemy-store

2 parents: 1fe5899 + 1ba0272


51 files changed (+3138, -419 lines)

.github/workflows/ci.yaml

Lines changed: 5 additions & 0 deletions

```diff
@@ -27,6 +27,11 @@ jobs:
         with:
           submodules: recursive
 
+      - name: Install system dependencies
+        run: |
+          sudo apt-get update
+          sudo apt-get install -y libcairo2-dev pkg-config
+
       - name: Install Poetry
         run: pip install poetry==2.1.1
```

.pre-commit-config.yaml

Lines changed: 0 additions & 6 deletions

```diff
@@ -1,12 +1,6 @@
 # See https://pre-commit.com for more information
 # See https://pre-commit.com/hooks.html for more hooks
 repos:
-  - repo: https://github.com/pycqa/flake8
-    rev: "7.1.1"
-    hooks:
-      - id: flake8
-        additional_dependencies: [Flake8-pyproject]
-        # stages: [push]
 
   - repo: https://github.com/pre-commit/mirrors-mypy
     rev: "v1.13.0"
```
Lines changed: 136 additions & 0 deletions

# Quick-Start Template: A 3-Step Experimental Workflow

This directory provides a set of template configuration files for the `pysatl-experiment` framework. It is designed as a starting point for a robust, three-step comparison of statistical goodness-of-fit tests.

The goal of this workflow is to help you make a data-driven decision about which statistical test is best suited for your specific hypothesis and potential alternatives.

## The 3-Step Workflow

This template follows a complete research cycle. Each step answers a different question, and the steps are designed to be run in sequence.

---

### Step 1: Calibration
File: `1_calculate_critical_values.json`

**The Question:** What are the correct decision thresholds for my chosen tests?

**The Purpose:** Before you can use any statistical test, you must establish its baseline critical values. These values depend on your null hypothesis (the distribution you are testing for), the sample sizes, and the desired significance levels (alpha). This step calibrates your "measurement tools."

**The Command:**
```bash
poetry run experiment build-and-run 1_calculate_critical_values
```
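
Conceptually, calibration amounts to simulating each test statistic many times under the null hypothesis and taking an empirical quantile. Below is a minimal illustrative sketch of that idea for the KS criterion; it assumes `scipy` and is not the framework's actual implementation:

```python
# Illustrative sketch only; not the pysatl-experiment implementation.
import numpy as np
from scipy import stats

def critical_value_ks(sample_size: int, alpha: float, monte_carlo_count: int = 1000) -> float:
    """Estimate the KS critical value under a standard normal null by Monte Carlo."""
    rng = np.random.default_rng(seed=42)
    statistics = np.empty(monte_carlo_count)
    for i in range(monte_carlo_count):
        sample = rng.standard_normal(sample_size)  # draw a sample under H0
        statistics[i] = stats.kstest(sample, "norm").statistic  # KS distance to the null CDF
    # The critical value is the (1 - alpha) empirical quantile of the null distribution.
    return float(np.quantile(statistics, 1.0 - alpha))

print(critical_value_ks(sample_size=200, alpha=0.05))
```
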
---

### Step 2: Evaluation
File: `2_evaluate_power.json`

**The Question:** How effective are these tests at detecting the specific deviations I care about?

**The Purpose:** This is the core scientific evaluation. You measure the **statistical power** of each test against one or more alternative hypotheses. A test with high power reliably rejects the null hypothesis when it is in fact false. This step tells you which test is most effective for your specific problem.

*This experiment requires the critical values generated in Step 1.*

**The Command:**
```bash
poetry run experiment build-and-run 2_evaluate_power
```
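
Power is estimated as the fraction of Monte Carlo samples, drawn from the alternative, whose statistic exceeds the critical value from Step 1. A minimal sketch of that logic, again assuming `scipy` for illustration and mirroring the template's `NORMALGENERATOR` alternative with parameters `[0.5, 1]`:

```python
# Illustrative sketch only; not the pysatl-experiment implementation.
import numpy as np
from scipy import stats

def power_ks(critical_value: float, sample_size: int, monte_carlo_count: int = 1000) -> float:
    """Estimate KS power against N(0.5, 1) when the null hypothesis is N(0, 1)."""
    rng = np.random.default_rng(seed=7)
    rejections = 0
    for _ in range(monte_carlo_count):
        sample = rng.normal(loc=0.5, scale=1.0, size=sample_size)  # draw from the alternative
        statistic = stats.kstest(sample, "norm").statistic
        if statistic > critical_value:  # correct rejection of a false H0
            rejections += 1
    return rejections / monte_carlo_count  # rejection rate = estimated power
```
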
---

### Step 3: Performance Analysis
File: `3_measure_time_complexity.json`

**The Question:** What is the computational cost of using each test?

**The Purpose:** This is the practical, engineering evaluation. If two tests show similar power, the faster one may be preferable, especially in data-intensive applications. This experiment measures the execution time of each test, helping you understand the trade-offs between effectiveness and performance.

**The Command:**
```bash
poetry run experiment build-and-run 3_measure_time_complexity
```
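
In essence, this step times repeated evaluations of each statistic at every sample size. A rough standalone sketch using `time.perf_counter` (an assumption for illustration; the framework's own measurement is more controlled):

```python
# Illustrative sketch only; not the pysatl-experiment implementation.
import time

import numpy as np
from scipy import stats

def mean_runtime_ks(sample_size: int, repeats: int = 100) -> float:
    """Average wall-clock seconds for one KS statistic computation."""
    rng = np.random.default_rng(seed=0)
    samples = [rng.standard_normal(sample_size) for _ in range(repeats)]
    start = time.perf_counter()
    for sample in samples:
        stats.kstest(sample, "norm")  # the operation being timed
    return (time.perf_counter() - start) / repeats

for n in [200, 400, 600, 800, 1000]:  # the template's sample sizes
    print(n, mean_runtime_ks(n))
```
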
---

## How to Adapt This Template for Your Own Research

To use these files for your own experiment, copy this directory and modify the JSON configurations. The key is to keep your parameters consistent across the files.

#### 1. Define Your Null Hypothesis

In all three files (`1_...`, `2_...`, `3_...`), change the `hypothesis` field to match the distribution you want to test for.

**Example: Changing from `Normal` to `Weibull`**

Change

```json
"hypothesis": "normal"
```

to

```json
"hypothesis": "weibull"
```

#### 2. Choose the Criteria to Compare

In all three files, update the `criteria` list to include the tests you want to evaluate.

**Example: Comparing `Lilliefors` and `Chi-Squared`**

```json
"criteria": [
    {
        "criterion_code": "LILLIE",
        "parameters": []
    },
    {
        "criterion_code": "CHI2",
        "parameters": []
    }
],
```

#### 3. Define the Alternative Hypothesis

In `2_evaluate_power.json`, modify the `alternatives` array. This is the specific deviation you want your tests to be able to detect.

**Example: Checking whether tests for `Weibull` can detect an `Exponential` distribution**

```json
"alternatives": [
    {
        "generator_name": "EXPONENTIALGENERATOR",
        "parameters": [
            0.5
        ]
    }
]
```

#### 4. Adjust Experiment Parameters

You can also change `sample_sizes`, `monte_carlo_count`, and `significance_levels` to fit the scope of your research. Just ensure that `sample_sizes` and `significance_levels` are consistent between the `critical_value` and `power` experiments, as in the snippet below.
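
For reference, these are the fields in question as they appear in the template configurations; `sample_sizes` and `significance_levels` must match between the `critical_value` and `power` experiments:

```json
"monte_carlo_count": 1000,
"sample_sizes": [200, 400, 600, 800, 1000],
"significance_levels": [0.01, 0.05, 0.1]
```
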
By following this structured, three-step approach, you can systematically evaluate and select the statistical test best suited to your problem.

---

## How to Run This Example Workflow

The `pysatl-experiment` command-line tool looks for experiment configurations in a dedicated `.experiment` directory at the root of the project.

Therefore, to run these examples, you must first create the configuration files in that location.

**1. Create the configuration files:**

First, you need to create the experiment files:

```bash
poetry run experiment create 1_calculate_critical_values
```

Repeat this for `2_evaluate_power` and `3_measure_time_complexity`.

**2. Run the experiments in sequence:**

Once the files are in the `.experiment` directory, you can execute the workflow steps in order. Make sure you are in the project's root directory.

### First, run the calibration step
```bash
poetry run experiment build-and-run 1_calculate_critical_values
```

### Second, run the power evaluation
```bash
poetry run experiment build-and-run 2_evaluate_power
```

### Finally, run the performance analysis
```bash
poetry run experiment build-and-run 3_measure_time_complexity
```
1_calculate_critical_values.json

Lines changed: 36 additions & 0 deletions

```json
{
    "config": {
        "name": "1_calculate_critical_values",
        "experiment_type": "critical_value",
        "hypothesis": "normal",
        "criteria": [
            {
                "criterion_code": "KS",
                "parameters": []
            },
            {
                "criterion_code": "AD",
                "parameters": []
            }
        ],
        "executor_type": "standard",
        "generator_type": "standard",
        "report_builder_type": "standard",
        "report_mode": "with-chart",
        "run_mode": "reuse",
        "monte_carlo_count": 1000,
        "sample_sizes": [200, 400, 600, 800, 1000],
        "significance_levels": [0.01, 0.05, 0.1],
        "storage_connection": "Set your storage path here; configure it via the CLI."
    }
}
```
2_evaluate_power.json

Lines changed: 45 additions & 0 deletions

```json
{
    "config": {
        "name": "2_evaluate_power",
        "experiment_type": "power",
        "hypothesis": "normal",
        "alternatives": [
            {
                "generator_name": "NORMALGENERATOR",
                "parameters": [0.5, 1]
            }
        ],
        "criteria": [
            {
                "criterion_code": "KS",
                "parameters": []
            },
            {
                "criterion_code": "AD",
                "parameters": []
            }
        ],
        "executor_type": "standard",
        "generator_type": "standard",
        "report_builder_type": "standard",
        "report_mode": "with-chart",
        "run_mode": "reuse",
        "monte_carlo_count": 1000,
        "sample_sizes": [200, 400, 600, 800, 1000],
        "significance_levels": [0.01, 0.05, 0.1],
        "storage_connection": "Set your storage path here; configure it via the CLI."
    }
}
```
3_measure_time_complexity.json

Lines changed: 36 additions & 0 deletions

```json
{
    "config": {
        "name": "3_measure_time_complexity",
        "experiment_type": "time_complexity",
        "hypothesis": "normal",
        "criteria": [
            {
                "criterion_code": "KS",
                "parameters": []
            },
            {
                "criterion_code": "AD",
                "parameters": []
            }
        ],
        "executor_type": "standard",
        "generator_type": "standard",
        "report_builder_type": "standard",
        "report_mode": "with-chart",
        "run_mode": "reuse",
        "monte_carlo_count": 1000,
        "sample_sizes": [200, 400, 600, 800, 1000],
        "significance_levels": [0.01, 0.05, 0.1],
        "storage_connection": "Set your storage path here; configure it via the CLI."
    }
}
```
150 KB (binary file not shown)
122 KB (binary file not shown)
114 KB (binary file not shown)

pyproject.toml

Lines changed: 1 addition & 0 deletions

```diff
@@ -25,6 +25,7 @@ rich = "==13.9.4"
 click = ">=8.2.1"
 dacite = "==1.9.2"
 line_profiler = "5.0.0"
+pydantic = "^2.11.9"
 pysatl-criterion = {path = "./pysatl_criterion"}
 
 [tool.poetry.group.dev.dependencies]
```
