README.md
Lines changed: 2 additions & 2 deletions
@@ -182,9 +182,9 @@ settings:
182
182
large_chunk_size: 1000 # Number of load scenarios processed before saving
183
183
overwrite: true # If true, overwrites existing files, if false, appends to files
184
184
mode: "pf" # Mode of the script; options: pf, opf. pf: power flow data where one or more operating limits – the inequality constraints defined in OPF, e.g., voltage magnitude or branch limits – may be violated. opf: generates datapoints for training OPF solvers, with cost-optimal dispatches that satisfy all operating limits (OPF-feasible)
185
-
include_dc_res: true # If true, also stores the results of dc power flow (in addition to the results AC power flow). does not work with mode "opf"
185
+
include_dc_res: true # If true, also stores the results of dc power flow or dc optimal power flow
186
186
enable_solver_logs: true # If true, write OPF/PF logs to {data_dir}/solver_log; PF fast and DCPF fast do not log.
187
-
pf_fast: true # Whether to use fast PF solver by default (compute_ac_pf from powermodels.jl); if false, uses Ipopt-based PF. Some networks e.g. case10000_goc do not work with pf_fast: true. pf_fast is faster and more accurate than the Ipopt-based PF.
187
+
pf_fast: true # Whether to use fast PF solver by default (compute_ac_pf from powermodels.jl); if false, uses Ipopt-based PF. Some networks (typically large ones e.g. case10000_goc) do not work with pf_fast: true. pf_fast is faster and more accurate than the Ipopt-based PF.
188
188
dcpf_fast: true # Whether to use fast DCPF solver by default (compute_dc_pf from PowerModels.jl)
189
189
max_iter: 200 # Max iterations for Ipopt-based solvers
This command reads `bus_data.parquet`, normalizes power columns by `sn_mva`, and writes violin plots named `distribution_{feature_name}.png` to the output directory for quick visualization of feature distributions.
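As a rough illustration of the normalization step, the sketch below divides the power columns by a base power to obtain per-unit values. The column names `Pd` and `Qd` come from the `bus_data.parquet` description; the helper name `normalize_power`, the plain-dict rows, and the base value of 100 MVA are assumptions made for this example and are not the command's actual implementation.

```python
def normalize_power(rows, sn_mva, power_columns=("Pd", "Qd")):
    """Return copies of `rows` with the power columns divided by `sn_mva`.

    Hypothetical helper: the real command may normalize additional columns.
    """
    if sn_mva <= 0:
        raise ValueError("sn_mva must be positive")
    normalized = []
    for row in rows:
        new_row = dict(row)  # copy so the input rows are left untouched
        for col in power_columns:
            new_row[col] = row[col] / sn_mva
        normalized.append(new_row)
    return normalized


# Illustrative values only; 100 MVA is an assumed base power.
rows = [
    {"bus": 0, "Pd": 50.0, "Qd": 10.0},
    {"bus": 1, "Pd": 120.0, "Qd": 30.0},
]
per_unit = normalize_power(rows, sn_mva=100.0)
```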
93
-
94
-
## Validation Checks
95
-
96
-
The validation command performs the following checks:
97
-
98
-
### Y-Bus Consistency
99
-
- Consistency of bus admittance matrix with branch admittance data
100
-
- Y-bus matrix structure validation
101
-
102
-
### Branch Constraints
103
-
- Deactivated lines have zero power flows and admittances
104
-
- Computed vs stored power flow consistency
105
-
- Branch loading limits (OPF mode only)
106
-
107
-
### Generator Constraints
108
-
- Deactivated generators have zero power output
109
-
- Generator power limits validation
110
-
- Reactive power limits (OPF mode only)
111
-
112
-
### Power Balance
113
-
- Bus generation consistency between bus_data and gen_data
Admittance perturbations introduce changes to line admittance values by applying random scaling factors to the resistance ($R$) and reactance ($X$) parameters of grid lines. Admittance ($Y$) is related to impedance ($Z$) through $Y=1/Z$, and the impedance, in turn, is related to resistance and reactance through $Z=R+jX$. This results in more variance and diversity in power flow solutions which is beneficial for training ML models to improve generalization. Admittance perturbations are applied to the existing topology and generation perturbations.
4
+
Admittance perturbations introduce changes to branch admittance values by applying random scaling factors to the resistance ($R$) and reactance ($X$) parameters of grid branches. This results in more variance and diversity in power flow solutions which is beneficial for training ML models to improve generalization.
5
5
6
6
The module provides two options for admittance perturbation strategies:
7
7
8
-
-`NoAdmittancePerturbationGenerator` yields the original example produced by the generation perturbation generator without any additional changes in line admittances.
8
+
-`NoAdmittancePerturbationGenerator` yields the original example without any additional changes in branch admittances.
9
9
10
-
-`PerturbAdmittanceGenerator` applies a scaling factor to all resistance and reactance values of network lines. The scaling factor is sampled from a uniform distribution with a range given by `[max(0, 1-sigma), 1+sigma)`, where `sigma` is a user-defined adjustable parameter.
10
+
-`PerturbAdmittanceGenerator` applies a scaling factor to all resistance and reactance values of network branches. The scaling factor is sampled from a uniform distribution with a range given by `[max(0, 1-sigma), 1+sigma)`, where `sigma` is a user-defined adjustable parameter.
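The sampling rule described above can be sketched as follows. This is a minimal illustration, not the module's implementation: the function names are hypothetical, and the text does not say whether $R$ and $X$ share one factor or whether factors are drawn per branch, so the sketch assumes an independent draw for each value. The admittance is recovered through the standard relations $Z = R + jX$ and $Y = 1/Z$.

```python
import random


def sample_scaling_factor(sigma, rng):
    """Draw a factor uniformly from [max(0, 1 - sigma), 1 + sigma)."""
    low = max(0.0, 1.0 - sigma)
    high = 1.0 + sigma
    # rng.random() lies in [0, 1), so the result lies in [low, high).
    return low + (high - low) * rng.random()


def perturb_branch(r, x, sigma, rng):
    """Scale R and X and return (r_new, x_new, y_new).

    Assumption: R and X get independently drawn factors.
    """
    r_new = r * sample_scaling_factor(sigma, rng)
    x_new = x * sample_scaling_factor(sigma, rng)
    z = complex(r_new, x_new)
    # With sigma >= 1 a factor of 0 is possible, so guard the division.
    y_new = 1 / z if z != 0 else None
    return r_new, x_new, y_new


rng = random.Random(42)  # fixed seed for reproducibility
r_new, x_new, y_new = perturb_branch(r=0.01, x=0.1, sigma=0.2, rng=rng)
```

With `sigma=0.2`, each factor falls in `[0.8, 1.2)`, so the perturbed resistance and reactance stay within 20% of their original values.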
docs/manual/generation_perturbations.md
Lines changed: 1 addition & 1 deletion
@@ -5,7 +5,7 @@ Generation perturbations introduce random changes to the cost functions of gener
5
5
6
6
The module provides three options for generation perturbation strategies:
7
7
8
-
-`NoGenPerturbationGenerator` yields the original example produced by the topology perturbation generator without any additional changes in generation cost.
8
+
-`NoGenPerturbationGenerator` yields the original example without any additional changes in generation cost.
9
9
10
10
-`PermuteGenCostGenerator` randomly permutes the generator cost coefficients across and among generator elements.
docs/manual/getting_started.md
Lines changed: 3 additions & 3 deletions
@@ -91,17 +91,17 @@ The `mode` parameter controls how the power flow scenarios are generated and val
91
91
-**Constraints**: Since the topology perturbations are performed after solving OPF, the inequality constraints of OPF (e.g. branch loading, voltage magnitude at PQ buses, generator bounds on reactive power, etc) might be violated.
92
92
-**Use Case**: Training data for power flow, contingency analysis, etc
93
93
-**Performance**: Faster as it avoids re-solving OPF for each perturbed scenario
94
-
-**PF Solver Choice**: Controlled by `settings.pf_fast`. If `true`, uses the fast `compute_ac_pf` path. If `false`, uses the Ipopt-based AC PF for higher fidelity at the cost of speed.
94
+
-**PF Solver Choice**: Controlled by `settings.pf_fast`. If `true`, uses the fast `compute_ac_pf` path. If `false`, uses the Ipopt-based AC PF which is slower for smaller grids but has better convergence properties for large grids.
95
95
96
96
## Data Validation
97
97
98
98
The generated data can be validated using the CLI validation command:
99
99
100
100
```bash
101
-
# Validate with default sampling (100 partitions)
101
+
# Validate with default sampling (100 partitions of 200 scenarios)
docs/manual/outputs.md
Lines changed: 24 additions & 11 deletions
@@ -31,10 +31,13 @@ Metadata file containing the total number of scenarios (used for efficient parti
31
31
32
32
### Network Data Files
33
33
34
+
**Note**: All network data files are saved as partitioned parquet directories. Each file includes a `scenario_partition` column used for partitioning, which groups scenarios into partitions (default: 200 scenarios per partition).
35
+
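To make the partitioning note concrete, the sketch below groups scenario indices into partitions of 200. The mapping `scenario // partition_size` is an assumption made for illustration (the note only states that scenarios are grouped, with a default of 200 per partition), and both function names are hypothetical.

```python
from collections import defaultdict


def scenario_partition(scenario, partition_size=200):
    """Assumed mapping from a global scenario index to its partition."""
    if partition_size <= 0:
        raise ValueError("partition_size must be positive")
    return scenario // partition_size


def group_by_partition(scenarios, partition_size=200):
    """Group scenario indices by their (assumed) partition index."""
    groups = defaultdict(list)
    for scenario in scenarios:
        groups[scenario_partition(scenario, partition_size)].append(scenario)
    return dict(groups)


groups = group_by_partition([0, 199, 200, 450])
```

Under this assumption, scenarios 0 through 199 share partition 0, scenario 200 starts partition 1, and so on.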
34
36
#### `bus_data.parquet`
35
-
Bus-level features for each processed scenario. Columns (BUS_COLUMNS):
37
+
Bus-level features for each processed scenario. Columns:
36
38
37
-
-**scenario**: Index of the scenario (unique identifier of the power flow case)
39
+
-**scenario**: Global scenario index (unique identifier)
40
+
-**load_scenario_idx**: Index of the load scenario
38
41
-**bus**: Index of the bus
39
42
-**Pd**: Active power demand at the bus (MW)
40
43
-**Qd**: Reactive power demand at the bus (MVAr)
@@ -56,9 +59,10 @@ If `settings.include_dc_res=True`, also includes DC power flow columns (DC_BUS_C
56
59
-**Pg_dc**: DC active power generation at the bus (MW)
57
60
58
61
#### `gen_data.parquet`
59
-
Generator features per scenario. Columns (GEN_COLUMNS):
62
+
Generator features per scenario. Columns:
60
63
61
-
-**scenario**: Index of the scenario
64
+
-**scenario**: Global scenario index (unique identifier)
65
+
-**load_scenario_idx**: Index of the load scenario
62
66
-**idx**: Generator row index (0-based)
63
67
-**bus**: Bus index where the generator is connected
64
68
-**p_mw**: Active power output (MW)
@@ -77,9 +81,10 @@ If `settings.include_dc_res=True`, also includes DC generator column (DC_GEN_COL
77
81
-**p_mw_dc**: Active power from DC solution (MW)
78
82
79
83
#### `branch_data.parquet`
80
-
Branch features per scenario. Columns (BRANCH_COLUMNS):
84
+
Branch features per scenario. Columns:
81
85
82
-
-**scenario**: Index of the scenario
86
+
-**scenario**: Global scenario index (unique identifier)
87
+
-**load_scenario_idx**: Index of the load scenario
83
88
-**idx**: Branch row index (0-based)
84
89
-**from_bus**: Index of the source bus
85
90
-**to_bus**: Index of the destination bus
@@ -110,9 +115,10 @@ If `settings.include_dc_res=True`, also includes DC branch columns (DC_BRANCH_CO
110
115
-**pt_dc**: DC active power flow from destination to source (MW)
111
116
112
117
#### `y_bus_data.parquet`
113
-
Nonzero Y-bus entries per scenario with columns:
118
+
Nonzero Y-bus entries per scenario. Columns:
114
119
115
-
-**scenario**: Index of the scenario
120
+
-**scenario**: Global scenario index (unique identifier)
121
+
-**load_scenario_idx**: Index of the load scenario
116
122
-**index1**: Row index in the Y-bus matrix
117
123
-**index2**: Column index in the Y-bus matrix
118
124
-**G**: Conductance value (p.u.)
@@ -121,7 +127,14 @@ Nonzero Y-bus entries per scenario with columns:
121
127
### Runtime Data Files
122
128
123
129
#### `runtime_data.parquet`
124
-
Runtime data for each scenario (AC and DC solver execution times).
130
+
Runtime data for each scenario. Columns:
131
+
132
+
-**scenario**: Global scenario index (unique identifier)
133
+
-**load_scenario_idx**: Index of the load scenario
134
+
-**ac**: AC solver execution time (seconds)
135
+
136
+
If `settings.include_dc_res=True`, also includes DC runtime column (DC_RUNTIME_COLUMNS):
docs/manual/topology_perturbations.md
Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@
2
2
3
3
## Overview
4
4
5
-
Topology perturbations generate variations of the original network by altering its structure. These variations simulate contingencies and component failures, and are useful for robustness testing, contingency analysis, and training ML models on diverse grid conditions.
5
+
Topology perturbations generate variations of the original network by altering its topology. These variations simulate contingencies and component failures, and are useful for robustness testing, contingency analysis, and training ML models on diverse grid conditions.
6
6
7
7
The module provides three topology perturbation strategies:
scripts/compare_parquet_files.py
Lines changed: 4 additions & 4 deletions
@@ -37,7 +37,7 @@
37
37
generation_perturbation:
38
38
type: "none" # Type of generation perturbation; options: cost_permutation, cost_perturbation, none
39
39
# WARNING: the following parameter is only used if type is "cost_permutation"
40
-
sigma: 1.0 # Size of range use for sampling scaling factor
40
+
sigma: 1.0 # Size of range used for sampling scaling factor
41
41
42
42
admittance_perturbation:
43
43
type: "none" # Type of admittance perturbation; options: random_perturbation, none
@@ -49,10 +49,10 @@
49
49
data_dir: "./testdelll" # Directory to save generated data relative to the project root
50
50
large_chunk_size: 1000 # Number of load scenarios processed before saving
51
51
overwrite: true # If true, overwrites existing files, if false, appends to files
52
-
mode: "pf" # Mode of the script; options: pf, opf. pf: power flow data where one or more operating limits – the inequality constraints defined in OPF, e.g., voltage magnitude or branch limits – may be violated. opf: datapoints for training OPF solvers, with cost-optimal dispatches that satisfy all operating limits (OPF-feasible)
53
-
include_dc_res: true # If true, also stores the results of dc power flow (in addition to the results AC power flow). does not work with mode "opf"
52
+
mode: "pf" # Mode of the script; options: pf, opf. pf: power flow data where one or more operating limits – the inequality constraints defined in OPF, e.g., voltage magnitude or branch limits – may be violated. opf: generates datapoints for training OPF solvers, with cost-optimal dispatches that satisfy all operating limits (OPF-feasible)
53
+
include_dc_res: true # If true, also stores the results of dc power flow or dc optimal power flow
54
54
enable_solver_logs: true # If true, write OPF/PF logs to {data_dir}/solver_log; PF fast and DCPF fast do not log.
55
-
pf_fast: true # Whether to use fast PF solver by default (compute_ac_pf from powermodels.jl); if false, uses Ipopt-based PF. Some networks e.g. case10000_goc do not work with pf_fast: true. pf_fast is faster and more accurate than the Ipopt-based PF.
55
+
pf_fast: true # Whether to use fast PF solver by default (compute_ac_pf from powermodels.jl); if false, uses Ipopt-based PF. Some networks (typically large ones e.g. case10000_goc) do not work with pf_fast: true. pf_fast is faster and more accurate than the Ipopt-based PF.
56
56
dcpf_fast: true # Whether to use fast DCPF solver by default (compute_dc_pf from PowerModels.jl)
57
57
max_iter: 200 # Max iterations for Ipopt-based solvers
scripts/config/Texas2k_case1_2016summerpeak.yaml
Lines changed: 4 additions & 4 deletions
@@ -27,7 +27,7 @@ topology_perturbation:
27
27
generation_perturbation:
28
28
type: "cost_permutation" # Type of generation perturbation; options: cost_permutation, cost_perturbation, none
29
29
# WARNING: the following parameter is only used if type is "cost_permutation"
30
-
sigma: 1.0 # Size of range use for sampling scaling factor
30
+
sigma: 1.0 # Size of range used for sampling scaling factor
31
31
32
32
admittance_perturbation:
33
33
type: "random_perturbation" # Type of admittance perturbation; options: random_perturbation, none
@@ -39,9 +39,9 @@ settings:
39
39
data_dir: "./baseline_perturbations" # Directory to save generated data relative to the project root
40
40
large_chunk_size: 10000 # Number of load scenarios processed before saving
41
41
overwrite: true # If true, overwrites existing files, if false, appends to files
42
-
mode: "pf" # Mode of the script; options: pf, opf. pf: power flow data where one or more operating limits – the inequality constraints defined in OPF, e.g., voltage magnitude or branch limits – may be violated. opf: datapoints for training OPF solvers, with cost-optimal dispatches that satisfy all operating limits (OPF-feasible)
43
-
include_dc_res: true # If true, also stores the results of dc power flow (in addition to the results AC power flow). does not work with mode "opf"
42
+
mode: "pf" # Mode of the script; options: pf, opf. pf: power flow data where one or more operating limits – the inequality constraints defined in OPF, e.g., voltage magnitude or branch limits – may be violated. opf: generates datapoints for training OPF solvers, with cost-optimal dispatches that satisfy all operating limits (OPF-feasible)
43
+
include_dc_res: true # If true, also stores the results of dc power flow or dc optimal power flow
44
44
enable_solver_logs: false # If true, write OPF/PF logs to {data_dir}/solver_log; PF fast and DCPF fast do not log.
45
-
pf_fast: true # Whether to use fast PF solver by default (compute_ac_pf from powermodels.jl); if false, uses Ipopt-based PF. Some networks e.g. case10000_goc do not work with pf_fast: true. pf_fast is faster and more accurate than the Ipopt-based PF.
45
+
pf_fast: true # Whether to use fast PF solver by default (compute_ac_pf from powermodels.jl); if false, uses Ipopt-based PF. Some networks (typically large ones e.g. case10000_goc) do not work with pf_fast: true. pf_fast is faster and more accurate than the Ipopt-based PF.
46
46
dcpf_fast: true # Whether to use fast DCPF solver by default (compute_dc_pf from PowerModels.jl)
47
47
max_iter: 200 # Max iterations for Ipopt-based solvers