-Refer to the sections [Network](network.md), [Load Scenarios](load_scenarios.md), and [Topology perturbations](topology_perturbations.md) for a description of the configuration parameters.
+Refer to the sections Network, Load Scenarios, and Topology perturbations of the [documentation](https://gridfm.github.io/gridfm-datakit/) for a description of the configuration parameters.

 Sample configuration files are provided in `scripts/config`, e.g. `default.yaml`:

 ```yaml
 network:
   name: "case24_ieee_rts" # Name of the power grid network (without extension)
-  source: "pglib" # Data source for the grid; options: pglib, pandapower, file
+  source: "pglib" # Data source for the grid; options: pglib, file
+  # WARNING: the following parameter is only used if source is "file"
   network_dir: "scripts/grids" # if using source "file", this is the directory containing the network file (relative to the project root)

-
 load:
   generator: "agg_load_profile" # Name of the load generator; options: agg_load_profile, powergraph
   agg_profile: "default" # Name of the aggregated load profile
-  scenarios: 200 # Number of different load scenarios to generate
+  scenarios: 10000 # Number of different load scenarios to generate
   # WARNING: the following parameters are only used if generator is "agg_load_profile"
   # if using generator "powergraph", these parameters are ignored
-  sigma: 0.05 # max local noise
+  sigma: 0.2 # max local noise
   change_reactive_power: true # If true, changes reactive power of loads. If False, keeps the ones from the case file
   global_range: 0.4 # Range of the global scaling factor. used to set the lower bound of the scaling factor
   max_scaling_factor: 4.0 # Max upper bound of the global scaling factor
-  step_size: 0.025 # Step size when finding the upper bound of the global scaling factor
-  start_scaling_factor: 0.8 # Initial value of the global scaling factor
+  step_size: 0.1 # Step size when finding the upper bound of the global scaling factor
+  start_scaling_factor: 1.0 # Initial value of the global scaling factor

 topology_perturbation:
   type: "random" # Type of topology generator; options: n_minus_k, random, none
   # WARNING: the following parameters are only used if type is not "none"
   k: 1 # Maximum number of components to drop in each perturbation
-  n_topology_variants: 5 # Number of unique perturbed topologies per scenario
-  elements: ["line", "trafo", "gen", "sgen"] # elements to perturb options: line, trafo, gen, sgen
+  n_topology_variants: 20 # Number of unique perturbed topologies per scenario
+  elements: [branch, gen] # elements to perturb. options: branch, gen

 generation_perturbation:
   type: "cost_permutation" # Type of generation perturbation; options: cost_permutation, cost_perturbation, none
-  # WARNING: the following parameters are onlyused if type is "cost_perturbation"
+  # WARNING: the following parameter is only used if type is "cost_permutation"
   sigma: 1.0 # Size of range use for sampling scaling factor

+admittance_perturbation:
+  type: "random_perturbation" # Type of admittance perturbation; options: random_perturbation, none
+  # WARNING: the following parameter is only used if type is "random_perturbation"
+  sigma: 0.2 # Size of range used for sampling scaling factor
+
 settings:
-  num_processes: 10 # Number of parallel processes to use
+  num_processes: 16 # Number of parallel processes to use
   data_dir: "./data_out" # Directory to save generated data relative to the project root
-  large_chunk_size: 50 # Number of load scenarios processed before saving
-  no_stats: false # If true, disables statistical calculations
-  overwrite: true # If true, overwrites existing files, if false, appends to files (note that bus_params.csv, edge_params.csv, scenarios_{load.generator}.csv and scenarios_{load.generator}.html will still be overwritten)
-  mode: "pf" # Mode of the script; options: contingency, pf
+  large_chunk_size: 1000 # Number of load scenarios processed before saving
+  overwrite: true # If true, overwrites existing files, if false, appends to files
+  mode: "pf" # Mode of the script; options: pf, opf. pf: power flow data where one or more operating limits – the inequality constraints defined in OPF, e.g., voltage magnitude or branch limits – may be violated. opf: datapoints for training OPF solvers, with cost-optimal dispatches that satisfy all operating limits (OPF-feasible)
+  include_dc_res: true # If true, also stores the results of dc power flow (in addition to the results AC power flow). does not work with mode "opf"
+  enable_solver_logs: true # If true, write OPF/PF logs to {data_dir}/solver_log; PF fast and DCPF fast do not log.
+  pf_fast: true # Whether to use fast PF solver by default (compute_ac_pf from powermodels.jl); if false, uses Ipopt-based PF. Some networks e.g. case10000_goc do not work with pf_fast: true. pf_fast is faster and more accurate than the Ipopt-based PF.
+  dcpf_fast: true # Whether to use fast DCPF solver by default (compute_dc_pf from PowerModels.jl)
+  max_iter: 200 # Max iterations for Ipopt-based solvers
 ```
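
The option lists in the config comments can be checked before launching a run. The following is a minimal sketch, assuming the YAML has already been parsed into a nested dict. The helper name `check_config` and its messages are hypothetical and not part of gridfm-datakit; the allowed values are copied from the new (`+`) side of the comments above.

```python
# Hypothetical helper, not part of gridfm-datakit: compares option values
# with the choices listed in the comments of the sample config.
ALLOWED = {
    ("network", "source"): {"pglib", "file"},
    ("load", "generator"): {"agg_load_profile", "powergraph"},
    ("topology_perturbation", "type"): {"n_minus_k", "random", "none"},
    ("generation_perturbation", "type"): {
        "cost_permutation", "cost_perturbation", "none",
    },
    ("admittance_perturbation", "type"): {"random_perturbation", "none"},
    ("settings", "mode"): {"pf", "opf"},
}


def check_config(config):
    """Return a list of problems; an empty list means none were found."""
    problems = []
    for (section, key), choices in ALLOWED.items():
        value = config.get(section, {}).get(key)
        if value not in choices:
            problems.append(f"{section}.{key}={value!r} not in {sorted(choices)}")
    settings = config.get("settings", {})
    # The include_dc_res comment states that it does not work with mode "opf".
    if settings.get("mode") == "opf" and settings.get("include_dc_res"):
        problems.append("settings.include_dc_res does not work with mode 'opf'")
    return problems


# Values taken from the sample config shown above.
example = {
    "network": {"name": "case24_ieee_rts", "source": "pglib"},
    "load": {"generator": "agg_load_profile"},
    "topology_perturbation": {"type": "random"},
    "generation_perturbation": {"type": "cost_permutation"},
    "admittance_perturbation": {"type": "random_perturbation"},
    "settings": {"mode": "pf", "include_dc_res": True},
}
print(check_config(example))  # prints []
```

Returning a list instead of raising on the first problem lets a caller report every invalid option in one pass.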

 <br>

 ## Output Files

-The data generation process produces several output files in the specified data directory:
+The data generation process writes the following artifacts under:
+`{settings.data_dir}/{network.name}/raw`

 - **tqdm.log**: Progress bar log.
-- **error.log**: Log of the errors raised during data generation.
-- **args.log**: Copy of the config file used.
-- **pf_node.csv**: Data related to the nodes (buses) in the network, such as voltage levels and power injections.
-- **pf_edge.csv**: Branch admittance matrix for each pf case.
-- **branch_idx_removed.csv**: List of the indices of the branches (lines and transformers) that got removed when perturbing the topologies.
-- **edge_params.csv**: Branch admittance matrix and branch rate limits for the unperturbed topology.
-- **bus_params.csv**: Parameters for the buses (voltage limits and the base voltage).
-- **scenario_{args.load.generator}.csv**: Load element-level load profile obtained after using the load scenario generator.
-- **scenario_{args.load.generator}.html**: Plots of the element-level load profile.
-- **scenario_{args.load.generator}.log**: If generator is "agg_load_profile", stores the upper and lower bounds for the global scaling factor.
-- **stats.csv**: Stats about the generated data.
-- **stats_plot.html**: Plots of the stats about the generated data.
+- **error.log**: Error messages captured during generation.
+- **args.log**: YAML dump of the configuration used for this run.
+- **scenarios_{generator}.parquet**: Load scenarios (per-element time series) produced by the selected load generator.
+- **scenarios_{generator}.html**: Plot of the generated load scenarios.
+- **scenarios_{generator}.log**: Generator-specific notes (e.g., bounds for the global scaling factor when using `agg_load_profile`).
+- **n_scenarios.txt**: Metadata file containing the total number of scenarios (used for efficient partition management).
+- **bus_data.parquet**: Bus-level features for each processed scenario, partitioned by `scenario_partition` (columns `BUS_COLUMNS` and, if `settings.include_dc_res=True`, also `DC_BUS_COLUMNS`).
+- **gen_data.parquet**: Generator features per scenario, partitioned by `scenario_partition` (columns `GEN_COLUMNS`).
+- **branch_data.parquet**: Branch features per scenario, partitioned by `scenario_partition` (columns `BRANCH_COLUMNS`).
+- **y_bus_data.parquet**: Nonzero Y-bus entries per scenario, partitioned by `scenario_partition` with columns `[scenario, index1, index2, G, B]`.
+- **runtime_data.parquet**: Runtime data for each scenario, partitioned by `scenario_partition` (AC and DC solver execution times).
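
The output location follows directly from two config values. The following is a minimal sketch, assuming a parsed config dict and assuming that `{generator}` in the artifact names stands for `load.generator`. The helper names `raw_output_dir` and `expected_artifacts` are hypothetical, and the artifact names are copied from the new (`+`) side of the list above.

```python
from pathlib import Path


def raw_output_dir(config):
    """Build {settings.data_dir}/{network.name}/raw from a parsed config."""
    return Path(config["settings"]["data_dir"]) / config["network"]["name"] / "raw"


def expected_artifacts(config):
    """List the artifact paths named in the README for one run."""
    # Assumption: {generator} is the value of load.generator.
    generator = config["load"]["generator"]
    names = [
        "tqdm.log",
        "error.log",
        "args.log",
        f"scenarios_{generator}.parquet",
        f"scenarios_{generator}.html",
        f"scenarios_{generator}.log",
        "n_scenarios.txt",
        "bus_data.parquet",
        "gen_data.parquet",
        "branch_data.parquet",
        "y_bus_data.parquet",
        "runtime_data.parquet",
    ]
    root = raw_output_dir(config)
    return [root / name for name in names]


# Values taken from the sample config.
config = {
    "network": {"name": "case24_ieee_rts"},
    "load": {"generator": "agg_load_profile"},
    "settings": {"data_dir": "./data_out"},
}
for path in expected_artifacts(config):
    print(path)
```

A check such as `[p for p in expected_artifacts(config) if not p.exists()]` could then flag artifacts missing after a run. Reading the Parquet data itself needs a Parquet reader and is not shown here.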
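
`y_bus_data.parquet` keeps only the nonzero admittance entries, with columns `[scenario, index1, index2, G, B]`. The following is a minimal sketch of rebuilding a dense matrix for one scenario. It assumes each row holds the entry G + jB at position (`index1`, `index2`), that the indices are zero-based, and that the rows are already loaded as tuples. The function `dense_y_bus` and the sample values are hypothetical.

```python
def dense_y_bus(rows, scenario, n_buses):
    """Rebuild a dense complex admittance matrix for one scenario.

    rows: iterable of (scenario, index1, index2, G, B) tuples.
    Assumption: indices are zero-based and entry (index1, index2) is G + jB.
    """
    y = [[0j] * n_buses for _ in range(n_buses)]
    for scen, i, j, g, b in rows:
        if scen == scenario:
            y[i][j] = complex(g, b)
    return y


# Made-up entries for a two-bus system; not taken from any real case file.
rows = [
    (0, 0, 0, 2.0, -6.0),
    (0, 0, 1, -2.0, 6.0),
    (0, 1, 0, -2.0, 6.0),
    (0, 1, 1, 2.0, -6.0),
    (1, 0, 0, 1.0, -3.0),
]
y0 = dense_y_bus(rows, scenario=0, n_buses=2)
print(y0[0][1])
```

For large networks a sparse matrix type would be more appropriate than nested lists; the dense form is used here only to keep the sketch dependency-free.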