15 changes: 8 additions & 7 deletions Project.toml
@@ -1,7 +1,7 @@
name = "LearningToOptimize"
uuid = "e1d8bfa7-c465-446a-84b9-451470f6e76c"
authors = ["andrewrosemberg <[email protected]> and contributors"]
version = "1.0.0"
version = "1.1.0"

[deps]
Arrow = "69666777-d1a9-59fb-9406-91d4454c9d45"
@@ -11,6 +11,7 @@ Distributions = "31c24e10-a181-5473-b8eb-7969acd0382f"
Flux = "587475ba-b771-5e3f-ad9e-33799f191a9c"
JuMP = "4076af6c-e467-56ae-b986-b466b2749572"
LinearAlgebra = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e"
MLJ = "add582a8-e3ab-11e8-2d5e-e98b27df1bc7"
MLJFlux = "094fc8d1-fd35-5302-93ea-dabda2abf845"
NNlib = "872c559c-99b0-510c-b3b7-b6c96a88d5cd"
Optimisers = "3bd65402-5787-11e9-1adc-39752487f4e2"
@@ -25,14 +26,15 @@ Arrow = "2"
CSV = "0.10"
DataFrames = "1"
Distributions = "0.25"
Flux = "0.14"
Flux = "0.14, 0.16"
JuMP = "1"
MLJFlux = "0.6"
MLJ = "0.20"
NNlib = "0.9"
Optimisers = "0.3"
ParametricOptInterface = "0.8"
Optimisers = "0.3, 0.4"
ParametricOptInterface = "0.8, 0.9"
Statistics = "1"
Zygote = "0.6.68"
Zygote = "0.6.68, 0.7"
julia = "1.9"

[extras]
@@ -41,10 +43,9 @@ Clarabel = "61c947e1-3e6d-4ee4-985a-eec8c727bd6e"
DelimitedFiles = "8bb1440f-4735-579b-a4ab-409b98df4dab"
HiGHS = "87dc4568-4c63-4d18-b0c0-bb2238e4078b"
Ipopt = "b6b21f68-93f8-5de0-b562-5493be1d77c9"
MLJ = "add582a8-e3ab-11e8-2d5e-e98b27df1bc7"
PGLib = "07a8691f-3d11-4330-951b-3c50f98338be"
PowerModels = "c36e90e8-916a-50a6-bd94-075b64ef4655"
Test = "8dfed614-e22c-5e08-85e1-65c5234f0b40"

[targets]
test = ["Test", "DelimitedFiles", "PGLib", "HiGHS", "PowerModels", "Clarabel", "Ipopt", "MLJ"]
test = ["Test", "DelimitedFiles", "PGLib", "HiGHS", "PowerModels", "Clarabel", "Ipopt"]
96 changes: 92 additions & 4 deletions README.md
@@ -6,7 +6,7 @@
</div>
</div>

Learning to optimize (LearningToOptimize) package that provides basic functionalities to help fit proxy models for optimization.
Learning to optimize (LearningToOptimize) package that provides basic functionalities to help fit proxy models for parametric optimization problems.

Have a look at our sister [Hugging Face organization](https://huggingface.co/LearningToOptimize) for datasets, pre-trained models, and benchmarks.

@@ -19,6 +19,34 @@ Have a look at our sister [HugginFace Organization](https://huggingface.co/Learn

![flowchart](docs/src/assets/L2O.png)

# Background

Parametric optimization problems arise in scenarios where certain elements (e.g., coefficients, constraints) may vary according to problem parameters. A general form of a parameterized convex optimization problem is

$$
\begin{aligned}
&\min_{x} \quad f(x; \theta) \\
&\text{subject to} \quad g_i(x; \theta) \leq 0, \quad i = 1,\dots, m \\
&\quad\quad\quad\quad A(\theta)x = b(\theta)
\end{aligned}
$$

where $\theta$ is the vector of problem parameters.
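
For intuition (a toy example, not taken from the package), consider

$$
\min_{x \geq 0} \; (x - \theta)^2
$$

whose solution is $x^\star(\theta) = \max(\theta, 0)$: each value of $\theta$ defines a different problem instance, and the map $\theta \mapsto x^\star(\theta)$ is precisely what a proxy model tries to approximate.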

**Learning to Optimize (L2O)** is an emerging paradigm where machine learning models *learn* to solve optimization problems efficiently. This approach is also known as using **optimization proxies** or **amortized optimization**.

In more technical terms, **amortized optimization** seeks to learn a function $\hat{x}_w(\theta)$, with learnable weights $w$, that maps problem parameters $\theta$ to solutions that (approximately) minimize the objective subject to the constraints. Modern methods leverage techniques like **differentiable optimization layers**, **input-convex neural networks**, or constraint-enforcing architectures (e.g., [DC3](https://openreview.net/pdf?id=0Ow8_1kM5Z)) to ensure that the learned proxy solutions are both feasible and performant. By coupling the solver and the model in an **end-to-end** pipeline, these approaches let the training objective directly reflect downstream metrics, improving speed and reliability.
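
One common concrete instantiation (a supervised variant, stated in the notation above and not specific to this package) fits the proxy weights $w$ by regressing on pre-computed optimal solutions:

$$
\min_{w} \; \mathbb{E}_{\theta \sim \mathcal{D}} \left[ \big\| \hat{x}_w(\theta) - x^\star(\theta) \big\|^2 \right]
$$

where $x^\star(\theta)$ is the solution returned by a classical solver for parameter $\theta$ and $\mathcal{D}$ is the distribution of parameters of interest. Objective-based (self-supervised) variants instead minimize $f(\hat{x}_w(\theta); \theta)$ directly, with penalties for constraint violations.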

Recent advances also focus on **trustworthy** or **certifiable** proxies, where constraint satisfaction or performance bounds are guaranteed. This is crucial in domains like energy systems or manufacturing, where infeasible solutions can have large penalties or safety concerns. Overall, learning-based optimization frameworks aim to combine the advantages of ML (data-driven generalization) with the rigor of mathematical programming (constraint handling and optimality).

For a broader overview, see the [SIAM News article on trustworthy optimization proxies](https://www.siam.org/publications/siam-news/articles/fusing-artificial-intelligence-and-optimization-with-trustworthy-optimization-proxies/), which highlights the growing synergy between AI and classical optimization.

# Installation

```julia
] add LearningToOptimize
```

## Generate Dataset
This package provides a basic way of generating a dataset of solutions to an optimization problem by varying the values of its parameters and recording the corresponding solutions.

@@ -62,7 +90,33 @@ Which creates the following CSV:
| 9 | 9.0 |
| 10 | 10.0|

ps.: For illustration purpose, I have represented the id's here as integers, but in reality they are generated as UUIDs.
ps.: For illustration purposes, the ids are represented here as integers, but in reality they are generated as UUIDs.

To load the parameter values back:

```julia
problem_iterator = load("input_file.csv", CSVFile)
```

### Samplers

Instead of defining parameter instances manually, one may sample parameter values using pre-defined samplers (e.g. `scaled_distribution_sampler`, `box_sampler`) or define their own sampler. Samplers are functions that take a vector of parameters of type `MOI.Parameter` and return a matrix of parameter values.
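
As an illustration, a minimal custom sampler could look like the sketch below. This is hypothetical code, not part of the package API: it assumes each element stores its nominal value in the `value` field of `MOI.Parameter` and that the returned matrix is laid out as parameters × samples.

```julia
using MathOptInterface
const MOI = MathOptInterface

# Hypothetical custom sampler: perturb each parameter's nominal value by up to ±width.
function my_uniform_sampler(parameters; num_samples = 100, width = 0.1)
    nominal = [p.value for p in parameters]                      # nominal value of each MOI.Parameter
    noise = 1 .+ width .* (2 .* rand(length(nominal), num_samples) .- 1)
    return nominal .* noise                                      # parameters × samples matrix
end
```

Such a function can then be passed alongside the pre-defined samplers in the `general_sampler` call shown below.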

The easiest way to go from problem definition, sampling parameter values and saving them is to use the `general_sampler` function:

```julia
general_sampler(
"examples/powermodels/data/6468_rte/6468_rte_SOCWRConicPowerModel_POI_load.mof.json";
samplers = [
(original_parameters) -> scaled_distribution_sampler(original_parameters, 10000),
(original_parameters) -> line_sampler(original_parameters, 1.01:0.01:1.25),
(original_parameters) -> box_sampler(original_parameters, 300),
],
)
```

This function uses a set of samplers to explore the parameter space.
It loads the underlying model from the passed `file` using JuMP's `read_from_file` (ps.: currently only tested with `MathOptFormat`), samples the parameters, and saves the sampled values to `save_file`.

### The Recorder

Expand Down Expand Up @@ -104,13 +158,15 @@ recorder = Recorder{ArrowFile}("output_file.arrow", primal_variables=[x], dual_v
In order to train models to be able to forecast optimization solutions from parameter values, one option is to use the package Flux.jl:

```julia
using CSV, DataFrames, Flux

# read input and output data
input_data = CSV.read("input_file.csv", DataFrame)
output_data = CSV.read("output_file.csv", DataFrame)

# Separate input and output variables
output_variables = output_data[!, Not(:id)]
input_features = innerjoin(input_data, output_data[!, [:id]], on = :id)[!, Not(:id)] # just use success solves
output_variables = output_data[!, Not([:id, :status, :primal_status, :dual_status, :objective, :time])] # just predict solutions
input_features = innerjoin(input_data, output_data[!, [:id]]; on=:id)[!, Not(:id)] # only use successful solves

# Define model
model = Chain(
@@ -136,6 +192,38 @@ Flux.train!(loss, Flux.params(model), [(input_features, output_variables)], opti
predictions = model(input_features)
```

Another option is to use the package MLJ.jl:

```julia
using LearningToOptimize
using MLJ, MLJFlux
import Optimisers

# Define the model
model = MultitargetNeuralNetworkRegressor(;
builder=FullyConnectedBuilder([64, 32]),
rng=123,
epochs=20,
optimiser=Optimisers.Adam(),
)

# Train the model
mach = machine(model, input_features, output_variables)
fit!(mach; verbosity=2)

# Make predictions
predict(mach, input_features)

```
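
Once trained, the machine can be serialized for later reuse with MLJ's standard saving mechanism (a brief sketch; the file name is arbitrary):

```julia
# Persist the trained machine and restore it in a later session.
MLJ.save("proxy_model.jls", mach)
mach = machine("proxy_model.jls")
```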

### Evaluating the ML model

For ease of use, we built a general evaluator that can be used to evaluate the model.
It will return a `NamedTuple` with the objective value and infeasibility of the
predicted solution for each instance, and the overall inference time and allocated memory.

```julia
evaluation = general_evaluator(problem_iterator, mach)
```

## Coming Soon

Future features:
30 changes: 18 additions & 12 deletions docs/make.jl
@@ -1,24 +1,30 @@
using LearningToOptimize
using Documenter

DocMeta.setdocmeta!(LearningToOptimize, :DocTestSetup, :(using LearningToOptimize); recursive=true)
DocMeta.setdocmeta!(
LearningToOptimize,
:DocTestSetup,
:(using LearningToOptimize);
recursive = true,
)

makedocs(;
modules=[LearningToOptimize],
authors="andrewrosemberg <[email protected]> and contributors",
repo="https://github.com/andrewrosemberg/LearningToOptimize.jl/blob/{commit}{path}#{line}",
sitename="LearningToOptimize.jl",
format=Documenter.HTML(;
prettyurls=get(ENV, "CI", "false") == "true",
canonical="https://andrewrosemberg.github.io/LearningToOptimize.jl",
edit_link="main",
assets=String[],
modules = [LearningToOptimize],
authors = "andrewrosemberg <[email protected]> and contributors",
repo = "https://github.com/andrewrosemberg/LearningToOptimize.jl/blob/{commit}{path}#{line}",
sitename = "LearningToOptimize.jl",
format = Documenter.HTML(;
prettyurls = get(ENV, "CI", "false") == "true",
canonical = "https://andrewrosemberg.github.io/LearningToOptimize.jl",
edit_link = "main",
assets = String[],
),
pages=["Home" => "index.md",
pages = [
"Home" => "index.md",
"Arrow" => "arrow.md",
"Parameter Type" => "parametertype.md",
"API" => "api.md",
],
)

deploydocs(; repo="github.com/andrewrosemberg/LearningToOptimize.jl", devbranch="main")
deploydocs(; repo = "github.com/andrewrosemberg/LearningToOptimize.jl", devbranch = "main")
74 changes: 62 additions & 12 deletions docs/src/index.md
@@ -71,7 +71,33 @@ Which creates the following CSV:
| 9 | 9.0 |
| 10 | 10.0|

ps.: For illustration purpose, I have represented the id's here as integers, but in reality they are generated as UUIDs.
ps.: For illustration purposes, the ids are represented here as integers, but in reality they are generated as UUIDs.

To load the parameter values back:

```julia
problem_iterator = load("input_file.csv", CSVFile)
```

### Samplers

Instead of defining parameter instances manually, one may sample parameter values using pre-defined samplers (e.g. `scaled_distribution_sampler`, `box_sampler`) or define their own sampler. Samplers are functions that take a vector of parameters of type `MOI.Parameter` and return a matrix of parameter values.
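
As an illustration, a minimal custom sampler could look like the sketch below. This is hypothetical code, not part of the package API: it assumes each element stores its nominal value in the `value` field of `MOI.Parameter` and that the returned matrix is laid out as parameters × samples.

```julia
using MathOptInterface
const MOI = MathOptInterface

# Hypothetical custom sampler: perturb each parameter's nominal value by up to ±width.
function my_uniform_sampler(parameters; num_samples = 100, width = 0.1)
    nominal = [p.value for p in parameters]                      # nominal value of each MOI.Parameter
    noise = 1 .+ width .* (2 .* rand(length(nominal), num_samples) .- 1)
    return nominal .* noise                                      # parameters × samples matrix
end
```

Such a function can then be passed alongside the pre-defined samplers in the `general_sampler` call shown below.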

The easiest way to go from problem definition, sampling parameter values and saving them is to use the `general_sampler` function:

```julia
general_sampler(
"examples/powermodels/data/6468_rte/6468_rte_SOCWRConicPowerModel_POI_load.mof.json";
samplers = [
(original_parameters) -> scaled_distribution_sampler(original_parameters, 10000),
(original_parameters) -> line_sampler(original_parameters, 1.01:0.01:1.25),
(original_parameters) -> box_sampler(original_parameters, 300),
],
)
```

This function uses a set of samplers to explore the parameter space.
It loads the underlying model from the passed `file` using JuMP's `read_from_file` (ps.: currently only tested with `MathOptFormat`), samples the parameters, and saves the sampled values to `save_file`.

### The Recorder

Expand Down Expand Up @@ -113,13 +139,15 @@ recorder = Recorder{ArrowFile}("output_file.arrow", primal_variables=[x], dual_v
In order to train models to be able to forecast optimization solutions from parameter values, one option is to use the package Flux.jl:

```julia
using CSV, DataFrames, Flux

# read input and output data
input_data = CSV.read("input_file.csv", DataFrame)
output_data = CSV.read("output_file.csv", DataFrame)

# Separate input and output variables
output_variables = output_data[!, Not(:id)]
input_features = innerjoin(input_data, output_data[!, [:id]], on = :id)[!, Not(:id)] # just use success solves
output_variables = output_data[!, Not([:id, :status, :primal_status, :dual_status, :objective, :time])] # just predict solutions
input_features = innerjoin(input_data, output_data[!, [:id]]; on=:id)[!, Not(:id)] # only use successful solves

# Define model
model = Chain(
@@ -145,18 +173,40 @@ Flux.train!(loss, Flux.params(model), [(input_features, output_variables)], opti
predictions = model(input_features)
```

## Coming Soon
Another option is to use the package MLJ.jl:

Future features:
- ML objectives that penalize infeasible predictions;
- Warm-start from predicted solutions.
```julia
using LearningToOptimize
using MLJ, MLJFlux
import Optimisers

# Define the model
model = MultitargetNeuralNetworkRegressor(;
builder=FullyConnectedBuilder([64, 32]),
rng=123,
epochs=20,
optimiser=Optimisers.Adam(),
)

# Train the model
mach = machine(model, input_features, output_variables)
fit!(mach; verbosity=2)

# Make predictions
predict(mach, input_features)

<!-- ```@index
```

``` -->
### Evaluating the ML model

For ease of use, we built a general evaluator that can be used to evaluate the model.
It will return a `NamedTuple` with the objective value and infeasibility of the
predicted solution for each instance, and the overall inference time and allocated memory.

<!-- ```@autodocs
Modules = [LearningToOptimize]
``` -->
```julia
evaluation = general_evaluator(problem_iterator, mach)
```

## Coming Soon

Future features:
- ML objectives that penalize infeasible predictions;
- Warm-start from predicted solutions.