Merged
4 changes: 3 additions & 1 deletion Project.toml
@@ -1,12 +1,13 @@
name = "ApplicationDrivenLearning"
uuid = "0856f1c8-ef17-4e14-9230-2773e47a789e"
authors = ["Giovanni Amorim", "Joaquim Garcia"]
version = "0.1.5"
version = "0.1.6"

[deps]
BilevelJuMP = "485130c0-026e-11ea-0f1a-6992cd14145c"
DiffOpt = "930fe3bc-9c6b-11ea-2d94-6184641e85e7"
Flux = "587475ba-b771-5e3f-ad9e-33799f191a9c"
Functors = "d9f16b24-f501-4c13-a1f2-28368ffc5196"
JobQueueMPI = "32d208e1-246e-420c-b6ff-18b71b410923"
JuMP = "4076af6c-e467-56ae-b986-b466b2749572"
MPI = "da04e1cc-30fd-572f-bb4f-1f8673147195"
@@ -20,6 +21,7 @@ Zygote = "e88e6eb3-aa80-5325-afca-941959d7151f"
BilevelJuMP = "0.6.2"
DiffOpt = "0.5.0"
Flux = "0.16.3"
Functors = "0.5.2"
JobQueueMPI = "0.1.1"
JuMP = "1.24"
MPI = "0.20.22"
86 changes: 47 additions & 39 deletions docs/src/examples/newsvendor.md
@@ -31,7 +31,7 @@ i=2 \longrightarrow c=10; \quad q = 11;\quad r = 1

The demand series for both items will be generated using a discrete uniform distribution.

Let's start by loading the necessary packages and defining the data.
Let's start by loading the necessary packages and defining the problem parameters.

```julia
using Flux
@@ -47,8 +47,6 @@
Random.seed!(123)
c = [10, 10]
q = [19, 11]
r = [9, 1]
y_d = rand(10:100, (100, 2)) .|> Float32
x_d = ones(100, 1) .|> Float32
```

Now, we can initialize the application driven learning model, build the plan and assess models, and set the forecast model.
@@ -91,11 +89,21 @@
pred = Flux.Dense(1 => 2, exp)
ADL.set_forecast_model(model, pred)
```

Then, we can initialize the data, using the forecast variables as keys for the observed demand series.

```julia
x_d = ones(100, 1) .|> Float32
y_d = Dict(
d[1] => rand(10:100, 100) .|> Float32,
d[2] => rand(10:100, 100) .|> Float32
)
```

We can check how the model performs by computing the assess cost with the initial (random) forecast model.

```julia
julia> ADL.compute_cost(model, x_d, y_d)
-5.118482679128647
-3.571615f0
```

Now let's train the model using the GradientMode.
@@ -109,49 +117,49 @@
julia> gd_sol = ApplicationDrivenLearning.train!(
epochs=30
)
)
Epoch 1 | Time = 0.5s | Cost = -5.12
Epoch 2 | Time = 1.0s | Cost = -6.25
Epoch 3 | Time = 1.5s | Cost = -7.64
Epoch 4 | Time = 2.1s | Cost = -9.33
Epoch 5 | Time = 2.6s | Cost = -11.4
Epoch 6 | Time = 3.1s | Cost = -13.93
Epoch 7 | Time = 3.6s | Cost = -17.02
Epoch 8 | Time = 4.1s | Cost = -20.82
Epoch 9 | Time = 4.7s | Cost = -25.17
Epoch 10 | Time = 5.2s | Cost = -29.51
Epoch 11 | Time = 5.7s | Cost = -33.42
Epoch 12 | Time = 6.3s | Cost = -37.56
Epoch 13 | Time = 6.8s | Cost = -42.92
Epoch 14 | Time = 7.5s | Cost = -50.45
Epoch 15 | Time = 8.2s | Cost = -60.3
Epoch 16 | Time = 8.9s | Cost = -72.4
Epoch 17 | Time = 9.5s | Cost = -87.18
Epoch 18 | Time = 10.2s | Cost = -105.31
Epoch 19 | Time = 10.8s | Cost = -127.53
Epoch 20 | Time = 11.4s | Cost = -154.36
Epoch 21 | Time = 12.1s | Cost = -185.68
Epoch 22 | Time = 12.8s | Cost = -222.14
Epoch 23 | Time = 13.4s | Cost = -265.19
Epoch 24 | Time = 14.0s | Cost = -315.45
Epoch 25 | Time = 14.5s | Cost = -370.01
Epoch 26 | Time = 15.1s | Cost = -425.62
Epoch 27 | Time = 15.7s | Cost = -464.52
Epoch 28 | Time = 16.3s | Cost = -461.25
Epoch 29 | Time = 16.9s | Cost = -439.36
Epoch 30 | Time = 17.5s | Cost = -419.52
ApplicationDrivenLearning.Solution(-464.5160680770874, Real[1.6317965f0, 1.7067692f0, 2.7623773f0, 0.9124785f0])
Epoch 1 | Time = 0.4s | Cost = -3.57
Epoch 2 | Time = 0.8s | Cost = -4.36
Epoch 3 | Time = 1.2s | Cost = -5.33
Epoch 4 | Time = 1.6s | Cost = -6.51
Epoch 5 | Time = 2.0s | Cost = -7.95
Epoch 6 | Time = 2.4s | Cost = -9.72
Epoch 7 | Time = 2.8s | Cost = -11.88
Epoch 8 | Time = 3.2s | Cost = -14.53
Epoch 9 | Time = 3.6s | Cost = -17.78
Epoch 10 | Time = 4.0s | Cost = -21.77
Epoch 11 | Time = 4.4s | Cost = -26.68
Epoch 12 | Time = 4.8s | Cost = -32.73
Epoch 13 | Time = 5.2s | Cost = -39.52
Epoch 14 | Time = 5.6s | Cost = -46.64
Epoch 15 | Time = 6.0s | Cost = -54.15
Epoch 16 | Time = 6.4s | Cost = -62.95
Epoch 17 | Time = 6.8s | Cost = -74.74
Epoch 18 | Time = 7.2s | Cost = -90.36
Epoch 19 | Time = 7.6s | Cost = -110.35
Epoch 20 | Time = 8.0s | Cost = -135.12
Epoch 21 | Time = 8.4s | Cost = -164.34
Epoch 22 | Time = 8.8s | Cost = -197.82
Epoch 23 | Time = 9.2s | Cost = -237.1
Epoch 24 | Time = 9.6s | Cost = -282.57
Epoch 25 | Time = 10.0s | Cost = -334.87
Epoch 26 | Time = 10.4s | Cost = -389.66
Epoch 27 | Time = 10.8s | Cost = -442.7
Epoch 28 | Time = 11.2s | Cost = -469.92
Epoch 29 | Time = 11.6s | Cost = -452.58
Epoch 30 | Time = 12.0s | Cost = -430.85
ApplicationDrivenLearning.Solution(-469.91516f0, Real[1.6040976f0, 1.2566354f0, 2.8811285f0, 1.1966338f0])

julia> ADL.compute_cost(model, x_d, y_d)
-464.5160680770874
-469.91516f0
```

After training, we can check the cost of the solution found by gradient mode and inspect its predictions.

```julia
julia> model.forecast(x_d[1,:])
2-element Vector{Float32}:
80.977684
13.725394
2-element ApplicationDrivenLearning.VariableIndexedVector{Float32}:
88.69701
11.626293
```

As we can see, the forecast model overestimates the demand for the first item and underestimates the demand for the second item (both items' average demand is 55), following the incentives from the model structure.
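The direction of these biases matches the classical newsvendor critical-fractile rule (a standard textbook result, not part of this package): the cost-minimizing order quantity sits at the `(q - c) / (q - r)` quantile of the demand distribution. A quick sketch of that arithmetic for the two items, used only to sanity-check the learned forecasts:

```julia
# Critical fractile (q - c)/(q - r) for each item -- standard newsvendor
# theory, applied to the costs defined at the top of this example.
c = [10, 10]
q = [19, 11]
r = [9, 1]
ratio = (q .- c) ./ (q .- r)   # [0.9, 0.1]

# Corresponding quantiles of the (roughly) uniform demand on 10:100.
target = 10 .+ ratio .* 90     # ≈ [91, 19]
```

Item 1's high shortage cost pushes the ideal order toward the 90th percentile (≈ 91), while item 2's low salvage value pulls it toward the 10th (≈ 19) — roughly consistent with the learned forecasts of about 89 and 12 above.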
7 changes: 3 additions & 4 deletions docs/src/examples/scheduling.md
@@ -53,8 +53,7 @@ We can check how the model performs by computing the assess cost with the initia

```julia
X = ones(2, 1) .|> Float32
Y = zeros(2, 1) .|> Float32
Y[2, 1] = 2.0
Y = Dict(y => [0.0, 2.0] .|> Float32)
set_optimizer(model, Gurobi.Optimizer)
set_silent(model)
```
@@ -121,10 +120,10 @@
Iter Function value √(Σ(yᵢ-ȳ)²)/n
* time: 0.1640000343322754
ApplicationDrivenLearning.Solution(29.99998f0, Real[0.6931467f0])

julia> model.forecast(X[1,:]) # previsão final
julia> model.forecast(X[1,:]) # final forecast
1-element Vector{Float32}:
1.999999

julia> ADL.compute_cost(model, X, Y) # custo final
julia> ADL.compute_cost(model, X, Y) # final cost
29.99998
```
1 change: 0 additions & 1 deletion docs/src/reference.md
@@ -60,7 +60,6 @@ ApplicationDrivenLearning.apply_gradient!
## Other functions

```@docs
forecast
compute_cost
train!
ApplicationDrivenLearning.build_plan_model_forecast_params
61 changes: 41 additions & 20 deletions docs/src/tutorials/custom_forecast.md
@@ -4,53 +4,74 @@ The basic approach to define a forecast model is to use a `Chain` from the `Flux

## Input-Output mapping

The connection between predictive model outputs and plan model inputs is not always a straightforward one. Because of this, the `set_forecast_model` function, used to define the predictive model in the ApplicationDrivenLearning.jl package, includes the `input_output_map` parameter.
This parameter allows users to declare an explicit mapping between the outputs produced by Flux models and the forecast variables used in the planning model. This is useful in contexts where the same prediction logic can be applied across several entities (such as production units or geographical locations), promoting model reuse and computational efficiency.
The connection between predictive model outputs and plan model inputs is not always a straightforward one. Because of this, the `PredictiveModel` structure, which specifies the predictive model in the ApplicationDrivenLearning.jl package, includes the `input_output_map` parameter.
This parameter allows users to declare an explicit mapping between the outputs produced by Flux models and the forecast variables used in the planning model. This is useful in contexts where the same prediction logic can be applied across several entities (such as production units or geographical locations), promoting model reuse and computational and parameter efficiency.

Consider a scenario where the input dataset contains 3 predictive variables (for example expected temperature on location 1, expected temperature on location 2 and weekday), there are 2 forecast variables (energy demand on the two locations of interest) and the forecast model should use only the expected temperature of a location to predict it’s demand. That means we would make two predictions using the same model and concatenate those values. This can be easily achieved with a dictionary mapping the data input and forecast variable indexes.
Consider a scenario where the input dataset contains 3 variables (for example, expected temperature at location 1, expected temperature at location 2, and weekday), there are 2 forecast variables (energy demand at the two locations of interest), and the forecast model should use only a location's expected temperature to predict its demand. That means we would make two predictions using the same model and concatenate those values. This can be easily achieved with a dictionary mapping input data indexes to forecast variables.

```julia
model = ApplicationDrivenLearning.Model()
@variable(model, demand[1:2], ApplicationDrivenLearning.Forecast)

X = [
76 89 2;
72 85 3
] # input dataset of size 2 by 3
] .|> Float32 # input dataset of size 2 by 3
Y = Dict(
demand[1] => [101, 89] .|> Float32,
demand[2] => [68, 49] .|> Float32
)
dem_forecast = Dense(
2 => 1
) # forecast model takes 2 inputs and outputs single value

input_output_map = Dict(
[1, 3] => [1], # input indexes 1 and 3 map to 1st forecast variable
[2, 3] => [2] # input indexes 2 and 3 map to 2nd forecast variable
[1, 3] => [demand[1]], # input indexes 1 and 3 map to 1st forecast variable
[2, 3] => [demand[2]] # input indexes 2 and 3 map to 2nd forecast variable
)
ApplicationDrivenLearning.set_forecast_model(model, dem_forecast, input_output_map)
predictive = PredictiveModel(dem_forecast, input_output_map)
ApplicationDrivenLearning.set_forecast_model(model, predictive)
```

## Multiple Flux models

The definition of the predictive model can also be done using multiple Flux models. This supports the modular construction of predictive architectures, where specialized components are trained to forecast different aspects of the problem, without the difficulty of defining custom architectures.

This can be achieved providing an array of model objects and an array of dictionaries as input-output mapping to the `set_forecast_model` function. Using the context from previous example, let’s assume there is an additional variable that has to be predicted to each location but not variable on time (that is, on dataset samples). This can be achieved defining an additional model that maps a constant input value to the correct output indexes.
This can be achieved by providing an array of model objects and an array of dictionaries as the input-output mapping when constructing the `PredictiveModel`. Using the context from the previous example, let's assume we also want to predict the price at each location using a single model that receives the average lagged price. This is done by defining an additional model and its mapping.

```julia
model = ApplicationDrivenLearning.Model()
@variables(model, begin
demand[1:2], ApplicationDrivenLearning.Forecast
price[1:2], ApplicationDrivenLearning.Forecast
end)

X = [
76 89 2 1;
72 85 3 1
] # input dataset of size 2 by 4
76 89 2 103;
72 85 3 89
] .|> Float32 # input dataset of size 2 by 4
Y = Dict(
demand[1] => [101, 89] .|> Float32,
demand[2] => [68, 49] .|> Float32,
price[1] => [101, 89] .|> Float32,
price[2] => [68, 49] .|> Float32,
)
dem_forecast = Dense(
2 => 1
) # demand forecast model takes 2 inputs and outputs single value
aux_forecast = Dense(
prc_forecast = Dense(
1 => 2
) # auxiliar forecast model takes 1 input and outputs 2 values
forecast_objs = [dem_forecast, aux_forecast]
) # price forecast model takes 1 input and outputs 2 values
forecast_objs = [dem_forecast, prc_forecast]
input_output_map = [
Dict(
[1, 3] => [1],
[2, 3] => [2]
), # input indexes 1,2,3 are used to compute forecast vars 1,2 with 1st Flux.Dense object
[1, 3] => [demand[1]],
[2, 3] => [demand[2]]
), # input indexes 1,2,3 are used to compute demand forecast vars separately with 1st Flux.Dense object
Dict(
[4] => [3, 4]
), # input index 4 is used to compute forecast vars 3,4 with 2nd Flux.Dense object
[4] => price
), # input index 4 is used to compute both price forecast vars with 2nd Flux.Dense object
]
ApplicationDrivenLearning.set_forecast_model(model, forecast_objs, input_output_map)
predictive = PredictiveModel(forecast_objs, input_output_map)
ApplicationDrivenLearning.set_forecast_model(model, predictive)
```
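Conceptually, the mapping applies each Flux model once per input-index key and places the results in the listed forecast variables. As a plain-Flux sketch outside the package (not the package's actual dispatch code), the forward pass for one sample is equivalent to:

```julia
using Flux

dem_forecast = Dense(2 => 1)  # shared demand model
prc_forecast = Dense(1 => 2)  # price model

x = Float32[76, 89, 2, 103]   # one sample: temp1, temp2, weekday, lagged price

d1 = dem_forecast(x[[1, 3]])  # demand at location 1 (temp1 + weekday)
d2 = dem_forecast(x[[2, 3]])  # demand at location 2 (same weights, temp2 + weekday)
p = prc_forecast(x[[4]])      # both prices from the single lagged-price input

ŷ = vcat(d1, d2, p)           # 4 forecast values, one per forecast variable
```

Note that `dem_forecast` is evaluated twice with the same weights, which is what makes the shared-model formulation parameter-efficient.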
23 changes: 12 additions & 11 deletions docs/src/tutorials/getting_started.md
@@ -16,10 +16,6 @@ using Flux
import HiGHS
using ApplicationDrivenLearning

# data
X = reshape([1 1], (2, 1)) .|> Float32
Y = reshape([10 20], (2, 1)) .|> Float32

# main model and policy / forecast variables
model = ApplicationDrivenLearning.Model()
@variables(model, begin
@@ -53,6 +49,10 @@
end)
set_optimizer(model, HiGHS.Optimizer)
set_silent(model)

# data
X = reshape([1 1], (2, 1)) .|> Float32
Y = Dict(θ => [10, 20] .|> Float32)

# forecast model
nn = Chain(Dense(1 => 1; bias=false))
ApplicationDrivenLearning.set_forecast_model(model, nn)
@@ -89,13 +89,6 @@
We have to include a solver for solving the optimization models. In this case, w
using HiGHS
```

As explained, the data used to train the model is very limited, composed of only two samples of energy demand. Values of one are used as input data, without adding any real additional information to the model. Both `X` and `Y` values are transformed to `Float32` type to match Flux parameters.

```julia
X = reshape([1 1], (2, 1)) .|> Float32
Y = reshape([10 20], (2, 1)) .|> Float32
```

Just like regular JuMP, ApplicationDrivenLearning has a `Model` function to initialize an empty model. After initializing, we can declare the policy and forecast variables.

- Policy variables represent decision variables that should be maintained from the `Plan` to the `Assess` model.
@@ -148,6 +141,14 @@
set_optimizer(model, HiGHS.Optimizer)
set_silent(model)
```

As explained, the data used to train the model is very limited, composed of only two samples of energy demand. Values of one are used as input data, without adding any real information to the model. `X` is a matrix of input values, and the dictionary `Y` maps the forecast variable `θ` to its observed values. Both `X` and `Y` values are converted to `Float32` to match the Flux parameters.

```julia
X = reshape([1 1], (2, 1)) .|> Float32
Y = Dict(θ => [10, 20] .|> Float32)
```


A simple forecast model with only one parameter can be defined as a `Flux.Dense` layer with just 1 weight and no bias. We can associate the predictive model with our ApplicationDrivenLearning model only if its output size matches the number of declared forecast variables.

```julia