Merged
4 changes: 3 additions & 1 deletion Project.toml
@@ -1,12 +1,13 @@
name = "ApplicationDrivenLearning"
uuid = "0856f1c8-ef17-4e14-9230-2773e47a789e"
authors = ["Giovanni Amorim", "Joaquim Garcia"]
version = "0.1.5"
version = "0.1.6"

[deps]
BilevelJuMP = "485130c0-026e-11ea-0f1a-6992cd14145c"
DiffOpt = "930fe3bc-9c6b-11ea-2d94-6184641e85e7"
Flux = "587475ba-b771-5e3f-ad9e-33799f191a9c"
Functors = "d9f16b24-f501-4c13-a1f2-28368ffc5196"
JobQueueMPI = "32d208e1-246e-420c-b6ff-18b71b410923"
JuMP = "4076af6c-e467-56ae-b986-b466b2749572"
MPI = "da04e1cc-30fd-572f-bb4f-1f8673147195"
@@ -20,6 +21,7 @@ Zygote = "e88e6eb3-aa80-5325-afca-941959d7151f"
BilevelJuMP = "0.6.2"
DiffOpt = "0.5.0"
Flux = "0.16.3"
Functors = "0.5.2"
JobQueueMPI = "0.1.1"
JuMP = "1.24"
MPI = "0.20.22"
86 changes: 47 additions & 39 deletions docs/src/examples/newsvendor.md
@@ -31,7 +31,7 @@ i=2 \longrightarrow c=10; \quad q = 11;\quad r = 1

The demand series for both items will be generated using a discrete uniform distribution.

Let's start by loading the necessary packages and defining the data.
Let's start by loading the necessary packages and defining the problem parameters.

```julia
using Flux
@@ -47,8 +47,6 @@
Random.seed!(123)
c = [10, 10]
q = [19, 11]
r = [9, 1]
y_d = rand(10:100, (100, 2)) .|> Float32
x_d = ones(100, 1) .|> Float32
```

Now, we can initialize the application driven learning model, build the plan and assess models, and set the forecast model.
@@ -91,11 +89,21 @@
pred = Flux.Dense(1 => 2, exp)
ADL.set_forecast_model(model, pred)
```

Then, we can initialize the data, using the forecast variables as keys for the observed demand series.

```julia
x_d = ones(100, 1) .|> Float32
y_d = Dict(
d[1] => rand(10:100, 100) .|> Float32,
d[2] => rand(10:100, 100) .|> Float32
)
```

We can check how the model performs by computing the assess cost with the initial (random) forecast model.

```julia
julia> ADL.compute_cost(model, x_d, y_d)
-5.118482679128647
-3.571615f0
```

Now let's train the model using the GradientMode.
@@ -109,49 +117,49 @@
julia> gd_sol = ApplicationDrivenLearning.train!(
epochs=30
)
)
Epoch 1 | Time = 0.5s | Cost = -5.12
Epoch 2 | Time = 1.0s | Cost = -6.25
Epoch 3 | Time = 1.5s | Cost = -7.64
Epoch 4 | Time = 2.1s | Cost = -9.33
Epoch 5 | Time = 2.6s | Cost = -11.4
Epoch 6 | Time = 3.1s | Cost = -13.93
Epoch 7 | Time = 3.6s | Cost = -17.02
Epoch 8 | Time = 4.1s | Cost = -20.82
Epoch 9 | Time = 4.7s | Cost = -25.17
Epoch 10 | Time = 5.2s | Cost = -29.51
Epoch 11 | Time = 5.7s | Cost = -33.42
Epoch 12 | Time = 6.3s | Cost = -37.56
Epoch 13 | Time = 6.8s | Cost = -42.92
Epoch 14 | Time = 7.5s | Cost = -50.45
Epoch 15 | Time = 8.2s | Cost = -60.3
Epoch 16 | Time = 8.9s | Cost = -72.4
Epoch 17 | Time = 9.5s | Cost = -87.18
Epoch 18 | Time = 10.2s | Cost = -105.31
Epoch 19 | Time = 10.8s | Cost = -127.53
Epoch 20 | Time = 11.4s | Cost = -154.36
Epoch 21 | Time = 12.1s | Cost = -185.68
Epoch 22 | Time = 12.8s | Cost = -222.14
Epoch 23 | Time = 13.4s | Cost = -265.19
Epoch 24 | Time = 14.0s | Cost = -315.45
Epoch 25 | Time = 14.5s | Cost = -370.01
Epoch 26 | Time = 15.1s | Cost = -425.62
Epoch 27 | Time = 15.7s | Cost = -464.52
Epoch 28 | Time = 16.3s | Cost = -461.25
Epoch 29 | Time = 16.9s | Cost = -439.36
Epoch 30 | Time = 17.5s | Cost = -419.52
ApplicationDrivenLearning.Solution(-464.5160680770874, Real[1.6317965f0, 1.7067692f0, 2.7623773f0, 0.9124785f0])
Epoch 1 | Time = 0.4s | Cost = -3.57
Epoch 2 | Time = 0.8s | Cost = -4.36
Epoch 3 | Time = 1.2s | Cost = -5.33
Epoch 4 | Time = 1.6s | Cost = -6.51
Epoch 5 | Time = 2.0s | Cost = -7.95
Epoch 6 | Time = 2.4s | Cost = -9.72
Epoch 7 | Time = 2.8s | Cost = -11.88
Epoch 8 | Time = 3.2s | Cost = -14.53
Epoch 9 | Time = 3.6s | Cost = -17.78
Epoch 10 | Time = 4.0s | Cost = -21.77
Epoch 11 | Time = 4.4s | Cost = -26.68
Epoch 12 | Time = 4.8s | Cost = -32.73
Epoch 13 | Time = 5.2s | Cost = -39.52
Epoch 14 | Time = 5.6s | Cost = -46.64
Epoch 15 | Time = 6.0s | Cost = -54.15
Epoch 16 | Time = 6.4s | Cost = -62.95
Epoch 17 | Time = 6.8s | Cost = -74.74
Epoch 18 | Time = 7.2s | Cost = -90.36
Epoch 19 | Time = 7.6s | Cost = -110.35
Epoch 20 | Time = 8.0s | Cost = -135.12
Epoch 21 | Time = 8.4s | Cost = -164.34
Epoch 22 | Time = 8.8s | Cost = -197.82
Epoch 23 | Time = 9.2s | Cost = -237.1
Epoch 24 | Time = 9.6s | Cost = -282.57
Epoch 25 | Time = 10.0s | Cost = -334.87
Epoch 26 | Time = 10.4s | Cost = -389.66
Epoch 27 | Time = 10.8s | Cost = -442.7
Epoch 28 | Time = 11.2s | Cost = -469.92
Epoch 29 | Time = 11.6s | Cost = -452.58
Epoch 30 | Time = 12.0s | Cost = -430.85
ApplicationDrivenLearning.Solution(-469.91516f0, Real[1.6040976f0, 1.2566354f0, 2.8811285f0, 1.1966338f0])

julia> ADL.compute_cost(model, x_d, y_d)
-464.5160680770874
-469.91516f0
```

After training, we can check the cost of the solution found by gradient mode and inspect its predictions.

```julia
julia> model.forecast(x_d[1,:])
2-element Vector{Float32}:
80.977684
13.725394
2-element ApplicationDrivenLearning.VariableIndexedVector{Float32}:
88.69701
11.626293
```

As we can see, the forecast model overestimates the demand for the first item and underestimates the demand for the second item (both items' average demand is 55), following the incentives from the model structure.
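The direction of these biases matches the classical newsvendor critical-fractile rule (a standard textbook result, not part of this package): the cost-minimizing order quantity sits at the `(q - c) / (q - r)` quantile of the demand distribution. A quick sketch of that arithmetic for the two items, used only to sanity-check the learned forecasts:

```julia
# Critical fractile (q - c)/(q - r) for each item -- standard newsvendor
# theory, applied to the costs defined at the top of this example.
c = [10, 10]
q = [19, 11]
r = [9, 1]
ratio = (q .- c) ./ (q .- r)   # [0.9, 0.1]

# Corresponding quantiles of the (roughly) uniform demand on 10:100.
target = 10 .+ ratio .* 90     # ≈ [91, 19]
```

Item 1's high shortage cost pushes the ideal order toward the 90th percentile (≈ 91), while item 2's low salvage value pulls it toward the 10th (≈ 19) — roughly consistent with the learned forecasts of about 89 and 12 above.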
7 changes: 3 additions & 4 deletions docs/src/examples/scheduling.md
@@ -53,8 +53,7 @@ We can check how the model performs by computing the assess cost with the initia

```julia
X = ones(2, 1) .|> Float32
Y = zeros(2, 1) .|> Float32
Y[2, 1] = 2.0
Y = Dict(y => [0.0, 2.0] .|> Float32)
set_optimizer(model, Gurobi.Optimizer)
set_silent(model)
```
@@ -121,10 +120,10 @@
Iter Function value √(Σ(yᵢ-ȳ)²)/n
* time: 0.1640000343322754
ApplicationDrivenLearning.Solution(29.99998f0, Real[0.6931467f0])

julia> model.forecast(X[1,:]) # previsão final
julia> model.forecast(X[1,:]) # final forecast
1-element Vector{Float32}:
1.999999

julia> ADL.compute_cost(model, X, Y) # custo final
julia> ADL.compute_cost(model, X, Y) # final cost
29.99998
```
1 change: 0 additions & 1 deletion docs/src/reference.md
@@ -60,7 +60,6 @@ ApplicationDrivenLearning.apply_gradient!
## Other functions

```@docs
forecast
compute_cost
train!
ApplicationDrivenLearning.build_plan_model_forecast_params
61 changes: 41 additions & 20 deletions docs/src/tutorials/custom_forecast.md
@@ -4,53 +4,74 @@ The basic approach to define a forecast model is to use a `Chain` from the `Flux

## Input-Output mapping

The connection between predictive model outputs and plan model inputs is not always a straightforward one. Because of this, the `set_forecast_model` function, used to define the predictive model in the ApplicationDrivenLearning.jl package, includes the `input_output_map` parameter.
This parameter allows users to declare an explicit mapping between the outputs produced by Flux models and the forecast variables used in the planning model. This is useful in contexts where the same prediction logic can be applied across several entities (such as production units or geographical locations), promoting model reuse and computational efficiency.
The connection between predictive model outputs and plan model inputs is not always a straightforward one. Because of this, the `PredictiveModel` structure, which specifies the predictive model in the ApplicationDrivenLearning.jl package, includes the `input_output_map` parameter.
This parameter allows users to declare an explicit mapping between the outputs produced by Flux models and the forecast variables used in the planning model. This is useful in contexts where the same prediction logic can be applied across several entities (such as production units or geographical locations), promoting model reuse and computational and parameter efficiency.

Consider a scenario where the input dataset contains 3 predictive variables (for example expected temperature on location 1, expected temperature on location 2 and weekday), there are 2 forecast variables (energy demand on the two locations of interest) and the forecast model should use only the expected temperature of a location to predict it’s demand. That means we would make two predictions using the same model and concatenate those values. This can be easily achieved with a dictionary mapping the data input and forecast variable indexes.
Consider a scenario where the input dataset contains 3 variables (for example, expected temperature at location 1, expected temperature at location 2, and weekday), there are 2 forecast variables (energy demand at the two locations of interest), and the forecast model should use only a location's expected temperature to predict its demand. That means we would make two predictions using the same model and concatenate those values. This can be easily achieved with a dictionary mapping input data indexes to forecast variables.

```julia
model = ApplicationDrivenLearning.Model()
@variable(model, demand[1:2], ApplicationDrivenLearning.Forecast)

X = [
76 89 2;
72 85 3
] # input dataset of size 2 by 3
] .|> Float32 # input dataset of size 2 by 3
Y = Dict(
demand[1] => [101, 89] .|> Float32,
demand[2] => [68, 49] .|> Float32
)
dem_forecast = Dense(
2 => 1
) # forecast model takes 2 inputs and outputs single value

input_output_map = Dict(
[1, 3] => [1], # input indexes 1 and 3 map to 1st forecast variable
[2, 3] => [2] # input indexes 2 and 3 map to 2nd forecast variable
[1, 3] => [demand[1]], # input indexes 1 and 3 map to 1st forecast variable
[2, 3] => [demand[2]] # input indexes 2 and 3 map to 2nd forecast variable
)
ApplicationDrivenLearning.set_forecast_model(model, dem_forecast, input_output_map)
predictive = PredictiveModel(dem_forecast, input_output_map)
ApplicationDrivenLearning.set_forecast_model(model, predictive)
```

## Multiple Flux models

The definition of the predictive model can also be done using multiple Flux models. This supports the modular construction of predictive architectures, where specialized components are trained to forecast different aspects of the problem, without the difficulty of defining custom architectures.

This can be achieved providing an array of model objects and an array of dictionaries as input-output mapping to the `set_forecast_model` function. Using the context from previous example, let’s assume there is an additional variable that has to be predicted to each location but not variable on time (that is, on dataset samples). This can be achieved defining an additional model that maps a constant input value to the correct output indexes.
This can be achieved by providing an array of model objects and an array of dictionaries as the input-output mapping when constructing the `PredictiveModel`. Using the context from the previous example, let's assume we also want to predict the price at each location using a single model that receives the average lagged price. This is done by defining an additional model and its mapping.

```julia
model = ApplicationDrivenLearning.Model()
@variables(model, begin
demand[1:2], ApplicationDrivenLearning.Forecast
price[1:2], ApplicationDrivenLearning.Forecast
end)

X = [
76 89 2 1;
72 85 3 1
] # input dataset of size 2 by 4
76 89 2 103;
72 85 3 89
] .|> Float32 # input dataset of size 2 by 4
Y = Dict(
demand[1] => [101, 89] .|> Float32,
demand[2] => [68, 49] .|> Float32,
price[1] => [101, 89] .|> Float32,
price[2] => [68, 49] .|> Float32,
)
dem_forecast = Dense(
2 => 1
) # demand forecast model takes 2 inputs and outputs single value
aux_forecast = Dense(
prc_forecast = Dense(
1 => 2
) # auxiliar forecast model takes 1 input and outputs 2 values
forecast_objs = [dem_forecast, aux_forecast]
) # price forecast model takes 1 input and outputs 2 values
forecast_objs = [dem_forecast, prc_forecast]
input_output_map = [
Dict(
[1, 3] => [1],
[2, 3] => [2]
), # input indexes 1,2,3 are used to compute forecast vars 1,2 with 1st Flux.Dense object
[1, 3] => [demand[1]],
[2, 3] => [demand[2]]
), # input indexes 1,2,3 are used to compute demand forecast vars separately with 1st Flux.Dense object
Dict(
[4] => [3, 4]
), # input index 4 is used to compute forecast vars 3,4 with 2nd Flux.Dense object
[4] => price
), # input index 4 is used to compute both price forecast vars with 2nd Flux.Dense object
]
ApplicationDrivenLearning.set_forecast_model(model, forecast_objs, input_output_map)
predictive = PredictiveModel(forecast_objs, input_output_map)
ApplicationDrivenLearning.set_forecast_model(model, predictive)
```
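Conceptually, the mapping applies each Flux model once per input-index key and places the results in the listed forecast variables. As a plain-Flux sketch outside the package (not the package's actual dispatch code), the forward pass for one sample is equivalent to:

```julia
using Flux

dem_forecast = Dense(2 => 1)  # shared demand model
prc_forecast = Dense(1 => 2)  # price model

x = Float32[76, 89, 2, 103]   # one sample: temp1, temp2, weekday, lagged price

d1 = dem_forecast(x[[1, 3]])  # demand at location 1 (temp1 + weekday)
d2 = dem_forecast(x[[2, 3]])  # demand at location 2 (same weights, temp2 + weekday)
p = prc_forecast(x[[4]])      # both prices from the single lagged-price input

ŷ = vcat(d1, d2, p)           # 4 forecast values, one per forecast variable
```

Note that `dem_forecast` is evaluated twice with the same weights, which is what makes the shared-model formulation parameter-efficient.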
23 changes: 12 additions & 11 deletions docs/src/tutorials/getting_started.md
@@ -16,10 +16,6 @@ using Flux
import HiGHS
using ApplicationDrivenLearning

# data
X = reshape([1 1], (2, 1)) .|> Float32
Y = reshape([10 20], (2, 1)) .|> Float32

# main model and policy / forecast variables
model = ApplicationDrivenLearning.Model()
@variables(model, begin
@@ -53,6 +49,10 @@
end)
set_optimizer(model, HiGHS.Optimizer)
set_silent(model)

# data
X = reshape([1 1], (2, 1)) .|> Float32
Y = Dict(θ => [10, 20] .|> Float32)

# forecast model
nn = Chain(Dense(1 => 1; bias=false))
ApplicationDrivenLearning.set_forecast_model(model, nn)
@@ -89,13 +89,6 @@
We have to include a solver for solving the optimization models. In this case, w
using HiGHS
```

As explained, the data used to train the model is very limited, composed of only two samples of energy demand. Values of one are used as input data, without adding any real additional information to the model. Both `X` and `Y` values are transformed to `Float32` type to match Flux parameters.

```julia
X = reshape([1 1], (2, 1)) .|> Float32
Y = reshape([10 20], (2, 1)) .|> Float32
```

Just like regular JuMP, ApplicationDrivenLearning has a `Model` function to initialize an empty model. After initializing, we can declare the policy and forecast variables.

- Policy variables represent decision variables that should be maintained from the `Plan` to the `Assess` model.
@@ -148,6 +141,14 @@
set_optimizer(model, HiGHS.Optimizer)
set_silent(model)
```

As explained, the data used to train the model is very limited, composed of only two samples of energy demand. Values of one are used as input data, without adding any real information to the model. `X` is a matrix of input values, and the dictionary `Y` maps the forecast variable `θ` to its observed values. Both `X` and `Y` values are converted to `Float32` to match the Flux parameters.

```julia
X = reshape([1 1], (2, 1)) .|> Float32
Y = Dict(θ => [10, 20] .|> Float32)
```


A simple forecast model with only one parameter can be defined as a `Flux.Dense` layer with just 1 weight and no bias. We can associate the predictive model with our ApplicationDrivenLearning model only if its output size matches the number of declared forecast variables.

```julia