Commit 3f09d37

refactor: align state modifications to new apis
1 parent: 2aa3c4f

File tree: 10 files changed (+182, −467 lines)

README.md

Lines changed: 70 additions & 37 deletions
````diff
@@ -27,22 +27,18 @@ Use the
 [in-development documentation](https://docs.sciml.ai/ReservoirComputing/dev/)
 to take a look at not yet released features.
 
-## Citing
+## Features
 
-If you use this library in your work, please cite:
+ReservoirComputing.jl provides layers, models, and functions to help build and train
+reservoir computing models. More specifically, the software offers:
 
-```bibtex
-@article{martinuzzi2022reservoircomputing,
-    author = {Francesco Martinuzzi and Chris Rackauckas and Anas Abdelrehim and Miguel D. Mahecha and Karin Mora},
-    title = {ReservoirComputing.jl: An Efficient and Modular Library for Reservoir Computing Models},
-    journal = {Journal of Machine Learning Research},
-    year = {2022},
-    volume = {23},
-    number = {288},
-    pages = {1--8},
-    url = {http://jmlr.org/papers/v23/22-0611.html}
-}
-```
+- Base layers for reservoir computing model construction such as `ReservoirChain`,
+  `Readout`, `Collect`, and `ESNCell`
+- Fully built models such as `ESN` and `DeepESN`
+- 15+ reservoir initializers and 5+ input layer initializers
+- 5+ reservoir states modification algorithms
+- Sparse matrix computation through
+  [SparseArrays.jl](https://docs.julialang.org/en/v1/stdlib/SparseArrays/)
 
 ## Installation
 
@@ -63,67 +59,87 @@ Pkg.add("ReservoirComputing")
 
 To illustrate the workflow of this library we will showcase
 how it is possible to train an ESN to learn the dynamics of the
-Lorenz system. As a first step we gather the data.
-For the `Generative` prediction we need the target data
-to be one step ahead of the training data:
+Lorenz system.
+
+### 1. Generate data
+
+As a general first step we fix the random seed for reproducibility:
 
 ```julia
-using ReservoirComputing, OrdinaryDiffEq, Random
+using Random
 Random.seed!(42)
 rng = MersenneTwister(17)
+```
 
-#lorenz system parameters
-u0 = [1.0, 0.0, 0.0]
-tspan = (0.0, 200.0)
-p = [10.0, 28.0, 8 / 3]
+For an autoregressive prediction we need the target data
+to be one step ahead of the training data:
+
+```julia
+using OrdinaryDiffEq
 
 #define lorenz system
 function lorenz(du, u, p, t)
     du[1] = p[1] * (u[2] - u[1])
     du[2] = u[1] * (p[2] - u[3]) - u[2]
     du[3] = u[1] * u[2] - p[3] * u[3]
 end
+
 #solve and take data
-prob = ODEProblem(lorenz, u0, tspan, p)
+prob = ODEProblem(lorenz, [1.0f0, 0.0f0, 0.0f0], (0.0, 200.0), [10.0f0, 28.0f0, 8/3])
 data = Array(solve(prob, ABM54(); dt=0.02))
-
 shift = 300
 train_len = 5000
 predict_len = 1250
 
 #one step ahead for generative prediction
 input_data = data[:, shift:(shift + train_len - 1)]
 target_data = data[:, (shift + 1):(shift + train_len)]
-
 test = data[:, (shift + train_len):(shift + train_len + predict_len - 1)]
 ```
 
-Now that we have the data we can initialize the ESN with the chosen parameters.
-Given that this is a quick example we are going to change the least amount of
-possible parameters:
+### 2. Build Echo State Network
+
+We can either use the provided `ESN` or build one from scratch.
+We showcase the second option:
 
 ```julia
 input_size = 3
 res_size = 300
 esn = ReservoirChain(
-    StatefulLayer(ESNCell(input_size => res_size; init_reservoir=rand_sparse(; radius=1.2, sparsity=6/300))),
+    StatefulLayer(
+        ESNCell(
+            input_size => res_size;
+            init_reservoir=rand_sparse(; radius=1.2, sparsity=6/300)
+        )
+    ),
     NLAT2(),
-    Readout(res_size => input_size)
-) #or ESN(input_size, res_size, input_size; init_reservoir=rand_sparse(; radius=1.2, sparsity=6/300))
+    Readout(res_size => input_size) # autoregressive so out_dims == in_dims
+)
+# alternative:
+# esn = ESN(input_size, res_size, input_size;
+#     init_reservoir=rand_sparse(; radius=1.2, sparsity=6/300)
+# )
 ```
 
-The echo state network can now be trained and tested.
-If not specified, the training will always be ordinary least squares regression:
+### 3. Train the Echo State Network
+
+ReservoirComputing.jl builds on Lux(Core), so in order to train the model
+we first need to instantiate the parameters and the states:
 
 ```julia
 ps, st = setup(rng, esn)
 ps, st = train!(esn, input_data, target_data, ps, st)
-output, _ = predict(esn, 1250, ps, st; initialdata=test[:, 1])
 ```
 
-The data is returned as a matrix, `output` in the code above,
-that contains the predicted trajectories.
-The results can now be easily plotted:
+### 4. Predict and visualize
+
+We can now use the trained ESN to forecast the Lorenz system dynamics:
+
+```julia
+output, st = predict(esn, 1250, ps, st; initialdata=test[:, 1])
+```
+
+We can now visualize the results:
 
 ```julia
 using Plots
@@ -146,6 +162,23 @@ plot!(transpose(test)[:, 1], transpose(test)[:, 2], transpose(test)[:, 3]; label
 
 ![lorenz_attractor](https://user-images.githubusercontent.com/10376688/81470281-5a34b580-91ea-11ea-9eea-d2b266da19f4.png)
 
+## Citing
+
+If you use this library in your work, please cite:
+
+```bibtex
+@article{martinuzzi2022reservoircomputing,
+    author = {Francesco Martinuzzi and Chris Rackauckas and Anas Abdelrehim and Miguel D. Mahecha and Karin Mora},
+    title = {ReservoirComputing.jl: An Efficient and Modular Library for Reservoir Computing Models},
+    journal = {Journal of Machine Learning Research},
+    year = {2022},
+    volume = {23},
+    number = {288},
+    pages = {1--8},
+    url = {http://jmlr.org/papers/v23/22-0611.html}
+}
+```
+
 ## Acknowledgements
 
 This project was possible thanks to initial funding through
````

docs/make.jl

Lines changed: 3 additions & 3 deletions
````diff
@@ -1,7 +1,7 @@
-using Documenter, DocumenterCitations, ReservoirComputing
+using Documenter, DocumenterCitations, DocumenterInterLinks, ReservoirComputing
 
-cp("./docs/Manifest.toml", "./docs/src/assets/Manifest.toml"; force=true)
-cp("./docs/Project.toml", "./docs/src/assets/Project.toml"; force=true)
+#cp("./docs/Manifest.toml", "./docs/src/assets/Manifest.toml"; force=true)
+#cp("./docs/Project.toml", "./docs/src/assets/Project.toml"; force=true)
 
 ENV["PLOTS_TEST"] = "true"
 ENV["GKSwstype"] = "100"
````

docs/pages.jl

Lines changed: 1 addition & 1 deletion
````diff
@@ -17,7 +17,7 @@ pages = [
     "Layers"=>"api/layers.md",
     "Models"=>"api/models.md",
     "States"=>"api/states.md",
-    "Train"=>"api/training.md",
+    "Train"=>"api/train.md",
     "Predict"=>"api/predict.md",
     "Initializers"=>"api/inits.md",
     "ReCA"=>"api/reca.md"] #"References" => "references.md"
````

docs/src/api/states.md

Lines changed: 3 additions & 19 deletions
````diff
@@ -1,34 +1,18 @@
 # States Modifications
 
-## Padding and Estension
-
-```@docs
-StandardStates
-ExtendedStates
-PaddedStates
-PaddedExtendedStates
-```
-
-## Non Linear Transformations
-
 ```@docs
-NLADefault
+Pad
+Extend
 NLAT1
 NLAT2
 NLAT3
 PartialSquare
 ExtendedSquare
 ```
 
-## Internals
-
-```@docs
-ReservoirComputing.create_states
-```
-
 ## References
 
 ```@bibliography
 Pages = ["states.md"]
 Canonical = false
-```
+```
````
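To make the new composable API concrete, here is a rough sketch of how the renamed state modifications might slot into a chain. `Pad`'s constructor argument and the resulting state size are assumptions inferred from the old `PaddedStates` behavior, not confirmed by this diff:

```julia
using ReservoirComputing, Random

# Sketch only: Pad(1.0) is assumed to append a constant row to the state,
# mirroring the old PaddedStates(padding=1.0); verify against the docs page above.
esn = ReservoirChain(
    StatefulLayer(ESNCell(3 => 300)),
    NLAT2(),           # nonlinear state transformation, as documented above
    Pad(1.0),          # assumed signature: pads the state from 300 to 301 rows
    Readout(301 => 3)  # readout must match the padded state size
)
ps, st = setup(MersenneTwister(17), esn)
```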

ext/RCMLJLinearModelsExt.jl

Lines changed: 12 additions & 11 deletions
````diff
@@ -3,22 +3,23 @@ using ReservoirComputing
 using MLJLinearModels
 
 function ReservoirComputing.train(regressor::MLJLinearModels.GeneralizedLinearRegression,
-        states::AbstractArray{T}, target::AbstractArray{T};
-        kwargs...) where {T <: Number}
-    out_size = size(target, 1)
-    output_layer = similar(target, size(target, 1), size(states, 1))
+        states::AbstractMatrix{<:Real}, target::AbstractMatrix{<:Real};
+        kwargs...)
+    @assert size(states, 2) == size(target, 2) "states and target must share the same number of columns."
 
     if regressor.fit_intercept
-        throw(ArgumentError("fit_intercept=true is not yet supported.
-            Please add fit_intercept=false to the MLJ regressor"))
+        throw(ArgumentError("fit_intercept=true not supported here. \
+            Either set fit_intercept=false on the MLJ regressor, or extend addreadout! to write bias."))
     end
-
-    for i in axes(target, 1)
-        output_layer[i, :] = MLJLinearModels.fit(regressor, states',
-            target[i, :]; kwargs...)
+    permuted_states = permutedims(states)
+    output_matrix = similar(target, size(target, 1), size(states, 1))
+    for idx in axes(target, 1)
+        yi = vec(target[idx, :])
+        coefs = MLJLinearModels.fit(regressor, permuted_states, yi; kwargs...)
+        output_matrix[idx, :] = coefs
     end
 
-    return OutputLayer(regressor, output_layer, out_size, target[:, end])
+    return output_matrix
 end
 
 end #module
````
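With this refactor `train` returns the bare readout matrix instead of an `OutputLayer`. A minimal usage sketch, assuming only what the diff shows; the ridge regressor choice and the random data are illustrative:

```julia
using MLJLinearModels
using ReservoirComputing

# res_size × n_steps collected states, out_dims × n_steps targets
states = rand(Float32, 300, 500)
target = rand(Float32, 3, 500)

# fit_intercept must be false, as enforced by the ArgumentError above
regressor = RidgeRegression(1e-3; fit_intercept=false)
Wout = ReservoirComputing.train(regressor, states, target)
@assert size(Wout) == (3, 300)  # one coefficient row per output dimension
```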

src/ReservoirComputing.jl

Lines changed: 1 addition & 2 deletions
````diff
@@ -46,8 +46,7 @@ include("extensions/reca.jl")
 
 export ESNCell, StatefulLayer, Readout, ReservoirChain, Collect, collectstates, train!, predict
 
-export NLADefault, NLAT1, NLAT2, NLAT3, PartialSquare, ExtendedSquare
-export StandardStates, ExtendedStates, PaddedStates, PaddedExtendedStates
+export Pad, Extend, NLAT1, NLAT2, NLAT3, PartialSquare, ExtendedSquare
 export StandardRidge
 export chebyshev_mapping, informed_init, logistic_mapping, minimal_init,
     modified_lm, scaled_rand, weighted_init, weighted_minimal
````
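The export diff doubles as a migration table. A hedged reading, with old-to-new equivalences inferred from names alone rather than verified semantics:

```julia
# Old export          New export (assumed equivalent)
# PaddedStates     -> Pad
# ExtendedStates   -> Extend
# StandardStates   -> dropped (plain states are presumably the default)
# NLADefault       -> dropped from the exports
# NLAT1/NLAT2/NLAT3, PartialSquare, ExtendedSquare remain unchanged
using ReservoirComputing
@assert all(n -> isdefined(ReservoirComputing, n),
    (:Pad, :Extend, :NLAT1, :NLAT2, :NLAT3, :PartialSquare, :ExtendedSquare))
```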

src/extensions/reca.jl

Lines changed: 4 additions & 4 deletions
````diff
@@ -14,12 +14,12 @@ The detail of this implementation can be found in [1].
 [1] Nichele, Stefano, and Andreas Molund. “Deep reservoir computing using cellular
 automata.” arXiv preprint arXiv:1703.02806 (2017).
 """
-struct RandomMapping{I, T} <: AbstractInputEncoding
+struct RandomMapping{I,T} <: AbstractInputEncoding
     permutations::I
     expansion_size::T
 end
 
-struct RandomMaps{T, E, G, M, S} <: AbstractEncodingData
+struct RandomMaps{T,E,G,M,S} <: AbstractEncodingData
     permutations::T
     expansion_size::E
     generations::G
@@ -44,12 +44,12 @@ arXiv preprint arXiv:1410.0162 (2014).
 [2] Nichele, Stefano, and Andreas Molund. “_Deep reservoir computing using cellular
 automata._” arXiv preprint arXiv:1703.02806 (2017).
 """
-struct RECA{S, R, E, T, Q} <: AbstractReca
+struct RECA{S,R,E,N,T,Q} <: AbstractReca
     #res_size::I
     train_data::S
     automata::R
     input_encoding::E
-    nla_type::ReservoirComputing.NonLinearAlgorithm
+    nla_type::N
     states::T
     states_type::Q
 end
````
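The `nla_type::N` change swaps an abstractly typed field for a type parameter. A minimal, self-contained sketch of why that matters; the types here are generic Julia for illustration, not this package's actual types:

```julia
# Hypothetical types for illustration only.
abstract type AbstractNLA end
struct SomeNLA <: AbstractNLA end

# Abstract field: the compiler cannot infer the concrete type on access,
# so every use of r.nla_type goes through dynamic dispatch.
struct AbstractlyTyped
    nla_type::AbstractNLA
end

# Parametric field: each instance carries a concrete N, so accesses
# are fully inferable and methods specialize per nla_type.
struct ParametricallyTyped{N}
    nla_type::N
end
```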

src/models/hybridesn.jl

Lines changed: 4 additions & 9 deletions
````diff
@@ -94,19 +94,14 @@ traditional Echo State Networks with a predefined knowledge model [^Pathak2018].
 function HybridESN(model::KnowledgeModel, train_data::AbstractArray,
         in_size::Int, res_size::Int; input_layer=scaled_rand, reservoir=rand_sparse,
         bias=zeros32, reservoir_driver=RNN(),
-        nla_type::NonLinearAlgorithm=NLADefault(),
-        states_type::AbstractStates=StandardStates(), washout::Int=0,
+        nla_type=NLADefault(),
+        states_type=StandardStates(), washout::Int=0,
         rng::AbstractRNG=Utils.default_rng(), T=Float32,
         matrix_type=typeof(train_data))
     train_data = vcat(train_data, model.model_data[:, 1:(end-1)])
 
-    if states_type isa AbstractPaddedStates
-        in_size = size(train_data, 1) + 1
-        train_data = vcat(adapt(matrix_type, ones(1, size(train_data, 2))),
-            train_data)
-    else
-        in_size = size(train_data, 1)
-    end
+    in_size = size(train_data, 1)
+
 
     reservoir_matrix = reservoir(rng, T, res_size, res_size)
     #different from ESN, why?
````
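The constructor no longer special-cases padded states; `in_size` now always tracks the stacked training data. If downstream code relied on the removed branch, here is a standalone sketch of what it did; the helper name is hypothetical, and `adapt` comes from Adapt.jl exactly as in the deleted code:

```julia
using Adapt

# Hypothetical helper reproducing the deleted branch: prepend a row of
# ones so the readout can absorb a constant bias term.
function pad_train_data(train_data::AbstractMatrix,
        matrix_type::Type=typeof(train_data))
    ones_row = adapt(matrix_type, ones(1, size(train_data, 2)))
    return vcat(ones_row, train_data), size(train_data, 1) + 1
end

padded, in_size = pad_train_data(rand(Float32, 3, 100))  # in_size == 4
```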
