main_isc.tex: 33 additions & 37 deletions
@@ -183,10 +183,10 @@ \section{Background and limitations}
 Other frameworks, such as ACADO~\cite{houska2011acado} and \texttt{InfiniteOpt.jl}~\cite{pulsipher2022unifying}, which cleverly leverage the modeling power of JuMP~\cite{dunning2017jump}, also follow the same CPU-centric paradigm.
 %
 This CPU focus limits scalability and real-time performance for large or time-critical problems that could benefit from GPU parallelism.
-While some libraries provide GPU-accelerated components, none deliver a fully integrated, GPU-native workflow for nonlinear optimal control.
-See, nonetheless, the nice attempt \cite{jeon2024} trying to combine the CasADi API with PyTorch so as to evaluate part of the generated code on GPU.
+While some libraries provide GPU-accelerated components, very few deliver a fully integrated, GPU-native workflow for nonlinear optimal control.
+See, nonetheless, the notable attempt of \cite{jeon2024} to combine the CasADi API with PyTorch so as to evaluate part of the generated code on the GPU; see also the recent release of \texttt{InfiniteOpt.jl} \cite{Gondosiswanto2025advances}, which follows a line similar to ours, leveraging the SIMD modeling abilities of \texttt{ExaModels.jl} for the GPU.
 %
-Our work fills this gap with a GPU-first toolchain that unifies modeling, differentiation, and solver execution, addressing the challenges of solving large-scale sparse NLPs.
+Our work contributes to filling this gap with a GPU-first toolchain that unifies modeling, differentiation, and solver execution, addressing the challenges of solving large-scale sparse NLPs. In contrast to other available solvers, our framework encompasses both direct transcription of optimal control problems (that is, discretization into a mathematical program suited to optimization solvers and GPU evaluation, as discussed in the rest of the paper) and indirect methods (\emph{a.k.a.} shooting; we refer the reader to \cite{OC_jl} for further details).

 \section{SIMD parallelism in direct optimal control} \label{s3}
 When discretized by \emph{direct transcription}, optimal control problems (OCPs) possess an inherent structure that naturally supports SIMD parallelism.
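To make this inherent structure explicit, here is a generic sketch in our own notation (not taken verbatim from the paper), assuming a Lagrange cost with integrand $\ell$, dynamics $f$, and a uniform grid $t_j = j\,\Delta t$ with trapezoidal transcription:
\begin{align*}
\min_{\substack{x_0,\dots,x_N\\ u_0,\dots,u_N}} \quad
  & \sum_{j=0}^{N-1} \frac{\Delta t}{2}\,
    \bigl(\ell(x_j,u_j) + \ell(x_{j+1},u_{j+1})\bigr)\\
\text{s.t.} \quad
  & x_{j+1} - x_j = \frac{\Delta t}{2}\,
    \bigl(f(x_j,u_j) + f(x_{j+1},u_{j+1})\bigr),
    \quad j = 0,\dots,N-1.
\end{align*}
The same cost and defect stencil repeats identically at every grid point, which is exactly the kind of regular, data-parallel pattern that SIMD evaluation exploits.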
@@ -285,7 +285,7 @@ \section{From optimal control models to SIMD abstraction}
 $$ x(0) = (-1, 0),\quad x(1) = (0,0). $$
 The strength of the DSL of the package \texttt{OptimalControl.jl} is to offer a syntax as close as possible to this mathematical formulation.\footnote{Note that one can actually use unicode characters to denote derivatives, integrals, \emph{etc.}, making this closeness even more striking; check the \texttt{OptimalControl.jl} documentation online.} The translation of this optimal control problem thus reads:

-\begin{minted}{julia}
+{\footnotesize\begin{minted}{julia}
 ocp = @def begin
     t in [0, 1], time
     x in R^2, state
@@ -297,8 +297,9 @@ \section{From optimal control models to SIMD abstraction}
     integral( 0.5u(t)^2 ) => min
 end
 \end{minted}
+}

-The intial and final times are fixed in this case but they could be additional unknowns (see, Appendix \ref{sa1}, where the Goddard benchmark problem is modeled with a free final time. Users can also declare additional finite-dimensional parameters (or \emph{variables}) to be optimized. Furthermore, extra constraints on the state, control, or other quantities can be imposed as needed.
+The initial and final times are fixed in this case but they could be additional unknowns (see Appendix \ref{sa1}, where the Goddard benchmark problem is modeled with a free final time). Users can also declare additional finite-dimensional parameters (or \emph{variables}) to be optimized. Furthermore, extra constraints on the state, control, or other quantities can be imposed as needed.
 At this stage the crux is to seamlessly parse the abstract problem description and compile it on the fly into a discretized nonlinear optimization problem.
 We achieve this by exploiting two features.
 First, the DSL syntax is fully compatible with standard Julia, allowing us to use the language's built-in lexical and syntactic parsers.
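For readers without a Julia setup, the effect of direct transcription on this example can be reproduced in a minimal, hypothetical Python sketch. It assumes the classic double-integrator dynamics $\dot x_1 = x_2$, $\dot x_2 = u$ (the diff omits the dynamics lines of the DSL block), and uses SciPy's SLSQP in place of the paper's GPU toolchain; the grid size, variable layout, and initial guess are our own choices, mimicking the defaults described later.

```python
# Sketch only: trapezoidal transcription of the double-integrator example
# (min \int 0.5 u^2, x1' = x2, x2' = u, x(0) = (-1, 0), x(1) = (0, 0)),
# solved with SciPy instead of OptimalControl.jl / ExaModels.jl.
import numpy as np
from scipy.optimize import minimize

N = 20              # grid intervals
dt = 1.0 / N
n = N + 1           # grid points

def split(z):
    # decision vector z stacks x1, x2 and u sampled on the grid
    return z[:n], z[n:2 * n], z[2 * n:]

def cost(z):
    _, _, u = split(z)
    f = 0.5 * u**2                                        # Lagrange integrand
    return dt * (0.5 * f[0] + f[1:-1].sum() + 0.5 * f[-1])  # trapezoidal rule

def defects(z):
    x1, x2, u = split(z)
    d1 = x1[1:] - x1[:-1] - 0.5 * dt * (x2[1:] + x2[:-1])  # x1' = x2
    d2 = x2[1:] - x2[:-1] - 0.5 * dt * (u[1:] + u[:-1])    # x2' = u
    bc = np.array([x1[0] + 1.0, x2[0], x1[-1], x2[-1]])    # boundary conditions
    return np.concatenate([d1, d2, bc])

res = minimize(cost, 0.1 * np.ones(3 * n),                 # constant init, as in the DSL defaults
               constraints={"type": "eq", "fun": defects},
               method="SLSQP", options={"maxiter": 500})
print(res.success, res.fun)   # continuous-time optimum of this problem is 6
```

This is the same NLP shape the DSL compiles to, only without the SIMD abstraction, sparsity exploitation, or GPU evaluation that the toolchain provides.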
@@ -320,20 +321,17 @@ \section{From optimal control models to SIMD abstraction}
 Let us take a brief look at the generated code for this simple example.
 The code is wrapped in a function whose parameters capture the key aspects of the transcription process: the numerical scheme (here trapezoidal), the grid size (here uniform), the backend (CPU or GPU), the initial values for variables, states, and controls (defaulting to nonzero constants across the grid), and the base precision for vectors (defaulting to 64-bit floating point):

-{\small
-\begin{minted}{julia}
+{\footnotesize\begin{minted}{julia}
 function def(; scheme=:trapeze,
     grid_size=250, init=(0.1, 0.1, 0.1),
     backend=CPU(), base_type=Float64)
 \end{minted}
 }

 \noindent The state declaration is compiled into an \texttt{ExaModels.jl} variable representing a $2\times (N + 1)$ array, where $N$ is the grid size. Lower and upper bounds, plus initial values, can be specified, and constraints are vectorized across grid points. Internally, the DSL uses metaprogramming to generate unique variable names and ensure proper initialization, while any syntactic or semantic errors are caught and reported at runtime.
-% The state declaration is compiled into an \texttt{ExaModels.jl} variable declaration, here a $2 \times (N + 1)$ array where $N$ is the grid size. Note that lower and upper bounds can be provided at this step (with a first \verb+for+ statement to vectorize the constraint over all grid points), as well as initial values for the optimizer and base type for vector elements (here fp64).
-% The whole declaration block is wrapped in a \verb+try ... catch+ statement so that syntactic (or semantic) errors can be returned to the user at runtime:
+Note that bounds can be provided at this step (with a first \verb+for+ statement to vectorize the constraint over all grid points), as well as initial values for the optimizer and the base type for vector elements (here fp64). The whole declaration block is wrapped in a \verb+try ... catch+ statement so that syntactic (or semantic) errors can be returned to the user at runtime:

-{\small
-\begin{minted}{julia}
+{\footnotesize\begin{minted}{julia}
 x = begin
     local ex
     try
@@ -346,10 +344,10 @@ \section{From optimal control models to SIMD abstraction}
         for (var"i##275", var"j##276") =
             Base.product(1:2, 0:grid_size)],
             start = init[2])
-catch ex
-println("Line ", 2, ": ",
-"(x in R^2, state)")
-throw(ex)
+    catch ex
+        println("Line ", 2, ": ",
+            "(x in R^2, state)")
+        throw(ex)
     end
 end
 \end{minted}
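The wrap-and-report pattern used by the generated code above is not Julia-specific. A hypothetical, language-agnostic sketch in Python (names `run_declaration`, `line_no`, `source`, and `thunk` are ours, purely illustrative):

```python
# Sketch of the error-reporting pattern: each generated declaration is
# wrapped so that a failure is printed with the line number and text of
# the DSL statement that produced it, then the exception is re-raised.
def run_declaration(line_no, source, thunk):
    try:
        return thunk()
    except Exception:
        print(f"Line {line_no}: ({source})")
        raise

# A successful declaration just returns its value:
x = run_declaration(2, "x in R^2, state", lambda: [0.1, 0.1])

# A failing one surfaces the offending DSL line before propagating:
try:
    run_declaration(3, "u in R, control", lambda: 1 / 0)
except ZeroDivisionError:
    pass  # the original exception is preserved for the caller
```

The key point, as in the generated Julia code, is that the original exception is rethrown unchanged so the caller still sees the real failure.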
@@ -371,10 +369,9 @@ \section{From optimal control models to SIMD abstraction}
 % \end{minted}
 % }

-\noindent The initial-state boundary constraint must be applied across all state dimensions. This is achieved using the \verb+for+ generator.
-A runtime dimension check ensures that the specified bounds match the length of the state vector being addressed:
-{\small
-\begin{minted}{julia}
+\noindent The initial-state boundary constraint must be applied across all state dimensions. This is achieved using the \verb+for+ generator. A runtime dimension check ensures that the specified bounds match the length of the state vector being addressed:
+
+{\footnotesize\begin{minted}{julia}
 length([-1, 0]) ==
 length([-1, 0]) == length(1:2)
 || throw("wrong bound dimension")
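The generated guard is ordinary runtime code. An equivalent check in Python (our illustration, not generated output; `check_bound` is a hypothetical name) would be:

```python
def check_bound(bound, components):
    # Mirror of the generated guard: the bound vector must have one
    # entry per state component being constrained (here 1:2, i.e. 2).
    if len(bound) != len(components):
        raise ValueError("wrong bound dimension")
    return True

check_bound([-1, 0], range(1, 3))   # OK: two bounds, two components
```

Catching this mismatch at the point of declaration, rather than deep inside the solver, is what makes the DSL's error messages actionable.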
@@ -387,8 +384,7 @@ \section{From optimal control models to SIMD abstraction}

 \noindent The first equation of the ODE system is discretized using the trapezoidal scheme, and the corresponding expression (here the right-hand side is just $x_2(t)$) is declared thanks to the \verb+for+ generator tailored for SIMD abstraction:

-{\small
-\begin{minted}{julia}
+{\footnotesize\begin{minted}{julia}
 constraint(var"p_ocp##266j",
     ((x[1, var"j##291" + 1] -
     x[1, var"j##291"]) - (var"dt##268"
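Written mathematically (our rewriting; the generated expression is truncated in the listing), the trapezoidal defect enforced by this constraint for the first state component, whose right-hand side is $x_2$, is
$$ x_{1,j+1} - x_{1,j} - \frac{\Delta t}{2}\,\bigl(x_{2,j} + x_{2,j+1}\bigr) = 0, \qquad j = 0,\dots,N-1. $$
One such identical equation per grid interval is precisely what the \verb+for+ generator expresses for SIMD evaluation.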
@@ -400,8 +396,7 @@ \section{From optimal control models to SIMD abstraction}

 \noindent The same goes for the second dimension of the ODE, and for the Lagrange integral cost as well, where the same numerical scheme (trapezoidal rule again) is employed for consistency (defining two objectives actually computes their sum):
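For the running example, applying the same trapezoidal rule to the Lagrange cost gives (our rewriting of the scheme, consistent with the defect constraints above):
$$ \int_0^1 \frac{1}{2}\,u(t)^2\,dt \;\approx\; \sum_{j=0}^{N-1} \frac{\Delta t}{2}\left(\frac{u_j^2}{2} + \frac{u_{j+1}^2}{2}\right). $$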
0 commit comments