ALP_Transition_Path_Tutorial.tex (17 additions, 16 deletions)
@@ -1,15 +1,14 @@
\section{Introduction to ALP and Transition Paths}\label{sec:intro}
\textbf{Algebraic Programming (ALP)} is a C++ framework for high-performance linear algebra that can auto-parallelize and auto-optimize your code. A key feature of ALP is its transition path APIs, which let you use ALP through standard interfaces without changing your existing code. In practice, ALP generates drop-in replacements for established linear algebra APIs. You simply re-link your program against ALP's libraries to get optimized performance (no code modifications needed). ALP v0.8 provides transition-path libraries for several standards, including the NIST Sparse BLAS and a CRS-based iterative solver interface (ALP/Solver). This means you can take an existing C/C++ program that uses a supported API and benefit from ALP's optimizations (such as vectorization and parallelism) just by linking with ALP's libraries [1].
One of these transition paths, and the focus of this tutorial, is ALP's \textbf{sparse Conjugate Gradient (CG) solver}. This CG solver accepts matrices in Compressed Row Storage (CRS) format (also known as CSR) and solves $Ax=b$ using an iterative method. Under the hood it leverages ALP's non-blocking execution model, which overlaps computations and memory operations for efficiency. From the user's perspective, however, the solver is accessed via a simple C-style API that feels synchronous. In this workshop, we'll learn how to use this CG solver interface step by step: from setting up ALP, to coding a solution for a small linear system, to building and running the program.
\section{Setup: Installing ALP and Preparing to Use the Solver}\label{sec:setup}
This section explains how to install ALP on a Linux system and compile a simple example. ALP (Algebraic Programming) provides a C++17 library implementing the GraphBLAS interface for linear-algebra-based computations.
\subsection{Installation on Linux}
\begin{enumerate}
\item Install prerequisites: Ensure you have a C++11 compatible compiler (e.g. \texttt{g++} 4.8.2 or later) with OpenMP support, CMake (>= 3.13) and GNU Make, plus development headers for libNUMA and POSIX threads.
@@ -39,7 +38,7 @@ \subsection*{Installation on Linux}
\begin{lstlisting}[language=bash]
$ source ../install/bin/setenv
\end{lstlisting}
This script updates paths to make ALP's compiler wrapper and libraries available.
\item Compile an example: ALP provides a compiler wrapper \texttt{grbcxx} to compile programs that use the ALP/GraphBLAS API. This wrapper automatically adds the correct include paths and links against the ALP library and its dependencies. For example, to compile the provided sp.cpp sample:
\begin{lstlisting}[language=bash]
@@ -54,10 +53,10 @@ \subsection*{Installation on Linux}
(The \texttt{grbrun} tool is more relevant when using distributed backends or controlling the execution environment; for basic usage, the program can also be run directly.)
\end{enumerate}
You can also specify a backend with the \texttt{-b} flag. For instance, \texttt{-b reference} builds a sequential version, while \texttt{-b reference\_omp} enables ALP's shared-memory (OpenMP) parallel backend. If you built ALP with distributed-memory support, you might use \texttt{-b hybrid} or \texttt{-b bsp1d} for hybrid or MPI-style backends. In those cases, you would run the program via \texttt{grbrun} (which handles launching multiple processes); for this tutorial, however, we will use a single-process, multi-threaded backend, so running the program normally is fine.
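For example, assuming the \texttt{setenv} script from the setup step has been sourced so that \texttt{grbcxx} is on the \texttt{PATH}, selecting a backend looks like this:

```bash
$ grbcxx -b reference sp.cpp -o sp_seq       # sequential reference backend
$ grbcxx -b reference_omp sp.cpp -o sp_omp   # shared-memory (OpenMP) backend
$ ./sp_omp                                   # single-process run; no grbrun needed
```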
\\
\textbf{Direct linking option}: If you prefer to compile with your usual compiler, you need to include ALP's headers and link against the ALP libraries manually. For the CG solver transition path, that typically means linking against the sparse solver library (e.g. \texttt{libspsolver\_shmem\_parallel} for the parallel version) and any core ALP libraries it depends on. For example, if ALP is installed in \texttt{/opt/alp}, you might compile with:
\begin{lstlisting}[language=bash]
gcc -I/opt/alp/include -L/opt/alp/lib \
@@ -66,27 +65,29 @@ \subsection*{Installation on Linux}
\end{lstlisting}
(ALP's documentation provides details on which libraries to link for each backend [3].) Using grbcxx is recommended for simplicity, but it's good to know what happens under the hood. Now that our environment is set up, let's look at the CG solver API.
\section{Overview of ALP's Non-Blocking Sparse CG API}\label{sec:api}
The ALP/Solver transition path provides a C-style interface for initializing and running a Conjugate Gradient solver. All functions are exposed via a header (e.g. \texttt{solver.h} in ALP's include directory) and use simple types like pointers and handles. The main functions in this API are:
\begin{itemize}
\item\textbf{sparse\_cg\_init(\&handle, n, vals, cols, offs):} Initializes a CG solver instance. It allocates/assigns a solver context to the user-provided sparse\_cg\_handle\_t (an opaque handle type defined by ALP). The matrix $A$ is provided in CRS format by three arrays: vals (the nonzero values), cols (the column indices for each value), and offs (offsets in the vals array where each row begins). The dimension n (number of rows, which should equal number of columns for $A$) is also given. After this call, the handle represents an internal solver state with matrix $A$ stored. Return: Typically returns 0 on success (and a non-zero error code on failure) [4].
\item\textbf{sparse\_cg\_set\_preconditioner(handle, func, data):} (Optional) Sets a preconditioner for the iterative solve. The user provides a function func that applies the preconditioner $M^{-1}$ to a vector (i.e. solves $Mz = r$ for a given residual $r$), along with a user data pointer. ALP will call this func(z, r, data) in each CG iteration to precondition the residual. If you don't call this, the solver will default to no preconditioning (i.e. $M = I$). You can use this to plug in simple preconditioners (like Jacobi, with data holding the diagonal of $A$) or even advanced ones, without modifying the solver code. Return: 0 on success, or error code if the handle is invalid, etc.
\item\textbf{sparse\_cg\_solve(handle, x, b):} Runs the CG iteration to solve $Ax = b$. Here b is the right-hand side vector (input), and x is the solution vector (output). You should allocate both of these arrays of length n beforehand. The solver will iterate until it converges to a solution within some default tolerance or until it reaches an iteration limit. On input, you may put an initial guess in x. If not, it's safe to initialize x to zero (the solver will start from $x_0 = 0$ by default in that case). Upon return, x will contain the approximate solution. Return: 0 if the solution converged (or still 0 if it ran the maximum iterations – specific error codes might indicate divergence or other issues in future versions).
\item \textbf{sparse\_cg\_destroy(handle):} Destroys the solver instance and releases any resources associated with the given handle. After this call the handle becomes invalid. Always call this when you are done solving to avoid memory leaks. Return: 0 on success (and the handle pointer may be set to NULL or invalid after).
\end{itemize}
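Putting the four functions together, the typical call sequence looks as follows. This is a minimal sketch, not verbatim ALP code: the header name \texttt{solver.h} follows the description above, the integer types for \texttt{cols}/\texttt{offs} are assumed to be \texttt{int} (check the actual typedefs in the header), and the matrix values and the Jacobi preconditioner are our own illustrative choices:

```c
#include <stdio.h>
#include <stddef.h>

#include <solver.h>  /* ALP/Solver transition-path header, as described above */

#define N 3

/* Illustrative Jacobi preconditioner: solves M z = r with M = diag(A).
 * The signature matches the func(z, r, data) convention described above;
 * data points at an array holding the diagonal of A. */
static void jacobi( double *z, const double *r, void *data ) {
	const double *diag = (const double *) data;
	for( size_t i = 0; i < N; ++i ) { z[ i ] = r[ i ] / diag[ i ]; }
}

int main( void ) {
	/* An illustrative 3x3 SPD matrix in CRS format:
	 *     | 4 1 0 |
	 * A = | 1 4 1 |
	 *     | 0 1 4 |
	 */
	double vals[] = { 4.0, 1.0, 1.0, 4.0, 1.0, 1.0, 4.0 };
	int    cols[] = { 0, 1, 0, 1, 2, 1, 2 };
	int    offs[] = { 0, 2, 5, 7 };
	double diag[] = { 4.0, 4.0, 4.0 };  /* diagonal of A, for the preconditioner */
	double b[ N ] = { 1.0, 2.0, 3.0 };
	double x[ N ] = { 0.0, 0.0, 0.0 };  /* zero initial guess */

	sparse_cg_handle_t handle;
	if( sparse_cg_init( &handle, N, vals, cols, offs ) != 0 ) {
		fprintf( stderr, "solver initialisation failed\n" );
		return 1;
	}
	(void) sparse_cg_set_preconditioner( handle, jacobi, diag );  /* optional */
	if( sparse_cg_solve( handle, x, b ) == 0 ) {
		printf( "x = ( %g, %g, %g )\n", x[ 0 ], x[ 1 ], x[ 2 ] );
	}
	return sparse_cg_destroy( handle );
}
```

Compile and link this against the sparse solver library as discussed in the setup section.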
This API is non-blocking in the sense that internally ALP may overlap operations (like sparse matrix-vector multiplications and vector updates) and use asynchronous execution for performance. However, the above functions themselves appear synchronous. For example, sparse\_cg\_solve will only return after the solve is complete (there’s no separate “wait” call exposed in this C interface). The benefit of ALP’s approach is that you, the developer, don’t need to manage threads or message passing at all. ALP’s GraphBLAS engine handles parallelism behind the scenes. You just call these routines as you would any standard library. Now, let’s put these functions into practice with a concrete example.
\section{Example: Solving a Linear System with ALP's CG Solver}\label{sec:example}
Suppose we want to solve a small system $Ax = b$ to familiarize ourselves with the CG interface. We will use the following $3\times3$ symmetric positive-definite matrix $A$: $$ A = \begin{pmatrix} 4 & 1 & 0\\
1 & 3 & -1\\
@@ -192,7 +193,7 @@ \section*{Example: Solving a Linear System with ALP’s CG Solver}
\section{Building and Running the Example}\label{sec:building}
To compile the above code with ALP, we will use the direct linking option as discussed.
ALP_Tutorial.tex (6 additions, 6 deletions)
@@ -1,8 +1,8 @@
\section{Quick Start}\label{sec:quick_start}
This section explains how to install ALP on a Linux system and compile a simple example. ALP (Algebraic Programming) provides a C++17 library implementing the GraphBLAS interface for linear-algebra-based computations. To get started quickly:
\subsection{Installation on Linux}
\begin{enumerate}
\item Install prerequisites: Ensure you have a C++11 compatible compiler (e.g. \texttt{g++} 4.8.2 or later) with OpenMP support, CMake (>= 3.13) and GNU Make, plus development headers for libNUMA and POSIX threads.
@@ -46,7 +46,7 @@ \subsection*{Installation on Linux}
\end{enumerate}
After these steps, ALP is installed and you are ready to develop ALP-based programs. In the next sections we introduce core ALP concepts and walk through a simple example program.
\section{Introduction to ALP Concepts}\label{sec:alp_concepts}
ALP exposes a programming model similar to the GraphBLAS standard, using algebraic containers (vectors, matrices, etc.) and algebraic operations on those containers. This section covers the basic data structures, the algebraic structures (semirings) that define how arithmetic is done, and key primitive operations (such as matrix-vector multiply and element-wise operations).
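As a taste of what this looks like in code, here is a hedged C++ sketch of a sparse matrix-vector multiply under the conventional plus-times semiring. The header name and the exact template parameters follow ALP's bundled examples as we understand them; consult the ALP documentation for the authoritative spelling:

```cpp
#include <graphblas.hpp>

// y = A * x over the usual (+, *) arithmetic, expressed as a GraphBLAS semiring
void spmv( grb::Vector< double > &y,
           const grb::Matrix< double > &A,
           const grb::Vector< double > &x ) {
	// The semiring bundles the additive and multiplicative operators
	// together with their identities:
	grb::Semiring<
		grb::operators::add< double >, grb::operators::mul< double >,
		grb::identities::zero, grb::identities::one
	> plusTimes;
	grb::set( y, 0.0 );               // clear the output vector
	grb::mxv( y, A, x, plusTimes );   // sparse matrix-vector multiply
}
```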
@@ -63,7 +63,7 @@ \subsection{Vectors and Matrices in ALP}
By default, new vectors/matrices start empty (with no stored elements). You can query properties like length or dimensions via \texttt{grb::size(vector)} for vector length, \texttt{grb::nrows(matrix)} and \texttt{grb::ncols(matrix)} for matrix dimensions, and \texttt{grb::nnz(container)} for the number of stored nonzero elements.
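For instance, a short sketch of these queries (assuming the main ALP header is \texttt{graphblas.hpp}, as used by ALP's bundled examples; real programs may additionally use \texttt{grb::Launcher} to select and start a backend, omitted here for brevity):

```cpp
#include <iostream>
#include <graphblas.hpp>

int main() {
	grb::Vector< double > v( 10 );   // vector of length 10, initially empty
	grb::Matrix< double > A( 5, 8 ); // 5-by-8 matrix, initially no nonzeroes
	std::cout << "size(v) = " << grb::size( v ) << "\n"
	          << "dims(A) = " << grb::nrows( A ) << " x " << grb::ncols( A ) << "\n"
	          << "nnz(A)  = " << grb::nnz( A ) << "\n"; // 0 for a new container
	return 0;
}
```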
\subsubsection{Exercise: Allocating Vectors and Matrices in ALP}
Write a C++ program that uses ALP to allocate two vectors and one matrix as follows:
To illustrate ALP usage, let's create a simple C++ program that:
\begin{itemize}
@@ -429,7 +429,7 @@ \section{Simple Example}
\end{lstlisting}
\section{Makefile and CMake Instructions}\label{sec:build_instructions}
Finally, we provide guidance on compiling and running the above example in your own development environment. If you followed the installation steps and used \texttt{grbcxx}, compilation is straightforward. Here we outline two approaches: using the ALP wrapper scripts, and integrating ALP manually via a build system.
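As a starting point for the manual integration route, a minimal Makefile sketch might look like the following. The install prefix and library names here are assumptions based on the direct-linking discussion earlier; check your ALP installation and its documentation for the definitive include and link flags:

```makefile
# Adjust ALP_PREFIX to wherever ALP is installed on your system.
ALP_PREFIX ?= /opt/alp
CXX        ?= g++
CXXFLAGS   += -std=c++17 -fopenmp -I$(ALP_PREFIX)/include
LDFLAGS    += -L$(ALP_PREFIX)/lib
LDLIBS     += -lspsolver_shmem_parallel -lnuma -lpthread

example: example.cpp
	$(CXX) $(CXXFLAGS) $< -o $@ $(LDFLAGS) $(LDLIBS)

.PHONY: clean
clean:
	rm -f example
```

Using the \texttt{grbcxx} wrapper remains the simpler and safer option, since it always passes the flags that match your build configuration.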