Commit 6e6fbcd

committed
Finished I/O section
1 parent 50e1ffe commit 6e6fbcd

1 file changed

+51
-11
lines changed

ALP_Tutorial.tex

Lines changed: 51 additions & 11 deletions
@@ -102,7 +102,7 @@ \section{ALP/GraphBLAS}\label{sec:alp_concepts}
102102
$
103103
\end{lstlisting}
104104

105-
\noindent \textbf{Exercise.} Double-check that you have the expected output from this example, as we will use its framework in the following exercises.
105+
\noindent \textbf{Exercise 1.} Double-check that you have the expected output from this example, as we will use its framework in the following exercises.
106106

107107
\noindent \textbf{Question.} Why is \texttt{argv[0]} not directly passed as input to \texttt{hello\_world}?
108108

@@ -120,7 +120,7 @@ \subsection{ALP/GraphBLAS Containers}
120120

121121
By default, newly instantiated vectors or matrices are empty, meaning they store no elements. You can query properties like length or dimensions via \texttt{grb::size(vector)} for vector length or \texttt{grb::nrows(matrix)} and \texttt{grb::ncols(matrix)} for matrix dimensions. The number of elements present within a container may be retrieved via \texttt{grb::nnz(container)}. Containers have a maximum capacity on the number of elements they may store; the capacity may be retrieved via \texttt{grb::capacity(container)} and on construction of a container is set to the maximum of its dimensions. For example, the initial capacity of \texttt{x} in the above is $100\ 000$, while that of \texttt{A} is $150\ 000$. The size of a container once initialised is fixed, while the capacity may increase during the lifetime of a container.
122122

123-
\noindent \textbf{Exercise.} Allocate vectors and matrices in ALP as follows:
123+
\noindent \textbf{Exercise 2.} Allocate vectors and matrices in ALP as follows:
124124
\begin{itemize}
125125
\item A \texttt{grb::Vector<double>} \texttt{x} of length 100, with initial capacity 100.
126126
\item A \texttt{grb::Vector<double>} \texttt{y} of length 1\ 000, with initial capacity 200.
@@ -147,10 +147,10 @@ \subsection{Basic Container I/O}
147147
Similarly, \texttt{grb::set(vector,scalar)} sets all elements of a given vector equal to the given scalar, resulting in a full (dense) vector.
148148
By contrast, \texttt{grb::setElement(vector,scalar,index)} sets only a given element at a given index equal to a given scalar.
149149

150-
\noindent \textbf{Exercise.} Start from a copy of \texttt{alp\_hw.cpp} and modify the \texttt{hello\_world} function to allocate two vectors and a matrix as follows:
150+
\noindent \textbf{Exercise 3.} Start from a copy of \texttt{alp\_hw.cpp} and modify the \texttt{hello\_world} function to allocate two vectors and a matrix as follows:
151151
\begin{itemize}
152152
\item a \texttt{grb::Vector<bool>} \texttt{x} and \texttt{y} both of length $497$ with capacities $497$ and $1$, respectively;
153-
\item a \texttt{grb::Matrix<void>} \texttt{A} of size $497\times497$ and capacity $1\ 721$.
153+
\item a \texttt{grb::Matrix<void>} \texttt{A} of size $497\times497$ and capacity $1\ 727$.
154154
\end{itemize}
155155
Then, initialise $y$ with a single value \texttt{true} at index $200$, and initialise $x$ with \texttt{false} everywhere. Print the number of nonzeroes in $x$ and $y$. After compilation and execution, the output should look similar to:
156156
\begin{lstlisting}
@@ -160,17 +160,57 @@ \subsection{Basic Container I/O}
160160
...
161161
\end{lstlisting}
162162

163-
% TODO describe output iterators and print the contents of $y$
163+
\noindent \textbf{Bonus question.} Print the capacity of $y$. If the returned value is unexpected given the specification in the user documentation, is this a bug in ALP?
164164

165-
% TODO use output iterators to double-check $x$ has $497$ values and that all those values equal \texttt{false}
166-
167-
% TODO use input iterators to build A from west0497.mtx. Have it print the number of nonzeroes after buildMatrixUnique.
165+
ALP/GraphBLAS containers are compatible with standard STL output iterators. For example, the following for-loop prints all entries of $y$:
166+
\begin{lstlisting}
167+
for( const auto &pair : y ) {
168+
std::cout << "y[ " << pair.first << " ] = " << pair.second << "\n";
169+
}
170+
\end{lstlisting}
168171

169-
% TODO print the capacity of $y$.
172+
\noindent \textbf{Exercise 4.} Use output iterators to double-check that $x$ has $497$ values and that all those values equal \texttt{false}.
170173

171-
% TODO Download west0497.mtx and run the application
174+
Matrices are commonly distributed in standard file exchange formats, such as MatrixMarket \texttt{.mtx}. To facilitate working with such files, ALP contains utilities for reading standard formats. These utilities are not included with \texttt{graphblas.hpp} and must instead be included explicitly:
175+
\begin{lstlisting}
176+
#include <graphblas/utils/parser.hpp>
177+
\end{lstlisting}
178+
Including the above parser utility defines the \texttt{MatrixFileReader} class. Its constructor takes a filename plus a Boolean that indicates whether vertices are numbered consecutively (as required in the case of MatrixMarket files); some graph repositories, e.g.\ SNAP, have non-consecutively numbered vertices, which could be an artifact of how the data is constructed or due to post-processing. In this case, passing \texttt{false} as the second argument to the parser maps the non-consecutive vertex IDs to a consecutive range instead, thus packing the graph structure in a minimally-sized sparse matrix. In this tutorial, however, we stick to MatrixMarket files and therefore always pass \texttt{true}:
179+
\begin{lstlisting}
180+
grb::utils::MatrixFileReader< double > parser( in, true );
181+
\end{lstlisting}
182+
After instantiation, the parser defines STL-compatible iterators that are enriched for use with sparse matrices; e.g., one may issue
183+
\begin{lstlisting}
184+
const auto iterator = parser.begin();
185+
std::cout << "First parsed entry: ( " << iterator.i() << ", " << iterator.j() << " ) = " << iterator.v() << "\n";
186+
\end{lstlisting}
187+
which should print, on execution,
188+
\begin{lstlisting}
189+
First parsed entry: ( 495, 496 ) = 0.897354
190+
\end{lstlisting}
191+
Note that the template argument to \texttt{MatrixFileReader} defines the value type of the sparse matrix nonzero values. The start-end iterator pair from this parser is compatible with the \texttt{grb::buildMatrixUnique} ALP/GraphBLAS primitive, where the suffix \texttt{Unique} indicates that the iterator pair should never iterate over a nonzero at the same matrix position $(i,j)$ more than once. Hence, reading the matrix into the ALP/GraphBLAS container $A$ proceeds simply as
192+
\begin{lstlisting}
193+
grb::RC rc = grb::buildMatrixUnique(
194+
A,
195+
parser.begin( grb::SEQUENTIAL ), parser.end( grb::SEQUENTIAL ),
196+
grb::SEQUENTIAL
197+
);
198+
assert( rc == grb::SUCCESS );
199+
\end{lstlisting}
200+
The type \texttt{grb::RC} is the standard return type; ALP primitives\footnote{that are not simple `getters' like \texttt{grb::nnz}} always return an error code, and, if no error is encountered, return \texttt{grb::SUCCESS}. Iterators in ALP may be either \emph{sequential} or \emph{parallel}. A sequential start-end iterator pair, such as the one retrieved from the parser in the above snippet, iterates over all elements of the underlying container (in this case, all nonzeroes in the sparse matrix file). A parallel iterator pair, however, only retrieves some subset of nonzeroes $V_s$, where $s$ is the process ID and there are a total of $p$ subsets $V_i$, where $p$ is the total number of processes. These subsets are pairwise disjoint, while the union over all $V_i$ corresponds to all elements in the underlying container. Parallel iterators are useful e.g.\ when launching an ALP/GraphBLAS program using multiple processes to benefit from distributed-memory parallelism; in such cases, it would be wasteful if every process iterated over all data elements on data ingestion, and parallel I/O is commonly preferred instead. In the above snippet, the primitive for building the matrix must be aware of which type of iterator pair is given, and hence the last argument repeats that the iterators passed are, indeed, sequential iterators.
172201

173-
% TODO Bonus question: if this is >1 and looking at the user doc, why is this OK?
202+
\noindent \textbf{Exercise 5.} Use input iterators to build $A$ from \texttt{west0497.mtx}, and print the number of nonzeroes in $A$ after the call to \texttt{grb::buildMatrixUnique}. Then modify the \texttt{main} function to take a path to a \texttt{.mtx} file as the first program argument, and pass that path to the ALP/GraphBLAS program. Finally, find and download the west0497 matrix from the SuiteSparse matrix collection, and run the application. If all went well, the output should be something like:
203+
\begin{lstlisting}
204+
Info: grb::init (reference) called.
205+
elements in x: 497
206+
elements in y: 1
207+
y[ 200 ] = 1
208+
Info: MatrixMarket file detected. Header line: ``%%MatrixMarket matrix coordinate real general''
209+
Info: MatrixFileReader constructed for /home/yzelman/Documents/datasets/graphs-and-matrices/west0497.mtx: an 497 times 497 matrix holding 1727 entries. Type is MatrixMarket and the input is general.
210+
First parsed entry: ( 495, 496 ) = 0.897354
211+
nonzeroes in A: 1727
212+
Info: grb::finalize (reference) called.
213+
\end{lstlisting}
174214

175215
\subsection{Semirings and Algebraic Operations}
176216
