@@ -284,7 +284,7 @@ \chapter{The Futhark Language}
284284this program computes the dot product $ \Sigma _{i} x_{i}\cdot {}y_{i}$
285285of two vectors of integers:
286286
287- \inplisting {src/dotprod.fut}
287+ \lstinputlisting [firstline=5] {src/dotprod.fut}
288288
289289In Futhark, the notation for an array of element type $ t$ is
290290\texttt {[]$ t$ }. The program declares a function called \texttt {main }
@@ -572,6 +572,71 @@ \section{Array Operations}
572572functional languages, they have implicitly parallel semantics, and
573573some restrictions to preserve those semantics.
574574
575+ In addition to the array combinators, there are constructs for
576+ \textit {constructing } arrays. We already demonstrated literal arrays.
577+ Additionally, there is \texttt {iota }, which creates an array of a
578+ range of integers starting from zero:
579+
580+ \ begin{lstlisting}
581+ iota 10 == [0,1,2,3,4,5,6,7,8,9]
582+ \end {lstlisting }
583+
584+ The name \texttt {iota } comes from APL, one of the earliest array
585+ programming languages, and is supposed to be mnemonic for creating
586+ \textit {index spaces } of arrays. Put another way, \texttt {iota n }
587+ produces an array of valid indices into an array of size \texttt {n }.
588+
589+ The \texttt {replicate } construct is used to create an array of some
590+ size, with all elements having the same given value:
591+
592+ \ begin{lstlisting}
593+ replicate 3 42 == [42,42,42]
594+ \end {lstlisting }
595+
596+ We can use \texttt {concat } to combine several arrays:
597+
598+ \ begin{lstlisting}
599+ concat (iota 2) ([1,2,3]) (replicate 4 1) ==
600+ [0,1,1,2,3,1,1,1,1i32]
601+ \end {lstlisting }
602+
603+ Note that the parentheses around the literal array are necessary: if
604+ they were not present, this expression would be parsed as an attempt
605+ to index the expression \texttt {iota 2 } using \texttt {[1,2,3] } as the
606+ indices. This would of course result in a type error.
607+
608+ We can use \texttt {zip } to transform $ n$ arrays to a single array of
609+ $ n$ -tuples:
610+
611+ \ begin{lstlisting}
612+ zip [1,2,3] [true,false,true] [7.0,8.0,9.0] ==
613+ [(1,true,7.0),(2,false,8.0),(3,true,9.0)]
614+ \end {lstlisting }
615+
616+ Note that the input arrays may have different element types. We can use
617+ \texttt {unzip } to perform the inverse transformation:
618+
619+ \ begin{lstlisting}
620+ unzip [(1,true,7.0),(2,false,8.0),(3,true,9.0)] ==
621+ ([1,2,3], [true,false,true], [7.0,8.0,9.0])
622+ \end {lstlisting }
623+
624+ Be aware that \texttt {zip } requires all of the input arrays to have
625+ the same size. Transforming between arrays of tuples and tuples of
626+ arrays is common in Futhark programs, as many array operations accept
627+ only one array as input. Due to a clever implementation technique,
628+ \texttt {zip } and \texttt {unzip } also have no runtime cost (no copying
629+ or allocation whatsoever), so you should not shy away from using them
630+ out of efficiency concerns.\footnote {This is enabled via storing all
631+ arrays in `` unzipped'' form. That is, at runtime, arrays of tuples
632+ do not exist, but have always been decomposed into multiple arrays.
633+ This is a common practice for high-performance computing, usually
634+ called `` structs of arrays'' versus `` arrays of structs'' , and
635+ serves to permit memory access patterns more friendly to vectorised
636+ operations.}
637+
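As a small check of the claim that \texttt{unzip} inverts \texttt{zip}, the two can be composed. This is only a sketch in the notation used above, with illustrative input arrays:

\begin{lstlisting}
-- Zipping two arrays and unzipping the result gives back the originals.
unzip (zip [1,2,3] [true,false,true]) ==
  ([1,2,3], [true,false,true])
\end{lstlisting}
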
638+ \subsection {Map }
639+
575640The simplest SOAC is probably \texttt {map }. It takes two arguments: a
576641function and an array. The function argument can be a function name,
577642or an anonymous function using \texttt {fn } syntax. The function is
@@ -604,16 +669,42 @@ \section{Array Operations}
604669map (2-) [1,2,3] == [1,0,-1]
605670\end {lstlisting }
606671
607- While \texttt {map } accepts only a single array argument, there is a
608- variation called \texttt {zipWith }, that takes any nonzero number of
609- array arguments, and requires a function with the same number of
610- parameters. For example, we can perform an element-wise sum of two
611- arrays:
672+ In contrast to other languages, the \texttt {map } in Futhark takes any
673+ nonzero number of array arguments, and requires a function with the
674+ same number of parameters. For example, we can perform an
675+ element-wise sum of two arrays:
676+
677+ \ begin{lstlisting}
678+ map (+) [1,2,3] [4,5,6] == [5,7,9]
679+ \end {lstlisting }
680+
681+ Be careful when writing \texttt {map } expressions where the function
682+ returns an array. Futhark requires regular arrays, so a map with
683+ \texttt {iota } is unlikely to go well:
612684
613685\ begin{lstlisting}
614- zipWith (+) [1,2,3] [4,5,6] == [5,7,9]
686+ map (fn n => iota n) ns
615687\end {lstlisting }
616688
689+ Unless the array \texttt {ns } consisted of identical values, the
690+ program would fail at runtime.
691+
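To make the regularity requirement concrete, compare an input where every row gets the same size with one where the sizes differ. The input arrays are illustrative and not taken from the text:

\begin{lstlisting}
-- Regular: every inner array has size 2.
map (fn n => iota n) [2,2,2] == [[0,1],[0,1],[0,1]]

-- Irregular: the rows would have sizes 1, 2 and 3,
-- so this fails at runtime.
map (fn n => iota n) [1,2,3]
\end{lstlisting}
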
692+ We can use \texttt {map } and \texttt {iota } to duplicate many other
693+ language constructs. For example, if we have two arrays
694+ \texttt {xs:[n]int } and \texttt {ys:[m]int }---that is, two integer
695+ arrays of sizes \texttt {n } and \texttt {m }---we can concatenate them
696+ using:
697+
698+ \lstinputlisting [firstline=2]{src/concat_with_map.fut}
699+
700+ However, it is not a good idea to write code like this, as it hinders
701+ the compiler from using high-level properties to do optimisation.
702+ Using \texttt {map }s over \texttt {iota }s with explicit indexing is
703+ usually only necessary when solving complicated irregular problems
704+ that cannot be represented directly.
705+
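For readers without the listing at hand, a concatenation written in this style might look roughly as follows. The function name \texttt{concat\_with\_map} and the exact formulation are assumptions made for illustration, and need not match the contents of the included file:

\begin{lstlisting}
-- Hypothetical sketch: build the result by mapping over the
-- index space of the concatenated array.
fun concat_with_map (xs: [n]int) (ys: [m]int): []int =
  map (fn i => if i < n then xs[i] else ys[i-n])
      (iota (n+m))
\end{lstlisting}

Indices below \texttt{n} read from \texttt{xs}, and the remaining indices are shifted by \texttt{n} to read from \texttt{ys}.
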
706+ \subsection {Scan and Reduce }
707+
617708While \texttt {map } is an array transformer, the \texttt {reduce } SOAC
618709is an array aggregator: it uses some function of type \texttt {t -> t
619710 -> t } to combine the elements of an array of type \texttt {[]t } to a
@@ -646,7 +737,7 @@ \section{Array Operations}
646737
647738\ begin{lstlisting}
648739fun dotProd (xs: []int) (ys: []int): int =
649- reduce (+) 0 (zipWith (*) xs ys)
740+ reduce (+) 0 (map (*) xs ys)
650741\end {lstlisting }
651742
652743A close cousin of \texttt {reduce } is \texttt {scan }, often called
@@ -679,6 +770,37 @@ \section{Array Operations}
679770Several examples are discussed in
680771Chapter~\ref {chap:parallel-algorithms }.
681772
773+ \subsection {Filtering }
774+
775+ We have seen \texttt {map }, which permits us to change all the elements
776+ of an array. We have seen \texttt {reduce }, which lets us collapse all
777+ the elements of an array. But we still need something that lets us
778+ remove some, but not all, of the elements of an array. This SOAC is
779+ \texttt {filter }, which behaves much like a filter in any other
780+ functional language:
781+
782+ \ begin{lstlisting}
783+ filter (<3) [1,5,2,3,4] == [1,2]
784+ \end {lstlisting }
785+
786+ The use of \texttt {filter } is mostly straightforward, but there are
787+ some patterns that may appear subtle at first glance. For example,
788+ how do we find the \textit {indices } of all nonzero entries in an array
789+ of integers? Finding the values is simple enough:
790+
791+ \ begin{lstlisting}
792+ filter (fn x => x != 0) [0,5,2,0,1] ==
793+ [5,2,1]
794+ \end {lstlisting }
795+
796+ But what are the corresponding indices? We can solve this using a
797+ combination of \texttt {zip }, \texttt {filter }, and \texttt {unzip }:
798+
799+ \lstinputlisting [firstline=7]{src/indices_of_nonzero.fut}
800+
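The combination can be traced by hand on the array from the previous example. The intermediate steps below are a sketch of the idea, and the included listing may structure them differently:

\begin{lstlisting}
-- Pair every element with its index.
zip [0,5,2,0,1] (iota 5) ==
  [(0,0),(5,1),(2,2),(0,3),(1,4)]

-- Keep the pairs whose value is nonzero.
filter (fn (x,_) => x != 0)
       [(0,0),(5,1),(2,2),(0,3),(1,4)] ==
  [(5,1),(2,2),(1,4)]

-- Split the pairs; the second array holds the indices.
unzip [(5,1),(2,2),(1,4)] == ([5,2,1], [1,2,4])
\end{lstlisting}
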
801+ Be aware that \texttt {filter } is a somewhat expensive SOAC,
802+ corresponding roughly to a \texttt {scan } plus a \texttt {map }.
803+
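One way to see where the \texttt{scan} comes from is to compute, for each element, the position it would occupy in the result. The following is only a sketch of the idea applied to the earlier example, not a description of how the compiler implements \texttt{filter}. It assumes that \texttt{scan} computes an inclusive prefix sum:

\begin{lstlisting}
-- Flag the elements that satisfy the predicate.
map (fn x => if x < 3 then 1 else 0) [1,5,2,3,4] ==
  [1,0,1,0,0]

-- A prefix sum over the flags.
scan (+) 0 [1,0,1,0,0] == [1,1,2,2,2]
\end{lstlisting}

Each element whose flag is 1 belongs at its scanned value minus one in the output (here \texttt{1} at index 0 and \texttt{2} at index 1), and the last scanned value gives the size of the output.
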
682804\section {Sequential Loops }
683805\label {sec:sequential-loops }
684806
@@ -1026,7 +1148,7 @@ \section{Benchmarking}
10261148
10271149Consider an implementation of dot product:
10281150
1029- \inplisting {src/dotprod.fut}
1151+ \lstinputlisting [firstline=5] {src/dotprod.fut}
10301152
10311153We previously mentioned that, for small data sets, sequential
10321154execution is likely to be much faster than parallel execution. But
@@ -1348,7 +1470,7 @@ \section{Futhark---the Language}
13481470 | partition | rearrange | replicate | reshape
13491471 | rotate | shape | split | transpose | unzip | write | zip
13501472 $\id{soac}$ ::= map | reduce | reduceComm | scan | filter
1351- | partition | zipWith
1473+ | partition
13521474\end {lstlisting }
13531475
13541476In the grammar for the Futhark language below, we have elided both the
@@ -1411,10 +1533,9 @@ \section{Futhark Type System}
14111533 \begin {eqnarray* }
14121534\id {soac} & : & \mathrm {TypeOf}(\id {soac}) \\
14131535 \fop {filter} & : & \forall \alpha . (\alpha \rarr \mathtt {bool}) \rarr []\alpha \rarr []\alpha \\
1414- \fop {map} & : & \forall \alpha\ beta . (\alpha \rarr \beta ) \rarr []\alpha \rarr []\beta \\
1536+ \fop {map} & : & \forall \alpha _1 \cdots\alpha _n \beta . (\alpha _1 \rarr\cdots\rarr\alpha _n \rarr \beta ) \rarr []\alpha _1 \rarr\cdots\rarr [] \alpha _n \rarr []\beta \\
14151537 \fop {reduce} & : & \forall \alpha . (\alpha \rarr \alpha \rarr \alpha ) \rarr \alpha \rarr []\alpha \rarr \alpha \\
14161538 \fop {scan} & : & \forall \alpha . (\alpha \rarr \alpha \rarr \alpha ) \rarr \alpha \rarr []\alpha \rarr []\alpha \\
1417- \fop {zipWith} & : & \forall \alpha _1\cdots\alpha _n\beta . (\alpha _1\rarr\cdots\rarr\alpha _n \rarr \beta ) \rarr []\alpha _1 \rarr\cdots\rarr []\alpha _n \rarr []\beta
14181539 \end {eqnarray* }
14191540 \caption {Type schemes for Futhark's second-order array combinators (SOACs). The relation $ \mathrm {TypeOf}(\id {soac}) = \sigma $ .}
14201541 \label {fig:soactypeschemes }
@@ -1642,10 +1763,7 @@ \section{Futhark Evaluation Semantics}
16421763 \Eval {\kw {(}e_1,\cdots ,e_n\kw {)}} & = & \kw {(}\Eval {e_1},\cdots ,\Eval {e_n}\kw {)} \\
16431764 \Eval {e_1~\id {binop}_\tau ~e_2} & = & \sem {\id {binop}_\tau }~\Eval {e_1}~\Eval {e_2} \\
16441765 \Eval {\id {op}_\tau ~e_1\cdots e_n} & = & \sem {\id {op}_\tau }~\Eval {e_1}~\cdots ~\Eval {e_n} \\
1645- \Eval {\fop {map}~F~e} & = & \Eval {\kw {[}e'[v_1/x],\cdots ,e'[v_n/x]\kw {]}} \\
1646- & & ~~~\mathrm {where}~\lambda x . e' = \extractF {F} \\
1647- & & ~~~~~\mathrm {and}~ \kw {[}v_1,\cdots ,v_n\kw {]} = \Eval {e} \\
1648- \Eval {\fop {zipWith}~F~e_1\cdots e_m} & = & \Eval {\kw {[}e'[v_1^1/x_1\cdots v_1^m/x_m],\cdots ,e'[v_n^1/x_n\cdots v_n^m/x_m]\kw {]}} \\
1766+ \Eval {\fop {map}~F~e_1\cdots e_m} & = & \Eval {\kw {[}e'[v_1^1/x_1\cdots v_1^m/x_m],\cdots ,e'[v_n^1/x_1\cdots v_n^m/x_m]\kw {]}} \\
16491767 & & ~~~\mathrm {where}~\lambda x_1\cdots x_m . e' = \extractF {F} \\
16501768 & & ~~~~~\mathrm {and}~ \kw {[}v_1^i,\cdots ,v_n^i\kw {]} = \Eval {e_i} ~~~ i=[1..m]
16511769\end {eqnarray* }
@@ -1677,14 +1795,15 @@ \section{Work and Span}
16771795operations done by the big-step evaluation semantics, and the
16781796\emph {span } of the program execution, in terms of the maximum depth of
16791797the computation, assuming an infinite amount of parallelism in the
1680- SOAC computations. The functions for work and span, denoted by $ W :
1681- \mathrm {Exp} \rightarrow \N $ and $ S : \mathrm {Exp} \rightarrow \N $ are
1682- given in Figure~\ref {fig:work } and Figure~\ref {fig:span },
1683- respectively. The functions are defined independently, although they
1684- make use of the evaluation function $ \Eval {\cdot }$ . We have given the
1685- definitions for the essential SOAC functions, namely \fop {map} and
1686- \fop {reduce}. The definitions for the remaining SOACs, such as
1687- \fop {zipWith}, follow the same lines as the definitions for \fop {map} and \fop {reduce}.
1798+ SOAC computations. The functions for work and span, denoted by
1799+ $ W : \mathrm {Exp} \rightarrow \N $ and
1800+ $ S : \mathrm {Exp} \rightarrow \N $ are given in Figure~\ref {fig:work }
1801+ and Figure~\ref {fig:span }, respectively. The functions are defined
1802+ independently, although they make use of the evaluation function
1803+ $ \Eval {\cdot }$ . We have given the definitions for the essential SOAC
1804+ functions, namely \fop {map} and \fop {reduce}. The definitions for the
1805+ remaining SOACs follow the same lines as the definitions for \fop {map}
1806+ and \fop {reduce}.
16881807
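As an informal illustration, consider the dot product \texttt{reduce (+) 0 (map (*) xs ys)} of two vectors of size $n$. The estimate below is a sketch up to constant factors. It assumes the usual cost model, in which a \fop{map} with a constant-cost function has work linear in $n$ and constant span, while a \fop{reduce} with a constant-cost operator has work linear in $n$ and span logarithmic in $n$. The precise definitions are those given in the figures.

\[
  W = O(n) + O(n) = O(n), \qquad S = O(1) + O(\log n) = O(\log n)
\]
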
16891808\begin {figure }
16901809\ begin{lstlisting} [mathescape=true]
@@ -1774,7 +1893,7 @@ \section{Reduction by Contraction}
17741893argument vector \kw {xs} with neutral elements to ensure that its size
17751894is a power of two. It then implements a sequential loop with the
17761895contraction step as its loop body, implemented by a
1896+ parallel $ \fop {map }$ over an appropriately split input vector.
1896+ parallel $ \fop {map }$ over an appropriately splitted input vector.
17781897
17791898The auxiliary function for padding the input vector is implemented by the following code:
17801899