Commit 822e34b

merge
2 parents 37afc7a + 9dcb901 commit 822e34b

18 files changed: +291 -60 lines changed

main.tex

Lines changed: 144 additions & 25 deletions
Original file line number | Diff line number | Diff line change
@@ -284,7 +284,7 @@ \chapter{The Futhark Language}
284284
this program computes the dot product $\Sigma_{i} x_{i}\cdot{}y_{i}$
285285
of two vectors of integers:
286286

287-
\inplisting{src/dotprod.fut}
287+
\lstinputlisting[firstline=5]{src/dotprod.fut}
288288

289289
In Futhark, the notation for an array of element type $t$ is
290290
\texttt{[]$t$}. The program declares a function called \texttt{main}
@@ -572,6 +572,71 @@ \section{Array Operations}
572572
functional languages, they have implicitly parallel semantics, and
573573
some restrictions to preserve those semantics.
574574

575+
In addition to the array combinators, there are constructs for
576+
\textit{constructing} arrays. We already demonstrated literal arrays.
577+
Additionally, there is \texttt{iota}, which creates an array of a
578+
range of integers starting from zero:
579+
580+
\begin{lstlisting}
581+
iota 10 == [0,1,2,3,4,5,6,7,8,9]
582+
\end{lstlisting}
583+
584+
The name \texttt{iota} comes from APL, one of the earliest array
585+
programming languages, and is supposed to be mnemonic for creating
586+
\textit{index spaces} of arrays. Put another way, \texttt{iota n}
587+
produces an array of valid indices into an array of size \texttt{n}.
588+
589+
The \texttt{replicate} construct is used to create an array of some
590+
size, with all elements having the same given value:
591+
592+
\begin{lstlisting}
593+
replicate 3 42 == [42,42,42]
594+
\end{lstlisting}
595+
596+
We can use \texttt{concat} to combine several arrays:
597+
598+
\begin{lstlisting}
599+
concat (iota 2) ([1,2,3]) (replicate 4 1) ==
600+
[0,1,1,2,3,1,1,1,1i32]
601+
\end{lstlisting}
602+
603+
Note that the parentheses around the literal array are necessary---if
604+
they were not present, this expression would be parsed as an attempt
605+
to index the expression \texttt{iota 2} using \texttt{[1,2,3]} as the
606+
indices. This would of course result in a type error.
607+
608+
We can use \texttt{zip} to transform $n$ arrays to a single array of
609+
$n$-tuples:
610+
611+
\begin{lstlisting}
612+
zip [1,2,3] [true,false,true] [7.0,8.0,9.0] ==
613+
[(1,true,7.0),(2,false,8.0),(3,true,9.0)]
614+
\end{lstlisting}
615+
616+
Note that the input arrays may have different types. We can use
617+
\texttt{unzip} to perform the inverse transformation:
618+
619+
\begin{lstlisting}
620+
unzip [(1,true,7.0),(2,false,8.0),(3,true,9.0)] ==
621+
([1,2,3], [true,false,true], [7.0,8.0,9.0])
622+
\end{lstlisting}
623+
624+
Be aware that \texttt{zip} requires all of the input arrays to have
625+
the same size. Transforming between arrays of tuples and tuples of
626+
arrays is common in Futhark programs, as many array operations accept
627+
only one array as input. Due to a clever implementation technique,
628+
\texttt{zip} and \texttt{unzip} also have no runtime cost (no copying
629+
or allocation whatsoever), so you should not shy away from using them
630+
out of efficiency concerns.\footnote{This is enabled via storing all
631+
arrays in ``unzipped'' form. That is, at runtime, arrays of tuples
632+
do not exist, but have always been decomposed into multiple arrays.
633+
This is a common practice for high-performance computing, usually
634+
called ``structs of arrays'' versus ``arrays of structs'', and
635+
serves to permit memory access patterns more friendly to vectorised
636+
operations.}
637+
638+
\subsection{Map}
639+
575640
The simplest SOAC is probably \texttt{map}. It takes two arguments: a
576641
function and an array. The function argument can be a function name,
577642
or an anonymous function using \texttt{fn} syntax. The function is
@@ -604,16 +669,42 @@ \section{Array Operations}
604669
map (2-) [1,2,3] == [1,0,-1]
605670
\end{lstlisting}
606671

607-
While \texttt{map} accepts only a single array argument, there is a
608-
variation called \texttt{zipWith}, that takes any nonzero number of
609-
array arguments, and requires a function with the same number of
610-
parameters. For example, we can perform an element-wise sum of two
611-
arrays:
672+
In contrast to other languages, the \texttt{map} in Futhark takes any
673+
nonzero number of array arguments, and requires a function with the
674+
same number of parameters. For example, we can perform an
675+
element-wise sum of two arrays:
676+
677+
\begin{lstlisting}
678+
map (+) [1,2,3] [4,5,6] == [5,7,9]
679+
\end{lstlisting}
680+
681+
Be careful when writing \texttt{map} expressions where the function
682+
returns an array. Futhark requires regular arrays, so a map with
683+
\texttt{iota} is unlikely to go well:
612684

613685
\begin{lstlisting}
614-
zipWith (+) [1,2,3] [4,5,6] == [5,7,9]
686+
map (fn n => iota n) ns
615687
\end{lstlisting}
616688

689+
Unless the array \texttt{ns} consisted of identical values, the
690+
program would fail at runtime.
691+
692+
We can use \texttt{map} and \texttt{iota} to duplicate many other
693+
language constructs. For example, if we have two arrays
694+
\texttt{xs:[n]int} and \texttt{ys:[m]int}---that is, two integer
695+
arrays of sizes \texttt{n} and \texttt{m}---we can concatenate them
696+
using:
697+
698+
\lstinputlisting[firstline=2]{src/concat_with_map.fut}
699+
700+
However, it is not a good idea to write code like this, as it hinders
701+
the compiler from using high-level properties to do optimisation.
702+
Using \texttt{map}s over \texttt{iota}s with explicit indexing is
703+
usually only necessary when solving complicated irregular problems
704+
that cannot be represented directly.
705+
706+
\subsection{Scan and Reduce}
707+
617708
While \texttt{map} is an array transformer, the \texttt{reduce} SOAC
618709
is an array aggregator: it uses some function of type \texttt{t -> t
619710
-> t} to combine the elements of an array of type \texttt{[]t} to a
@@ -646,7 +737,7 @@ \section{Array Operations}
646737

647738
\begin{lstlisting}
648739
fun dotProd (xs: []int) (ys: []int): int =
649-
reduce (+) 0 (zipWith (*) xs ys)
740+
reduce (+) 0 (map (*) xs ys)
650741
\end{lstlisting}
651742

652743
A close cousin of \texttt{reduce} is \texttt{scan}, often called
@@ -679,6 +770,37 @@ \section{Array Operations}
679770
Several examples are discussed in
680771
Chapter~\ref{chap:parallel-algorithms}.
681772

773+
\subsection{Filtering}
774+
775+
We have seen \texttt{map}, which permits us to change all the elements
776+
of an array. We have seen \texttt{reduce}, which lets us collapse all
777+
the elements of an array. But we still need something that lets us
778+
remove some, but not all, of the elements of an array. This SOAC is
779+
\texttt{filter}, which behaves much like a filter in any other
780+
functional language:
781+
782+
\begin{lstlisting}
783+
filter (<3) [1,5,2,3,4] == [1,2]
784+
\end{lstlisting}
785+
786+
The use of \texttt{filter} is mostly straightforward, but there are
787+
some patterns that may appear subtle at first glance. For example,
788+
how do we find the \textit{indices} of all nonzero entries in an array
789+
of integers? Finding the values is simple enough:
790+
791+
\begin{lstlisting}
792+
filter (fn x => x != 0) [0,5,2,0,1] ==
793+
[5,2,1]
794+
\end{lstlisting}
795+
796+
But what are the corresponding indices? We can solve this using a
797+
combination of \texttt{zip}, \texttt{filter}, and \texttt{unzip}:
798+
799+
\lstinputlisting[firstline=7]{src/indices_of_nonzero.fut}
800+
801+
Be aware that \texttt{filter} is a somewhat expensive SOAC,
802+
corresponding roughly to a \texttt{scan} plus a \texttt{map}.
803+
682804
\section{Sequential Loops}
683805
\label{sec:sequential-loops}
684806

@@ -1026,7 +1148,7 @@ \section{Benchmarking}
10261148

10271149
Consider an implementation of dot product:
10281150

1029-
\inplisting{src/dotprod.fut}
1151+
\lstinputlisting[firstline=5]{src/dotprod.fut}
10301152

10311153
We previously mentioned that, for small data sets, sequential
10321154
execution is likely to be much faster than parallel execution. But
@@ -1348,7 +1470,7 @@ \section{Futhark---the Language}
13481470
| partition | rearrange | replicate | reshape
13491471
| rotate | shape | split | transpose | unzip | write | zip
13501472
$\id{soac}$ ::= map | reduce | reduceComm | scan | filter
1351-
| partition | zipWith
1473+
| partition
13521474
\end{lstlisting}
13531475

13541476
In the grammar for the Futhark language below, we have elided both the
@@ -1411,10 +1533,9 @@ \section{Futhark Type System}
14111533
\begin{eqnarray*}
14121534
\id{soac} & : & \mathrm{TypeOf}(\id{soac}) \\
14131535
\fop{filter} & : & \forall \alpha. (\alpha \rarr \mathtt{bool}) \rarr []\alpha \rarr []\alpha\\
1414-
\fop{map} & : & \forall \alpha\beta. (\alpha \rarr \beta) \rarr []\alpha \rarr []\beta\\
1536+
\fop{map} & : & \forall \alpha_1\cdots\alpha_n\beta. (\alpha_1\rarr\cdots\rarr\alpha_n \rarr \beta) \rarr []\alpha_1 \rarr\cdots\rarr []\alpha_n \rarr []\beta\\
14151537
\fop{reduce} & : & \forall \alpha. (\alpha \rarr \alpha \rarr \alpha) \rarr \alpha \rarr []\alpha \rarr \alpha\\
14161538
\fop{scan} & : & \forall \alpha. (\alpha \rarr \alpha \rarr \alpha) \rarr \alpha \rarr []\alpha \rarr []\alpha\\
1417-
\fop{zipWith} & : & \forall \alpha_1\cdots\alpha_n\beta. (\alpha_1\rarr\cdots\rarr\alpha_n \rarr \beta) \rarr []\alpha_1 \rarr\cdots\rarr []\alpha_n \rarr []\beta
14181539
\end{eqnarray*}
14191540
\caption{Type schemes for Futhark's second-order array combinators (SOACs). The relation $\mathrm{TypeOf}(\id{soac}) = \sigma$.}
14201541
\label{fig:soactypeschemes}
@@ -1642,10 +1763,7 @@ \section{Futhark Evaluation Semantics}
16421763
\Eval{\kw{(}e_1,\cdots,e_n\kw{)}} & = & \kw{(}\Eval{e_1},\cdots,\Eval{e_n}\kw{)} \\
16431764
\Eval{e_1~\id{binop}_\tau~e_2} & = & \sem{\id{binop}_\tau}~\Eval{e_1}~\Eval{e_2} \\
16441765
\Eval{\id{op}_\tau~e_1\cdots e_n} & = & \sem{\id{op}_\tau}~\Eval{e_1}~\cdots~\Eval{e_n} \\
1645-
\Eval{\fop{map}~F~e} & = & \Eval{\kw{[}e'[v_1/x],\cdots,e'[v_n/x]\kw{]}} \\
1646-
& & ~~~\mathrm{where}~\lambda x . e' = \extractF{F} \\
1647-
& & ~~~~~\mathrm{and}~ \kw{[}v_1,\cdots,v_n\kw{]} = \Eval{e} \\
1648-
\Eval{\fop{zipWith}~F~e_1\cdots e_m} & = & \Eval{\kw{[}e'[v_1^1/x_1\cdots v_1^m/x_m],\cdots,e'[v_n^1/x_n\cdots v_n^m/x_m]\kw{]}} \\
1766+
\Eval{\fop{map}~F~e_1\cdots e_m} & = & \Eval{\kw{[}e'[v_1^1/x_1\cdots v_1^m/x_m],\cdots,e'[v_n^1/x_n\cdots v_n^m/x_m]\kw{]}} \\
16491767
& & ~~~\mathrm{where}~\lambda x_1\cdots x_m . e' = \extractF{F} \\
16501768
& & ~~~~~\mathrm{and}~ \kw{[}v_1^i,\cdots,v_n^i\kw{]} = \Eval{e_i} ~~~ i=[1..m]
16511769
\end{eqnarray*}
@@ -1677,14 +1795,15 @@ \section{Work and Span}
16771795
operations done by the big-step evaluation semantics, and the
16781796
\emph{span} of the program execution, in terms of the maximum depth of
16791797
the computation, assuming an infinite amount of parallelism in the
1680-
SOAC computations. The functions for work and span, denoted by $W :
1681-
\mathrm{Exp} \rightarrow \N$ and $S : \mathrm{Exp} \rightarrow \N$ are
1682-
given in Figure~\ref{fig:work} and Figure~\ref{fig:span},
1683-
respectively. The functions are defined independently, although they
1684-
make use of the evaluation function $\Eval{\cdot}$. We have given the
1685-
definitions for the essential SOAC functions, namely \fop{map} and
1686-
\fop{reduce}. The definitions for the remaining SOACs, such as
1687-
\fop{zipWith}, follow the same lines as the definitions for \fop{map} and \fop{reduce}.
1798+
SOAC computations. The functions for work and span, denoted by
1799+
$W : \mathrm{Exp} \rightarrow \N$ and
1800+
$S : \mathrm{Exp} \rightarrow \N$ are given in Figure~\ref{fig:work}
1801+
and Figure~\ref{fig:span}, respectively. The functions are defined
1802+
independently, although they make use of the evaluation function
1803+
$\Eval{\cdot}$. We have given the definitions for the essential SOAC
1804+
functions, namely \fop{map} and \fop{reduce}. The definitions for the
1805+
remaining SOACs follow the same lines as the definitions for \fop{map}
1806+
and \fop{reduce}.
16881807

16891808
\begin{figure}
16901809
\begin{lstlisting}[mathescape=true]
@@ -1774,7 +1893,7 @@ \section{Reduction by Contraction}
17741893
argument vector \kw{xs} with neutral elements to ensure that its size
17751894
is a power of two. It then implements a sequential loop with the
17761895
contraction step as its loop body, implemented by a
1777-
parallel $\fop{zipWith}$ over an appropriately split input vector.
1896+
parallel $\fop{map}$ over an appropriately split input vector.
17781897

17791898
The auxiliary function for padding the input vector is implemented by the following code:
17801899

src/.gitignore

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -3,4 +3,5 @@
33
*.exe
44
*.out
55
*-opencl
6-
*-c
6+
*-c
7+
*.bin

src/Makefile

Lines changed: 1 addition & 1 deletion
@@ -4,7 +4,7 @@ FUTHARKOPENCL ?= futhark-opencl
44
#FUTFILES=$(wildcard *.fut)
55

66
SRCFILES=radix_sort sgm_scan reduce_contract find_idx streak sgm_streak rsort_idx maxidx
7-
SRCFILES_INPUT=dotprod multable primes rsort
7+
SRCFILES_INPUT=multable primes rsort
88

99
RESFILES=$(SRCFILES:%=%.res) $(SRCFILES_INPUT:%=%_inp.res)
1010
RESOPENCLFILES=$(SRCFILES:%=%.resopencl)

src/concat_with_map.fut

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
1+
fun main (xs: [n]int) (ys: [m]int): []int =
2+
map (fn i => if i < n then xs[i] else ys[i-n])
3+
(iota (n+m))
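
Review note: the index-based concatenation added in `src/concat_with_map.fut` can be modelled in a few lines of Python to check the index arithmetic. This is only a sketch of the semantics; the helper names `iota` and `concat_with_map` are illustrative Python functions, not part of the commit, and nothing here reflects how the Futhark compiler evaluates the program.

```python
def iota(n):
    """Model of Futhark's iota: the integers 0..n-1 as a list."""
    return list(range(n))


def concat_with_map(xs, ys):
    """Concatenate xs and ys by mapping over the index space of the result."""
    n, m = len(xs), len(ys)
    # Indices below n read from xs; the rest are shifted by n and read from ys.
    return [xs[i] if i < n else ys[i - n] for i in iota(n + m)]


print(concat_with_map([1, 2], [3, 4, 5]))
```

The `i - n` shift is the only subtle point: the result index space has size `n + m`, and positions `n` and above must be rebased before indexing `ys`.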

src/dotprod.fut

Lines changed: 5 additions & 1 deletion
@@ -1,2 +1,6 @@
1+
-- ==
2+
-- input { [1,2,3] [4,5,6] }
3+
-- output { 32 }
4+
15
fun main (x: []int) (y: []int): int =
2-
reduce (+) 0 (zipWith (*) x y)
6+
reduce (+) 0 (map (*) x y)
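
Review note: the test annotation added at the top of `src/dotprod.fut` (input `[1,2,3] [4,5,6]`, output `32`) can be sanity-checked with a small Python model of the multi-array `map`. The helper `futhark_map` is a hypothetical name, and the equal-size check is an assumption carried over from what the text says about `zip`.

```python
from functools import reduce
from operator import add, mul


def futhark_map(f, *arrays):
    """Model of a map over one or more arrays of the same size."""
    sizes = {len(a) for a in arrays}
    if len(sizes) != 1:
        # Assumption: like zip, the multi-array map needs equally sized inputs.
        raise ValueError("map requires a nonzero number of equally sized arrays")
    return [f(*args) for args in zip(*arrays)]


def dotprod(x, y):
    # Mirrors: reduce (+) 0 (map (*) x y)
    return reduce(add, futhark_map(mul, x, y), 0)


print(dotprod([1, 2, 3], [4, 5, 6]))  # 1*4 + 2*5 + 3*6 = 32
```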

src/dotprod.inp

Lines changed: 0 additions & 1 deletion
This file was deleted.

src/dotprod_inp.ok

Lines changed: 0 additions & 1 deletion
This file was deleted.

src/find_idx.fut

Lines changed: 2 additions & 2 deletions
@@ -8,11 +8,11 @@ fun min (a:i32) (b:i32) : i32 = if a < b then a else b
88

99
-- Return the first index i into xs for which xs[i] == e
1010
fun find_idx_first (e:i32) (xs:[n]i32) : i32 =
11-
let es = zipWith (fn x i => if x==e then i else n) xs (iota n)
11+
let es = map (fn x i => if x==e then i else n) xs (iota n)
1212
let res = reduce min n es
1313
in if res == n then -1 else res
1414

1515
-- Return the last index i into xs for which xs[i] == e
1616
fun find_idx_last (e:i32) (xs:[n]i32) : i32 =
17-
let es = zipWith (fn x i => if x==e then i else -1) xs (iota n)
17+
let es = map (fn x i => if x==e then i else -1) xs (iota n)
1818
in reduce max (-1) es
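
Review note: the two search functions in `src/find_idx.fut` rely on sentinel values (`n` for the first match, `-1` for the last) so that `reduce` with `min` or `max` has a neutral element. A rough Python model, using the same function names for readability, shows the idea; it is a sketch of the logic only.

```python
from functools import reduce


def find_idx_first(e, xs):
    """First index i with xs[i] == e, or -1 if there is none."""
    n = len(xs)
    # Non-matching positions get the sentinel n, which is neutral for min here.
    es = [i if x == e else n for x, i in zip(xs, range(n))]
    res = reduce(min, es, n)
    return -1 if res == n else res


def find_idx_last(e, xs):
    """Last index i with xs[i] == e, or -1 if there is none."""
    n = len(xs)
    # Non-matching positions get the sentinel -1, which is neutral for max here.
    es = [i if x == e else -1 for x, i in zip(xs, range(n))]
    return reduce(max, es, -1)


print(find_idx_first(2, [1, 2, 3, 2]), find_idx_last(2, [1, 2, 3, 2]))
```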

src/indices_of_nonzero.fut

Lines changed: 11 additions & 0 deletions
@@ -0,0 +1,11 @@
1+
-- ==
2+
-- input { [0,5,2,0,1] }
3+
-- output { [1,2,4] }
4+
5+
fun main(xs: [n]int): []int = indices_of_nonzero xs
6+
7+
fun indices_of_nonzero(xs: [n]int): []int =
8+
let xs_and_is = zip xs (iota n)
9+
let xs_and_is' = filter (fn (x,_) => x != 0) xs_and_is
10+
let (_, is') = unzip xs_and_is'
11+
in is'
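
Review note: the `zip`, `filter`, `unzip` pipeline in `src/indices_of_nonzero.fut` can be traced with a short Python model to confirm the annotated test case (input `[0,5,2,0,1]`, output `[1,2,4]`). The `unzip` helper is an illustrative stand-in for the Futhark construct, restricted to pairs.

```python
def unzip(pairs):
    """Model of unzip for an array of pairs: returns a pair of arrays."""
    firsts = [a for a, _ in pairs]
    seconds = [b for _, b in pairs]
    return firsts, seconds


def indices_of_nonzero(xs):
    # Pair every element with its index, as in: zip xs (iota n)
    xs_and_is = list(zip(xs, range(len(xs))))
    # Keep only the pairs whose value is nonzero.
    kept = [(x, i) for (x, i) in xs_and_is if x != 0]
    # Discard the values and return the surviving indices.
    _, indices = unzip(kept)
    return indices


print(indices_of_nonzero([0, 5, 2, 0, 1]))  # [1, 2, 4]
```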

src/lines.fut

Lines changed: 5 additions & 5 deletions
@@ -78,19 +78,19 @@ fun drawlines_par (grid:*[h][w]i32) (lines:[n]line) :[h][w]i32 =
7878
let ys1 = map (fn i => ys1[i]) idxs
7979
let xs2 = map (fn i => xs2[i]) idxs
8080
let ys2 = map (fn i => ys2[i]) idxs
81-
let dirxs = zipWith (fn x1 x2 =>
81+
let dirxs = map (fn x1 x2 =>
8282
if x2 > x1 then 1
8383
else if x1 > x2 then -1
8484
else 0) xs1 xs2
85-
let slops = zipWith (fn x1 y1 x2 y2 =>
85+
let slops = map (fn x1 y1 x2 y2 =>
8686
if x2 == x1 then
8787
if y2 > y1 then f32(1) else f32(-1)
8888
else f32(y2-y1) / abs(f32(x2-x1))) xs1 ys1 xs2 ys2
8989
let iotas = sgmIota flags
90-
let xs = zipWith (fn x1 dirx i =>
90+
let xs = map (fn x1 dirx i =>
9191
x1+dirx*i) xs1 dirxs iotas
92-
let ys = zipWith (fn y1 slop i =>
92+
let ys = map (fn y1 slop i =>
9393
y1+i32(slop*f32(i))) ys1 slops iotas
94-
let is = zipWith (fn x y => w*y+x) xs ys
94+
let is = map (fn x y => w*y+x) xs ys
9595
let flatgrid = reshape (h*w) grid
9696
in reshape (h,w) (write is (replicate nn 1) flatgrid)
