Commit dd375c6

author
doyougnu
committed
lambda lifting chapter done
1 parent 1de2d3d commit dd375c6

3 files changed

+133
-56
lines changed

bib/book.bib

Lines changed: 36 additions & 0 deletions
@@ -1,3 +1,22 @@
1+
@inproceedings{compilingWithoutCont,
2+
author = {Maurer, Luke and Downen, Paul and Ariola, Zena M. and Peyton Jones, Simon},
3+
title = {Compiling without Continuations},
4+
year = {2017},
5+
isbn = {9781450349888},
6+
publisher = {Association for Computing Machinery},
7+
address = {New York, NY, USA},
8+
url = {https://doi.org/10.1145/3062341.3062380},
9+
doi = {10.1145/3062341.3062380},
10+
abstract = {Many fields of study in compilers give rise to the concept of a join point—a place where different execution paths come together. Join points are often treated as functions or continuations, but we believe it is time to study them in their own right. We show that adding join points to a direct-style functional intermediate language is a simple but powerful change that allows new optimizations to be performed, including a significant improvement to list fusion. Finally, we report on recent work on adding join points to the intermediate language of the Glasgow Haskell Compiler.},
11+
booktitle = {Proceedings of the 38th ACM SIGPLAN Conference on Programming Language Design and Implementation},
12+
pages = {482–494},
13+
numpages = {13},
14+
keywords = {CPS, ANF, Haskell, GHC, list fusion, intermediate languages},
15+
location = {Barcelona, Spain},
16+
series = {PLDI 2017}
17+
}
18+
19+
120
@article{jones1992implementing,
221
author = {Jones, Peyton and L, Simon and Peyton Jones, Simon},
322
title = {Implementing Lazy Functional Languages on Stock Hardware: The Spineless Tagless G-machine},
@@ -241,3 +260,20 @@ @misc{selectiveLambdaLifting
241260
year = {2019},
242261
copyright = {arXiv.org perpetual, non-exclusive license}
243262
}
263+
264+
@inproceedings{fastCurry,
265+
author = {Marlow, Simon and Peyton Jones, Simon},
266+
title = {Making a Fast Curry: Push/Enter vs. Eval/Apply for Higher-Order Languages},
267+
year = {2004},
268+
isbn = {1581139055},
269+
publisher = {Association for Computing Machinery},
270+
address = {New York, NY, USA},
271+
url = {https://doi.org/10.1145/1016850.1016856},
272+
doi = {10.1145/1016850.1016856},
273+
abstract = {Higher-order languages that encourage currying are implemented using one of two basic evaluation models: push/enter or eval/apply. Implementors use their intuition and qualitative judgements to choose one model or the other. Our goal in this paper is to provide, for the first time, a more substantial basis for this choice, based on our qualitative and quantitative experience of implementing both models in a state-of-the-art compiler for Haskell. Our conclusion is simple, and contradicts our initial intuition: compiled implementations should use eval/apply.},
274+
booktitle = {Proceedings of the Ninth ACM SIGPLAN International Conference on Functional Programming},
275+
pages = {4–15},
276+
numpages = {12},
277+
location = {Snow Bird, UT, USA},
278+
series = {ICFP '04}
279+
}

src/Optimizations/GHC_opt/lambda_lifting.rst

Lines changed: 13 additions & 54 deletions
@@ -61,9 +61,9 @@ simply reference it; no closures needed!
6161

6262
The fundamental tradeoff is decreased heap allocation for an increase in
6363
function parameters at each call site. This means that lambda lifting trades
64-
heap for stack and is not always a performance win. See `When to Manually
65-
Apply Lambda Lifting`_ for guidance on recognizing when your program may
66-
benefit.
64+
heap for stack and is not always a performance win. See :ref:`When to
65+
Manually Apply Lambda Lifting <when>` for guidance on recognizing when your
66+
program may benefit.
6767

6868

6969
How Lambda Lifting Works in GHC
@@ -89,8 +89,7 @@ details.
8989

9090
GHC does not lambda lift:
9191

92-
#. :term:`Top-level` bindings. By definition these
93-
cannot be lifted.
92+
#. :term:`Top-level bindings <Top-level binding>`. By definition these cannot be lifted.
9493
#. :term:`Thunk` and Data Constructors. Lifting either of these would destroy
9594
sharing.
9695
#. :term:`Join Point` because there is no lifting possible in a join point.
@@ -127,6 +126,8 @@ syntactic changes:
127126
#. All non-top-level variables (i.e., free variables) in the let's body become
128127
occurrences of parameters.
129128

129+
.. _when:
130+
130131
When to Manually Lambda Lift
131132
----------------------------
132133

@@ -205,11 +206,13 @@ function is called and allocated.
205206
function. Top level functions are always only allocated once.
206207

207208
The next determining factor is counting the number of new parameters that will
208-
be passed to lifted function. Should this number become greater than the number
209-
of available argument registers on the target platform then you'll incur slow
210-
downs in the STG machine......
211-
212-
tomorrow: update glossary, genapply and calling conventions. start here
209+
be passed to the lifted function. Should this number become greater than the
210+
number of available argument registers on the target platform then you'll incur
211+
slowdowns in the STG machine. These slowdowns result from extra work that the
212+
generated code must do at each call. It must pop arguments from
213+
the stack instead of just applying the function to arguments that are already
214+
loaded into registers. In a hot loop this extra manipulation can have a large
215+
impact.
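
The parameter-count tradeoff described above can be sketched by lifting a local function by hand. This is only an illustration under stated assumptions: the names ``sumScaledLocal``, ``goLifted`` and ``sumScaledLifted`` are hypothetical, and GHC's own lifting operates on STG rather than on source code, so the real transformation may differ in detail.

```haskell
-- Before lifting: `go` is a local function that mentions `k` and `m`,
-- which are free variables bound by the enclosing function.
sumScaledLocal :: Int -> Int -> [Int] -> Int
sumScaledLocal k m = go 0
  where
    go acc []       = acc
    go acc (x : xs) = go (acc + k * x + m) xs

-- After lifting by hand: `goLifted` is top-level and has no free variables.
-- The former free variables `k` and `m` are now explicit parameters, so
-- every recursive call passes four arguments instead of two.
goLifted :: Int -> Int -> Int -> [Int] -> Int
goLifted _ _ acc []       = acc
goLifted k m acc (x : xs) = goLifted k m (acc + k * x + m) xs

sumScaledLifted :: Int -> Int -> [Int] -> Int
sumScaledLifted k m = goLifted k m 0
```

Loaded into GHCi, ``sumScaledLocal 2 1 [1, 2, 3]`` and ``sumScaledLifted 2 1 [1, 2, 3]`` both evaluate to ``15``. The lifted version needs no closure for ``go``, but each additional free variable widens the call; whether that pays off depends on how many argument registers the target platform provides.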
213216

214217
Summary
215218
-------
@@ -225,49 +228,5 @@ Summary
225228
#. To tell if your program has undergone lifting you can compare the Core with
226229
the STG. Or, you may compare STG with and without lifting explicitly enabled.
227230

228-
Testing Exec
229-
230-
.. exec::
231-
:context: false
232-
:process: haskell
233-
234-
module Main where
235-
236-
main :: IO ()
237-
main = do
238-
let x = fmap (+10) [1..10]
239-
print x
240-
241-
242-
Great that worked now lets try ``ghci``
243-
244-
245-
.. exec::
246-
:context: true
247-
:process: haskell
248-
:with: ghci
249-
250-
:t "Hello"
251-
252-
and also we can load a package
253-
254-
.. exec::
255-
:context: true
256-
:process: haskell
257-
:with: ghci
258-
259-
:m + Data.List
260-
:t span
261-
262-
and we can also run from cabal target!!
263-
264-
.. exec:: code/lethargy/bench/TooManyClosures.hs
265-
:context: true
266-
:process: haskell
267-
:project_dir: code/lethargy/
268-
:with: cabal
269-
:args: bench lethargy:tooManyClosures
270-
271-
272231
.. [#f1] This program and example comes from :cite:t:`selectiveLambdaLifting`;
273232
thank you for your labor!:

src/glossary.rst

Lines changed: 84 additions & 2 deletions
@@ -5,6 +5,10 @@ Glossary
55

66
.. glossary::
77

8+
Arity
9+
10+
The arity of a function is the number of arguments the function must take
11+
to produce a result.
812

913
Boxed : Levity
1014

@@ -101,19 +105,35 @@ Glossary
101105
heavily allocating CAFs can increase memory residency. See
102106
:cite:t:`jones1992implementing` Section 10.8 for more details.
103107

104-
105108
DWARF : Format
106109

107110
DWARF symbols are a widely used and standardized data format used to
108111
provide source level debugging. For more, see `the official webpage
109-
<https://dwarfstd.org/>`_
112+
<https://dwarfstd.org/>`_.
113+
114+
Entry Code
115+
116+
The entry code for a closure on the heap is the code that will evaluate
117+
that closure. There are some nuances and exceptions: For functions the
118+
entry code applies the function to its arguments, which the entry code
119+
assumes are all present; that is, the entry code assumes all arguments are
120+
either loaded into registers or are already on the stack. Should the
121+
function be applied to too few arguments or should the function be an
122+
:term:`Unknown function` then a generic apply is used. For a :term:`PAP`,
123+
there is no entry code. PAPs can only be applied to more arguments using
124+
the generic apply functions. Lastly, :term:`Unlifted` Objects cannot be
125+
evaluated and thus have no entry code.
110126

111127
Full Laziness transformation : Optimization
112128

113129
A form of :term:`Let Floating` which moves let bindings out of lambda
114130
abstractions to avoid unnecessary allocation and computation. See
115131
:cite:t:`peytonjones1997a` Section 7.2.
116132

133+
Fusion : Optimization
134+
135+
See :ref:`What is Fusion <canonical-fusion>`.
136+
117137
Info Table : Runtime
118138

119139
Every heap allocated object in the runtime system keeps an information
@@ -123,6 +143,41 @@ Glossary
123143
<https://gitlab.haskell.org/ghc/ghc/-/wikis/commentary/rts/storage/heap-objects#info-tables>`_
124144
for more details.
125145

146+
Join Point : Optimization
147+
148+
A join point is a place where different execution paths come together or
149+
*join*. Consider this example slightly modified from
150+
:cite:t:`compilingWithoutCont`:
151+
152+
.. code-block:: haskell
153+
154+
let join1 _ = some_large_expression
155+
join2 _ = some_other_large_expr
156+
in if e1 then (if e2 then join1 () else join2 ())
157+
else (if e3 then join1 () else join2 ())
158+
159+
In this example, ``join1`` and ``join2`` are join points because the
160+
branches described by each if-expression conclude by calling them. Thus,
161+
the control flow described by the if-expressions joins at specifically
162+
``join1`` and ``join2``. Join points are an important optimization
163+
technique that GHC performs automatically to remove redundant allocations.
164+
Had we not wrapped ``some_large_expression`` and ``some_other_large_expr``
165+
in a ``let``, then these expressions would be duplicated *and* would be
166+
captured in an additionally allocated closure unnecessarily. Join points
167+
avoid these problems and are particularly relevant for Stream
168+
:term:`Fusion` performance.
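
A compilable variant of the example above, as a rough sketch: ``classify`` is a hypothetical name, and ``sum [1 .. n]`` and ``product [1 .. n]`` merely stand in for ``some_large_expression`` and ``some_other_large_expr``. Whether GHC really compiles ``join1`` and ``join2`` as join points depends on the optimization level and GHC version.

```haskell
-- Both inner if-expressions end by calling join1 or join2 in tail position,
-- so the two large expressions are written once and shared by all branches.
classify :: Bool -> Bool -> Bool -> Int -> Int
classify e1 e2 e3 n =
  let join1 _ = sum [1 .. n]      -- stands in for some_large_expression
      join2 _ = product [1 .. n]  -- stands in for some_other_large_expr
  in if e1 then (if e2 then join1 () else join2 ())
           else (if e3 then join1 () else join2 ())
```

To check what GHC actually did, one option is to inspect the Core output (for example with ``-ddump-simpl``), where join points are printed with the ``join`` keyword.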
169+
170+
Known Function
171+
172+
A known function is a function in the STG machine whose :term:`Entry Code`
173+
pointer and :term:`Arity` are statically known to GHC. This means
174+
that the function binding site is statically visible, that is, the
175+
function is a :term:`Top-level binding`, or the function is bound by an enclosing
176+
``let``. With this information the STG machine can use a faster function
177+
application procedure because the function pointer does not need to be
178+
scrutinized. See also :term:`Unknown Function`.
179+
180+
126181
Levity Polymorphism
127182

128183
A kind of polymorphism that abstracts over calling conventions which
@@ -144,6 +199,16 @@ Glossary
144199
type is a set with three values: ``True``, ``False``, and :math:`\bot`.
145200
Therefore ``Bool`` is a Lifted type.
146201

202+
PAP
203+
204+
A PAP is a partial application. PAPs are heap objects and thus a type of
205+
closure that represents a function applied to *too few* arguments. PAPs
206+
should never be entered, and are only applied using the generic apply
207+
functions in the STG machine. See the file ``rts/Apply.cmm`` in GHC or the
208+
`heap object
209+
<https://gitlab.haskell.org/ghc/ghc/-/wikis/commentary/rts/storage/heap-objects>`_
210+
wiki page for more.
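
As a source-level illustration of arity and partial application, here is a minimal sketch with hypothetical names (``addThree``, ``addToThree``, ``saturated``). It shows only the shape of the idea: the optimizer may eta-expand or inline these definitions, so a PAP object is not guaranteed to exist at runtime.

```haskell
-- `addThree` is defined with three parameters, so its arity is 3.
addThree :: Int -> Int -> Int -> Int
addThree x y z = x + y + z

-- Applying `addThree` to only two arguments is a partial application:
-- the function has received too few arguments to run its body.
addToThree :: Int -> Int
addToThree = addThree 1 2

-- Supplying the final argument saturates the call.
saturated :: Int
saturated = addToThree 3
```

Here ``saturated`` evaluates to ``6``.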
211+
147212
Pinned : Memory
148213

149214
Pinned memory is memory that is guaranteed to not be moved by GHC's garbage
@@ -162,6 +227,11 @@ Glossary
162227
provides Haskell's laziness. See :cite:t:`SpinelessTaglessGMachine`
163228
Section 3.1.2 for more details.
164229

230+
Top-level binding
231+
232+
A top-level binding is any binding that exists in the outermost or
233+
global scope of the program.
234+
165235
Unboxed : Levity
166236

167237
An UnBoxed value is a value that is represented by the value itself.
@@ -172,6 +242,18 @@ Glossary
172242
An Unlifted type is a type where :math:`\bot` *is not* an element of that
173243
type. See :term:`Levity Polymorphism` and :term:`Lifted` types for more.
174244

245+
Unknown function
246+
247+
An unknown function is a function in the STG machine whose :term:`Entry
248+
Code` pointer and :term:`Arity` are not statically known by GHC. Unknown
249+
functions require GHC to generate code that first scrutinizes the function
250+
pointer to determine its arity and then dispatches to the normal function
251+
call handling procedures. This is known as a generic apply in the STG
252+
machine and is slower (due to needing to scrutinize the function) than a
253+
:term:`Known function`. See :cite:t:`fastCurry` for more details on STG
254+
calling conventions.
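
The difference between known and unknown calls can be sketched at the source level. The names ``double``, ``applyKnown`` and ``applyUnknown`` are hypothetical, and the comments describe what GHC can see before optimization; inlining or specialization may later turn an unknown call into a known one.

```haskell
-- `double` is a top-level binding, so its binding site and arity
-- are statically visible wherever it is called by name.
double :: Int -> Int
double x = x * 2

-- A known call: the callee is named directly and is fully applied.
applyKnown :: Int -> Int
applyKnown n = double n

-- An unknown call: `f` is a lambda-bound parameter, so inside this
-- function nothing is statically known about which function it is.
applyUnknown :: (Int -> Int) -> Int -> Int
applyUnknown f n = f n
```

Both ``applyKnown 21`` and ``applyUnknown double 21`` evaluate to ``42``; only the way the call is compiled differs.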
255+
256+
175257
WHNF : Normal Forms
176258

177259
An expression is in *weak head normal form* if it has been evaluated to
