1
1
.. _Lambda Lifting Chapter:
2
2
3
+ ..
3
4
Local Variables
4
5
.. |glift| replace:: ``g_lifted``
5
6
@@ -16,7 +17,7 @@ contains closures, rather it only references global names.
16
17
A Working Example
17
18
-----------------
18
19
19
- Consider the following program [# ]_:
20
+ Consider the following program [#f1]_:
20
21
21
22
.. code-block:: haskell
22
23
@@ -59,9 +60,10 @@ simply reference it; no closures needed!
59
60
.. note::
60
61
61
62
The fundamental tradeoff is decreased heap allocation for an increase in
62
- function parameters at each call site. This means that lambda lifting is not
63
- always a performance win. See `When to Manually Apply Lambda Lifting `_ for
64
- guidance on recognizing when your program may benefit.
63
+ function parameters at each call site. This means that lambda lifting trades
64
+ heap for stack and is not always a performance win. See `When to Manually
65
+ Lambda Lift`_ for guidance on recognizing when your program may
66
+ benefit.
65
67
66
68
67
69
How Lambda Lifting Works in GHC
@@ -125,11 +127,103 @@ syntactic changes:
125
127
#. All non-top-level variables (i.e., free variables) in the let's body become
126
128
occurrences of parameters.
127
129
128
- When to Manually Apply Lambda Lifting
129
- -------------------------------------
130
+ When to Manually Lambda Lift
131
+ ----------------------------
130
132
131
- tomorrow: update glossary, start here
133
+ GHC does a good job finding beneficial instances of lambda lifting. However, you
134
+ might want to manually lambda lift to save compile time, or to increase
135
+ the performance of your program without relying on GHC's optimizer.
132
136
137
+ Weigh three considerations when deciding whether to manually
138
+ lambda lift:
139
+
140
+ 1. Are the functions that would be lifted in hot loops?
141
+ 2. How many more parameters would be passed to these functions?
142
+ 3. Would this transformation sacrifice readability and maintainability?
143
+
144
+ Let's take these in order. First, lambda lifting trades heap (the let bindings
145
+ that it removes) for stack (the increased function parameters). Thus it is not
146
+ always a performance win and in some cases can be a performance loss. The losses
147
+ occur when existing closures grow as a result of the lambda lift. This extra
148
+ allocation slows the program down and increases pressure on the garbage
149
+ collector. Consider this example from :cite:t:`selectiveLambdaLifting`:
150
+
151
+ .. code-block:: haskell
152
+
153
+ -- unlifted.
154
+
155
+ -- 'f' increases heap because it must have a closure that includes the 'x'
156
+ -- and 'y' free variables
157
+
158
+ -- 'g' increases heap because of the let and must have 'f' and 'x' in its
159
+ -- closure (not assuming other optimizations such as constant propagation)
160
+
161
+ -- 'h' increases heap because 'f' is free in 'h'
162
+
163
+ let f a b = a + x + b + y
164
+     g d = let h e = f e e
165
+           in h x
166
+ in g 1 + g 2 + g 3
167
+
168
+ Suppose we lift ``f``. Now we have:
169
+
170
+
171
+ .. code-block:: haskell
172
+
173
+ -- lifted f
174
+
175
+ f_lifted x y a b = a + x + b + y
176
+
177
+ let g d = let h e = f_lifted x y e e
178
+           in h x
179
+ in g 1 + g 2 + g 3
180
+
181
+ ``f_lifted`` is now a top-level function, so any closure that contained ``f``
182
+ before the lift saves one slot of memory. With ``f_lifted`` we additionally
183
+ save two slots of memory because ``x`` and ``y`` are now parameters. Thus
184
+ ``f_lifted`` does not need to allocate a closure with :term:`Closure
185
+ Conversion`. ``g``'s allocations do not change since ``f_lifted`` can be
186
+ directly referenced just as before and because ``x`` is still free in ``g``.
187
+ Thus ``g``'s closure will contain ``x`` and ``f_lifted`` will be inlined, same
188
+ as ``f`` in the unlifted version. ``h``'s allocations grow by one slot since
189
+ ``y`` *is now also* free in ``h``, just as ``x`` was. So it would seem that in
190
+ total lambda lifting ``f`` saves one slot of memory because two slots were saved
191
+ in ``f`` and one was gained in ``h``. However, ``g`` is a :term:`multi-shot`
192
+ lambda, so ``h`` will be allocated *for each* call of ``g``, whereas ``f`` and
193
+ ``g`` are only allocated once. Therefore the lift is a net loss.
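
As a sanity check, the two versions above can be wrapped into a small runnable
module to confirm that lifting ``f`` changes only where values live, not the
result. This is a minimal sketch: the wrappers ``unlifted`` and ``lifted``, the
``Int`` types, and the sample inputs are assumptions added for illustration and
are not part of the cited example.

.. code-block:: haskell

   module Main where

   -- Hypothetical wrapper: 'x' and 'y' become parameters so the snippet compiles.
   unlifted :: Int -> Int -> Int
   unlifted x y =
     let f a b = a + x + b + y          -- 'x' and 'y' are free in 'f'
         g d = let h e = f e e          -- 'f' is free in 'h'
               in h x
     in g 1 + g 2 + g 3

   -- The lifted 'f': its former free variables are now leading parameters.
   f_lifted :: Int -> Int -> Int -> Int -> Int
   f_lifted x y a b = a + x + b + y

   -- Hypothetical wrapper around the lifted version of the program.
   lifted :: Int -> Int -> Int
   lifted x y =
     let g d = let h e = f_lifted x y e e   -- 'x' and 'y' are now free in 'h'
               in h x
     in g 1 + g 2 + g 3

   main :: IO ()
   main = print (unlifted 1 2 == lifted 1 2)

Both wrappers compute the same value for any inputs; only the allocation
behaviour discussed above differs.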
194
+
195
+ This example illustrates how tricky good lifts can be, especially in hot
196
+ loops. In general, you should try to train your eye to determine when to
197
+ manually lift. Try to roughly determine allocations by counting the ``let``
198
+ expressions, the number of free variables, and the likely number of times a
199
+ function is called and allocated.
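
The rough counting described here can be written down as arithmetic. The sketch
below is a deliberately simplified model, not how GHC decides: it multiplies
each closure's change in size by how often that closure is allocated, using the
figures from the example (``f`` saves two slots once, ``h`` grows by one slot
per call of ``g``). The names ``Delta``, ``netSlots``, and ``liftF`` are
hypothetical.

.. code-block:: haskell

   module Main where

   -- | Change in closure size (in slots) paired with the number of times
   -- that closure is allocated. A hypothetical model for illustration.
   type Delta = (Int, Int)

   -- | Net change in allocated slots; a positive result means the lift costs memory.
   netSlots :: [Delta] -> Int
   netSlots deltas = sum [ slots * allocs | (slots, allocs) <- deltas ]

   -- | Lifting 'f' from the example: 'f' saves two slots once, while 'h'
   -- grows by one slot for every call of 'g'.
   liftF :: Int -> [Delta]
   liftF callsOfG = [(-2, 1), (1, callsOfG)]

   main :: IO ()
   main = mapM_ (print . netSlots . liftF) [1, 2, 3]

With three calls of ``g``, as in the example, the model reports a net cost of
one slot, matching the conclusion that the lift is a net loss. Under this
model the lift only pays off when ``g`` is called fewer than twice.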
200
+
201
+ .. note::
202
+
203
+ Recall that, due to closure conversion, GHC allocates one slot of memory for each
204
+ free variable. Local functions are allocated *once per call* of the enclosing
205
+ function. Top-level functions are only ever allocated once.
206
+
207
+ The next determining factor is the number of new parameters that will
208
+ be passed to the lifted function. Should this number become greater than the number
209
+ of available argument registers on the target platform then you'll incur
210
+ slowdowns in the STG machine......
211
+
212
+ tomorrow: update glossary, genapply and calling conventions. start here
213
+
214
+ Summary
215
+ -------
216
+
217
+ #. Lambda lifting is a classic optimization technique for compiling local
218
+ functions and removing free variables.
219
+ #. Lambda lifting trades heap for stack and is therefore effective for tight,
220
+ closed, hot loops where fetching from the heap would be slow.
221
+ #. GHC automatically performs lambda lifting, but does so only selectively. This
222
+ transformation runs late in the compilation pipeline, at STG, right before
223
+ code generation. GHC's lambda lifting transformation can be toggled via the
224
+ ``-fstg-lift-lams`` and ``-fno-stg-lift-lams`` flags.
225
+ #. To tell if your program has undergone lifting you can compare the Core with
226
+ the STG. Or, you may compare STG with and without lifting explicitly enabled.
133
227
134
228
Testing Exec
135
229
@@ -175,5 +269,5 @@ and we can also run from cabal target!!
175
269
:args: bench lethargy:tooManyClosures
176
270
177
271
178
- .. [# ] This program and example comes from Sebastian Graf and Simon Peyton Jones
179
- :cite:p: ` selectiveLambdaLifting `; thank you for your labor!:
272
+ .. [#f1] This program and example come from :cite:t:`selectiveLambdaLifting`;
273
+ thank you for your labor!