|
1 | | -# Instantiation and abstraction |
| 1 | +# Instantiation and abstraction |
2 | 2 |
|
3 | | -Instantiation refers to substitution of bound variables for the appropriate arguments. Abstraction refers to replacement of free variables with the appropriate bound variable when replacing binders. Lean's kernel uses deBruijn indices for bound variables and unique identifiers for free variables. |
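| 3 | +Instantiation is the process of substituting the appropriate arguments for the bound variables in an expression. |
| 4 | +Abstraction is the reverse: when re-binding, it replaces the free variables in an expression with the appropriate bound variables. |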
4 | 5 |
|
5 | | -For our purposes, a free variable is a variable in an expression that refers to a binder which has been "opened", and is no longer immediately available to us, so we replace the corresponding bound variable with a free variable that has some information about the binder we're leaving behind. |
| 6 | +Lean's kernel uses: |
6 | 7 |
|
7 | | -To illustrate, let's say we have some lambda expression `(fun (x : A) => <body>)` and we're doing type inference. Type inference has to traverse into the `<body>` part of the expression, which may contain a bound variable that refers to `x`. When we traverse into the body, we can either add `x` to some stateful context of binders and take the whole stateful context into the body with us, or we can temporarily replace all of the bound variables that refer to `x` with a free variable, allowing us to traverse into the body without having to carry any additional context. |
| 8 | +* **deBruijn indices** for bound variables; |
| 9 | +* **unique identifiers** for free variables. |
8 | 10 |
|
9 | | -If we eventually come back to where we were before we opened the binder, abstraction allows us to replace all of the free variables that were once bound variables referring to `x` with new bound variables that again refer to `x`, with the correct deBruijn indices. |
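| 11 | +Here, a **free variable** is a variable in an expression that originally referred to a binder which has since been "opened" and is no longer immediately available. We therefore substitute a free variable for the corresponding bound variable, and that free variable carries some information about the binder we left behind. |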
10 | 12 |
|
11 | | -## Implementing free variable abstraction |
| 13 | +Suppose we have the following lambda expression: |
12 | 14 |
|
13 | | -For deBruijn levels, the free variables keep track of a number that says "I am a free variable representing the nth bound variable *from the top of the telescope*". |
| 15 | +```lean |
| 16 | +(fun (x : A) => <body>) |
| 17 | +``` |
| 18 | + |
| 19 | +To infer the type of this expression, we have to traverse into its `<body>`, which may contain bound variables that refer to `x`. |
| 20 | + |
| 21 | +When we enter `<body>`, we have two options: |
| 22 | + |
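| 23 | +1. Add `x` to some stateful context of binders and carry the whole context into `<body>` with us. |
| 24 | +2. Temporarily replace every bound variable that refers to `x` with a **free variable**, which lets us traverse `<body>` without carrying any additional context. |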
| 25 | + |
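| 26 | +If we later return to the point where we opened the binder, **abstraction** lets us replace those temporary free variables with bound variables again, with the correct deBruijn indices, reconstructing a properly closed expression. |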
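
The open/close round trip described above can be sketched in a few lines of Python. This is a minimal illustration rather than Lean's actual kernel code: the `BVar`, `FVar`, `App`, and `Lam` classes and the `fresh_fvar`, `instantiate`, and `abstract` helpers are hypothetical names, binder types are omitted, and the sketch assumes the argument being substituted has no loose bound variables.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class BVar:
    index: int  # deBruijn index; 0 is the nearest enclosing binder


@dataclass(frozen=True)
class FVar:
    uid: int  # unique identifier (hypothetical stand-in for a kernel's ids)


@dataclass(frozen=True)
class App:
    fn: object
    arg: object


@dataclass(frozen=True)
class Lam:
    body: object  # binder name and type omitted to keep the sketch short


_uids = count()


def fresh_fvar():
    """Create a free variable with an identifier not used before."""
    return FVar(next(_uids))


def instantiate(body, arg, depth=0):
    """Replace the bound variable that refers to the opened binder with `arg`."""
    if isinstance(body, BVar):
        return arg if body.index == depth else body
    if isinstance(body, App):
        return App(instantiate(body.fn, arg, depth),
                   instantiate(body.arg, arg, depth))
    if isinstance(body, Lam):
        # Going under a binder puts the opened binder one index further away.
        return Lam(instantiate(body.body, arg, depth + 1))
    return body  # free variables are left alone


def abstract(expr, fvar, depth=0):
    """Replace every occurrence of `fvar` with a bound variable of the right index."""
    if expr == fvar:
        return BVar(depth)
    if isinstance(expr, App):
        return App(abstract(expr.fn, fvar, depth),
                   abstract(expr.arg, fvar, depth))
    if isinstance(expr, Lam):
        return Lam(abstract(expr.body, fvar, depth + 1))
    return expr


# Body of `fun x => fun y => x y`, with the outer binder about to be opened.
body = Lam(App(BVar(1), BVar(0)))
x = fresh_fvar()
opened = instantiate(body, x)   # the reference to x is now a free variable
closed = abstract(opened, x)    # re-binding restores the original body
```

Because the free variable carries its own identifier, the traversal between `instantiate` and `abstract` needs no binder context for `x`.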
14 | 27 |
|
15 | | -This is the opposite of a deBruijn index, which is a number indicating "the nth bound variable from the bottom of the telescope". |
| 28 | +## Implementing free variable abstraction |
16 | 29 |
|
17 | | -Top and bottom here refer to visualizing the expression's telescope as a tree: |
| 30 | +When free variables are tracked with deBruijn levels, each one records the following: |
18 | 31 |
|
| 32 | +* The free variable keeps a number that says: "I am a free variable representing **the nth bound variable from the top of the telescope**". |
| 33 | + |
| 34 | +This is the opposite of a deBruijn index, which is a number indicating the nth bound variable from the **bottom** of the telescope. |
| 35 | + |
| 36 | +"Top" and "bottom" here refer to positions when the expression's telescope is visualized as a tree: |
| 37 | + |
| 38 | +``` |
| 39 | + fun |
| 40 | + / \ |
| 41 | + a fun |
| 42 | + / \ |
| 43 | + b ... |
| 44 | + \ |
| 45 | + fun |
| 46 | + / \ |
| 47 | + e bvar(0) |
19 | 48 | ``` |
20 | | - fun |
21 | | - / \ |
22 | | - a fun |
23 | | - / \ |
24 | | - b ... |
25 | | - \ |
26 | | - fun |
27 | | - / \ |
28 | | - e bvar(0) |
| 49 | + |
| 50 | +For example, take the expression: |
| 51 | + |
| 52 | +```lean |
| 53 | +fun (a b c d e) => bvar(0) |
29 | 54 | ``` |
30 | 55 |
|
31 | | -For example, with a lambda `fun (a b c d e) => bvar(0)`, the bound variable refers to `e`, by referencing "the 0th from the bottom". |
| 56 | +Here `bvar(0)` is the 0th bound variable counted from the bottom, so it refers to `e`. |
| 57 | + |
| 58 | +As another example: |
| 59 | + |
| 60 | +```lean |
| 61 | +fun (a b c d e) => fvar(4) |
| 62 | +``` |
| 63 | + |
| 64 | +Here `fvar(4)` is the 4th bound variable counted from the top (starting at 0), so it also refers to `e`, but this time by counting **down from the top**. |
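
The relationship between the two numbering schemes is simple arithmetic once the total number of binders in the telescope is known. The following Python sketch shows the conversion for the five-binder example; the function names `level_to_index` and `index_to_level` are hypothetical and chosen only for illustration.

```python
def level_to_index(level, depth):
    """Convert a deBruijn level (counted from the top) to a deBruijn index
    (counted from the bottom), given `depth` binders in the telescope."""
    if not 0 <= level < depth:
        raise ValueError("level must refer to one of the binders in the telescope")
    return depth - 1 - level


def index_to_level(index, depth):
    """Inverse conversion; the same formula works in both directions."""
    if not 0 <= index < depth:
        raise ValueError("index must refer to one of the binders in the telescope")
    return depth - 1 - index


# Telescope `fun (a b c d e) => ...` has five binders.
binders = ["a", "b", "c", "d", "e"]
depth = len(binders)

index_of_e = level_to_index(4, depth)   # e is level 4 from the top
level_of_e = index_to_level(0, depth)   # e is index 0 from the bottom
index_of_a = level_to_index(0, depth)   # a is the topmost binder
```

The conversion needs `depth`, the full length of the telescope, which is the piece of information that is only available once all the binders are known.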
| 65 | + |
| 66 | +## Why the distinction? |
| 67 | + |
| 68 | +When we create a free variable during strong reduction, we know a few things: |
| 69 | + |
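| 70 | +* We know that the free variable we are about to substitute in might get moved around by further reduction; |
| 71 | +* We know exactly **how many open binders are above us**, because we had to visit them to get here; |
| 72 | +* We know we might later need to abstract (quote) this expression to restore the opened binders, which means re-binding the free variable. |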
| 73 | + |
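| 74 | +At the moment we create the free variable, however, we do **not** know how many binders remain below us. We therefore cannot yet say how many variables from the bottom it will be once it is eventually abstracted. |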
| 75 | + |
| 76 | +Tagging free variables with deBruijn levels, that is, counting from the top, is therefore more convenient. |
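
To make the argument concrete, here is a small Python sketch of quoting an expression whose free variables carry deBruijn levels. It assumes a simplified expression type (`BVar`, `FVarLevel`, `App`, `Lam`) and a hypothetical `quote` function, none of which come from Lean's kernel. A level stays fixed however deep the variable ends up, and the index is only computed when the surrounding depth is finally known.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BVar:
    index: int  # deBruijn index, counted from the bottom


@dataclass(frozen=True)
class FVarLevel:
    level: int  # deBruijn level, counted from the top of the telescope


@dataclass(frozen=True)
class App:
    fn: object
    arg: object


@dataclass(frozen=True)
class Lam:
    body: object  # binder type omitted to keep the sketch short


def quote(expr, depth):
    """Turn level-tagged free variables back into bound variables.

    `depth` is the number of binders above `expr` once every opened
    binder has been restored.
    """
    if isinstance(expr, FVarLevel):
        return BVar(depth - 1 - expr.level)
    if isinstance(expr, App):
        return App(quote(expr.fn, depth), quote(expr.arg, depth))
    if isinstance(expr, Lam):
        return Lam(quote(expr.body, depth + 1))
    return expr  # ordinary bound variables are already correct


# Opened body of `fun a b => a (fun c => a c)`: both outer binders are open,
# so `a` appears as level 0 in two places at different depths.
a = FVarLevel(0)
opened_body = App(a, Lam(App(a, BVar(0))))
quoted = quote(opened_body, depth=2)
```

The two occurrences of `a` share one level but receive different indices (1 and 2), since the second one sits under an extra binder.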
32 | 77 |
|
33 | | -In the lambda expression `fun (a b c d e) => fvar(4)`, the free variable is a deBruijn level representing `e` again, but this time as "the 4th from the top of the telescope". |
| 78 | +## How is abstraction implemented? |
34 | 79 |
|
35 | | -Why the distinction? When we create a free variable during strong reduction, we know a couple of things: we know that the free variable we're about to sub in might get moved around by further reduction, we know how many open binders are *ABOVE* us (because we had to visit them to get here), and we know we might need to quote/abstract this expression to replace the binders, meaning we need to re-bind the free variable. However, in that moment, we do NOT know how many binders remain below us, so we cannot say how many variables from the bottom that variable might be when it's eventually abstracted/quoted. |
| 80 | +If an implementation tags free variables with unique identifiers, abstraction only needs the following: |
36 | 81 |
|
37 | | -For implementations using unique identifiers to tag free variables, this problem is solved by having the actual telescope that's being reconstructed during abstraction. As long as you have the expression and a list of the uniquely-tagged free variables, you can abstract, because the position of the free variables within the list indicates their binder position. |
| 82 | +* the expression being abstracted; |
| 83 | +* a list of the uniquely tagged free variables. |
38 | 84 |
|
| 85 | +That is enough to abstract, because **the position of each free variable in the list** indicates the binder position it corresponds to. |
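
As an illustration of the list-based approach, the Python sketch below abstracts several uniquely tagged free variables at once. All names here (`FVar`, `abstract_telescope`, `close`) are hypothetical, binder types are left out, and the list is assumed to be ordered from the outermost binder to the innermost.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BVar:
    index: int  # deBruijn index, counted from the bottom


@dataclass(frozen=True)
class FVar:
    uid: str  # unique identifier (a string here, purely for readability)


@dataclass(frozen=True)
class App:
    fn: object
    arg: object


@dataclass(frozen=True)
class Lam:
    body: object  # binder type omitted to keep the sketch short


def abstract_telescope(expr, fvars, depth=0):
    """Replace each free variable in `fvars` with the bound variable
    implied by its position in the list (outermost binder first)."""
    if isinstance(expr, FVar):
        if expr in fvars:
            # The last list entry is the innermost binder, i.e. index 0.
            return BVar(depth + len(fvars) - 1 - fvars.index(expr))
        return expr  # a free variable that belongs to some other binder
    if isinstance(expr, App):
        return App(abstract_telescope(expr.fn, fvars, depth),
                   abstract_telescope(expr.arg, fvars, depth))
    if isinstance(expr, Lam):
        return Lam(abstract_telescope(expr.body, fvars, depth + 1))
    return expr


def close(expr, fvars):
    """Abstract over `fvars` and wrap the result in one binder per variable."""
    body = abstract_telescope(expr, fvars)
    for _ in fvars:
        body = Lam(body)
    return body


a, b = FVar("a"), FVar("b")
closed = close(App(a, b), [a, b])                # fun a b => a b
nested = close(Lam(App(a, BVar(0))), [a, b])     # fun a b => fun c => a c
```

No level bookkeeping is needed in this variant: the identifier says which variable it is, and the list says where its binder sits.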