Commit 909b013

1 & 2
1 parent a1995eb commit 909b013

4 files changed, with 33 additions and 163 deletions

src/trust/adversarial_inputs.md

Lines changed: 3 additions & 19 deletions
@@ -1,29 +1,13 @@
 # 对抗性输入

-在讨论“可信度”这一更宏观的问题时,人们常会提到 Lean 在面对**对抗性输入**时的稳健性。
+在讨论“可信度”这一更宏观的问题时,人们常会提到 Lean 在面对 **对抗性输入** 时的稳健性。

 一个正确实现的类型检查器,会把它接收到的输入严格限制在 Lean 类型系统的规则之内,并尊重操作者允许使用的公理。如果操作者只准许 Lean 的三条“官方”公理(`propext``Quot.sound``Classical.choice`),那么无论在何种情况下,输入文件都不应能向类型检查器提供 Prelude 中 `False` 的证明。

-然而,一个**最小化**的类型检查器并不会主动防御那些“在逻辑上正确,却意在欺骗人工审阅者”的输入。举例来说,攻击者可能**重新定义**他们确信审稿人不会查看的深层依赖,或插入“Unicode 同形异义字符”,使得美化打印器的输出隐藏了对关键定义的微妙篡改。
+然而,一个最小化的类型检查器并不会主动防御那些“在逻辑上正确,却意在欺骗人工审阅者”的输入。举例来说,攻击者可能 **重新定义** 他们确信审稿人不会查看的深层依赖,或插入“Unicode 同形异义字符”,使得美化打印器的输出隐藏了对关键定义的微妙篡改。

 “用户以为某定理已被形式化证明,实际上却被系统的行为误导”这一风险,被称为 **Pollack 不一致性(Pollack-inconsistency)**,Freek Wiedijk 在其论文中对此进行了探讨 [^pollack]

 从原理上讲,开发者完全可以编写软件,或扩展类型检查器,以抵御这类攻击——只是这些防护并不属于“内核所需的最小功能”。然而,Lean 用户对其强大的自定义语法与宏系统的广泛使用,的确给改进此方面的防护带来了一定挑战。对此,读者可将其视为一种[未来工作的开放议题](../future_work.md#improving-pollack-consistency)

-[^pollack]: Freek Wiedijk, “Pollack-inconsistency”, *Electronic Notes in Theoretical Computer Science* 285 (2012): 85-100.
-
-
-# Adversarial inputs
-
-A topic that often accompanies the more general trust question is Lean's robustness against adversarial inputs.
-
-A correct type checker will restrict the input it receives to the rules of Lean's type system under whatever axioms the operator allows. If the operator restricts the permitted axioms to the three "official" ones (`propext`, `Quot.sound`, `Classical.choice`), an input file should not be able to offer a proof of the prelude's `False` which is accepted by the type checker under any circumstances.
-
-However, a minimal type checker will not actively protect against inputs which provide Lean declarations that are logically sound, but are designed to fool a human operator. For example, redefining deep dependencies an adversary knows will not be examined by a referee, or introducing unicode lookalikes to produce a pretty printer output that conceals modification of key definitions.
-
-The idea that "a user might think a theorem has been formally proved, while in fact he or she
-is misled about what it is that the system has actually done" is addressed by the idea of Pollack consistency and is explored in this publication[^pollack] by Freek Wiedijk.
-
-Note that there is nothing in principle preventing developers from writing software or extending a type checker to provide protection against such attacks, it's just not captured by the minimal functionality required by the kernel. However, the extent to which Lean's users have embraced its powerful custom syntax and macro systems may pose some challenges for those interested in improving the story here. Readers should consider this somewhat of an [open issue for future work](../future_work.md#improving-pollack-consistency)
-
-[^pollack]: Freek Wiedijk. Pollack-inconsistency. Electronic Notes in Theoretical Computer Science, 285:85–100, 2012
+[^pollack]: Freek Wiedijk. Pollack-inconsistency. Electronic Notes in Theoretical Computer Science, 285:85–100, 2012
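
As an illustration of the axiom restriction described in this file, Lean's `#print axioms` command reports which axioms a declaration transitively depends on. The following is a minimal sketch and is not part of the commit; the theorem names `em_example` and `id_example` are hypothetical and exist only for this example.

```lean
-- Hypothetical example: excluded middle comes from `Classical.em`,
-- so this proof relies on classical axioms.
theorem em_example (p : Prop) : p ∨ ¬p := Classical.em p

-- Reports the axioms that `em_example` transitively depends on.
#print axioms em_example

-- Hypothetical example: a proof that stays inside the core logic.
theorem id_example (p : Prop) (h : p) : p := h

-- Reports that `id_example` does not depend on any axioms.
#print axioms id_example
```

A check like this only covers the axioms used. It does not address the Pollack-consistency concerns above, such as redefined dependencies or Unicode lookalikes.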

src/trust/trust.md

Lines changed: 16 additions & 78 deletions
@@ -1,90 +1,28 @@
 # 信任

-Lean 的核心价值之一在于它能够构建数学证明,包括关于程序正确性的证明。用户经常提出的一个问题是:信任 Lean 究竟需要多大程度的信任,以及具体需要信任哪些部分。
+Lean 的核心价值之一在于它能够构建数学证明,包括关于程序正确性的证明。用户经常提出的一个问题是:我们能够多大程度地信任 Lean,以及具体需要信任哪些部分。

-这个问题的答案包含两个方面:用户需要信任哪些部分才能相信Lean中的证明,以及用户需要信任哪些部分才能相信通过编译Lean程序获得的可执行程序
+这个问题的答案包含两个方面:用户需要信任哪些部分才能相信 Lean 中的证明,以及用户需要信任哪些部分才能相信通过编译 Lean 程序获得的可执行程序

-具体来说,区别在于:证明(包括关于程序的陈述)和未编译的程序可以直接用Lean的内核语言表达,并由内核对实现进行检查。它们不需要被编译成可执行文件,因此信任仅限于检查它们的内核实现,而Lean编译器不属于可信代码库的一部分
+具体来说,区别在于:证明(包括关于程序的陈述)和未编译的程序可以直接用 Lean 的内核语言表达,并由内核对实现进行检查。它们不需要被编译成可执行文件,因此信任仅限于检查它们的内核实现,暂时不用信任 Lean 编译器

-信任已编译Lean程序的正确性需要信任Lean的编译器,而编译器与内核是分离的,不属于Lean的核心逻辑。信任Lean中_关于程序的陈述_与信任_Lean编译器生成的程序_是两回事。关于Lean程序的陈述是证明,属于仅需信任内核的范畴。而信任关于程序的证明_能推广到已编译程序的行为_则会将编译器纳入可信代码库
+信任已编译 Lean 程序的正确性需要信任 Lean 的编译器,而编译器与内核是分离的,不属于 Lean 的核心逻辑。信任 Lean 中 _关于程序的陈述_ 与信任 _Lean 编译器生成的程序_ 是两回事。关于 Lean 程序的陈述是证明,属于仅需信任内核的范畴。而信任关于程序的证明 _能推广到已编译程序的行为_ 则会将编译器纳入可信代码库

-**注意**:策略(tactics)和其他元程序(metaprograms),即使是已编译的策略,也_完全不需要_被信任;它们是非可信代码,仅用于生成供其他部分使用的内核项。命题`P`可以通过任意复杂的已编译元程序在Lean中证明,而无需将可信代码库扩展到内核之外,因为元程序必须生成用Lean内核语言表达的证明
+**注意**:策略(tactics)和其他元程序(metaprograms),即使是已编译的策略,_完全不需要_ 被信任;它们是非可信代码,仅用于生成供其他部分使用的内核项。命题 `P` 可以通过任意复杂的已编译元程序在 Lean 中证明,而无需将可信代码库扩展到内核之外,因为元程序必须生成用 Lean 内核语言表达的证明

-+ 这些陈述适用于[导出](../export_format.md)的证明。为了让更~~挑剔~~谨慎的读者满意,这确实需要在某种程度上信任其他部分,例如运行导出器和验证器的计算机操作系统、硬件等。
++ 这些陈述适用于[导出](../export_format.md)的证明。为了让更 ~~挑剔~~ 谨慎的读者满意,这确实需要在某种程度上信任其他部分,例如运行导出器和验证器的计算机操作系统、硬件等。

-+ 对于未导出的证明,用户还需要额外信任内核之外的Lean组件(如 elaborator、解析器等)。
-
-# Trust
-
-A big part of Lean's value proposition is the ability to construct mathematical proofs, including proofs about program correctness. A common question from users is how much trust, and in what exactly, is involved in trusting Lean.
-
-An answer to this question has two parts: what users need to trust in order to trust proofs in Lean, and what users need to trust in order to trust executable programs obtained by compiling a Lean program.
-
-Concretely, the distinction is that proofs (which includes statements about programs) and uncompiled programs can be expressed directly in Lean's kernel language and checked by an implementation of the kernel. They do not need to be compiled to an executable, therefore the trust is limited to whatever implementation of the kernel they're being checked with, and the Lean compiler does not become part of the trusted code base.
-
-Trusting the correctness of compiled Lean programs requires trust in Lean's compiler, which is separate from the kernel and is not part of Lean's core logic. There is a distinction between trusting _statements about programs_ in Lean, and trusting _programs produced by the Lean compiler_. Statements about Lean programs are proofs, and fall into the category that only requires trust in the kernel. Trusting that proofs about a program _extend to the behavior of a compiled program_ brings the compiler into the trusted code base.
-
-**NOTE**: Tactics and other metaprograms, even tactics that are compiled, do *not* need to be trusted _at all_; they are untrusted code which is used to produce kernel terms for use by something else. A proposition `P` can be proved in Lean using an arbitrarily complex compiled metaprogram without expanding the trusted code base beyond the kernel, because the metaprogram is required to produce a proof expressed in Lean's kernel language.
-
-+ These statements hold for proofs that are [exported](../export_format.md). To satisfy more ~~pedantic~~ vigilant readers, this does necessarily entail some degree of trust in, for example, the operating system on the computer used to run the exporter and verifier, the hardware, etc.
-
-+ For proofs that are not exported, users are additionally trusting the elements of Lean outside the kernel (the elaborator, parser, etc.).
-
-## A more itemized list
-
-A more itemized description of the trust involved in Lean 4 comes from a post by Mario Carneiro on the Lean Zulip.
-
-> In general:
->
-> 1. You trust that the lean logic is sound (author's note: this would include any kernel extensions, like those for Nat and String)
->
-> 2. If you didn't prove the program correct, you trust that the elaborator has converted your input into the lean expression denoting the program you expect.
->
-> 3. If you did prove the program correct, you trust that the proofs about the program have been checked (use external checkers to eliminate this)
->
-> 4. You trust that the hardware / firmware / OS software running all of these things didn't break or lie to you
->
-> 5. (When running the program) You trust that the hardware / firmware / OS software faithfully executes the program according to spec and there are no debuggers or magnets on the hard drive or cosmic rays messing with your output
->
-> For compiled executables:
->
-> 6. You trust that any compiler overrides (extern / implemented_by) do not violate the lean logic (i.e. the model matches the implementation)
->
-> 7. You trust the lean compiler (which lowered the lean code to C) to preserve the semantics of the program
->
-> 8. You trust clang / LLVM to convert the C program into an executable with the same semantics
-
-The first set of points applies to both proofs and compiled executables, while the second set applies specifically to compiled executable programs.
-
-## Trust for external checkers
-
-1. You're still trusting Lean's logic is sound.
-
-2. You're trusting that the developers of the external checker properly implemented the program.
-
-3. You're trusting the implementing language's compiler or interpreter. If you run multiple external checkers, you can think of them as circles in a venn diagram; you're trusting that the part where the circles intersect is free of soundness issues.
-
-4. For the Nat and String kernel extensions, you're probably trusting a bignum library and the UTF-8 string type of the implementing language.
-
-The advantages of using external checkers are:
-
-+ Users can check their results with something that is completely disjoint from the Lean ecosystem, and is not dependent on any parts of Lean's code base.
-
-+ External checkers can be written to take advantage of mature compilers or interpreters.
-
-+ For kernel extensions, users can cross-check the results of multiple bignum/string implementations.
-
-+ Using the export feature is the only way to get out of trusting the parts of Lean outside the kernel, so there's a benefit to doing this even if the export file is checked by something like [lean4lean](https://github.com/digama0/lean4lean/tree/master). Users worried about fallout from misuse of Lean's metaprogramming features are therefore encouraged to use the export feature.
++ 对于未导出的证明,用户还需要额外信任内核之外的 Lean 组件(如繁饰器(elaborator)、解析器等)。

 ## 更详细的清单

-关于Lean 4中信任问题的更详细说明来自Mario Carneiro在Lean Zulip上的帖子
+关于 Lean 4 中信任问题的更详细说明来自 Mario Carneiro 在 Lean Zulip 上的帖子

 > 一般来说:
 >
-> 1. 你需要信任Lean的逻辑是可靠的(作者注:这包括任何内核扩展,例如Nat和String的扩展
+> 1. 你需要信任Lean的逻辑是可靠的(作者注:这包括任何内核扩展,例如 Nat 和 String 的扩展
 >
-> 2. 如果你没有证明程序的正确性,你需要信任elaborator已将你的输入转换为符合预期的Lean表达式
+> 2. 如果你没有证明程序的正确性,你需要信任繁饰器已将你的输入转换为符合预期的 Lean 表达式
 >
 > 3. 如果你确实证明了程序的正确性,你需要信任关于程序的证明已被检查(可通过外部检查器消除此需求)
 >
@@ -94,11 +32,11 @@ The advantages of using external checkers are:
 >
 > 对于已编译的可执行文件:
 >
-> 6. 你需要信任任何编译器覆盖(extern / implemented_by)没有违反Lean逻辑(即模型与实现匹配)
+> 6. 你需要信任任何编译器覆盖(extern / implemented_by)没有违反 Lean 逻辑(即模型与实现匹配)
 >
-> 7. 你需要信任Lean编译器(将Lean代码降级为C代码)能保持程序的语义
+> 7. 你需要信任Lean编译器(将 Lean 代码降级为 C 代码)能保持程序的语义
 >
-> 8. 你需要信任clang/LLVM能将C程序转换为具有相同语义的可执行文件
+> 8. 你需要信任 clang/LLVM 能将 C 程序转换为具有相同语义的可执行文件

 第一组要点适用于证明和已编译的可执行文件,而第二组专门针对已编译的可执行程序。

@@ -110,14 +48,14 @@ The advantages of using external checkers are:

 3. 你需要信任实现语言的编译器或解释器。如果运行多个外部检查器,你可以将它们视为维恩图中的圆圈;你需要信任这些圆圈重叠的部分没有可靠性问题。

-4. 对于Nat和String的内核扩展,你可能需要信任一个大数库和实现语言的UTF-8字符串类型
+4. 对于 Nat 和 String 的内核扩展,你可能需要信任一个大数库和实现语言的 UTF-8 字符串类型

 使用外部检查器的优势包括:

-+ 用户可以用完全独立于Lean生态系统的工具检查结果,不依赖于Lean代码库的任何部分
++ 用户可以用完全独立于 Lean 生态系统的工具检查结果,不依赖于 Lean 代码库的任何部分

 + 外部检查器可以利用成熟的编译器或解释器实现。

 + 对于内核扩展,用户可以交叉检查多个大数/字符串实现的结果。

-+ 使用导出功能是摆脱对Lean内核之外部分信任的唯一方法,因此即使导出文件是通过[lean4lean](https://github.com/digama0/lean4lean/tree/master)等工具检查的,这样做也有好处。因此,担心滥用Lean元编程功能可能带来影响的用户被鼓励使用导出功能
++ 使用导出功能是摆脱对 Lean 内核之外部分信任的唯一方法,因此即使导出文件是通过 [lean4lean](https://github.com/digama0/lean4lean/tree/master) 等工具检查的,这样做也有好处。因此,担心滥用 Lean 元编程功能可能带来影响的用户被鼓励使用导出功能
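
Item 6 of the itemized list in trust.md (compiler overrides via `extern` / `implemented_by`) can be made concrete with a small sketch. It is not part of the commit, and the names `fastTotal` and `total` are hypothetical. The kernel checks proofs against the definition of `total` (the model), while compiled code runs `fastTotal` instead; Lean does not check that the two agree.

```lean
-- Hypothetical replacement that compiled code runs in place of `total`.
def fastTotal (xs : List Nat) : Nat :=
  xs.foldl (· + ·) 0

-- The kernel only ever sees this definition (the "model").
@[implemented_by fastTotal]
def total (xs : List Nat) : Nat :=
  xs.foldr (· + ·) 0

-- Checked by the kernel against the `foldr` model;
-- it says nothing about the behavior of `fastTotal`.
example : total [1, 2, 3] = 6 := rfl
```

Here the two definitions happen to compute the same sum. If `fastTotal` returned something else, the `example` would still be accepted, which is why the override has to be trusted for compiled executables.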
