-
Notifications
You must be signed in to change notification settings - Fork 1.5k
Updating Carbon's safety strategy #5914
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 20 commits
69d8848
d2057e8
af5f2ed
833b9c7
e966542
285d1ff
8daa3a9
77333ba
8d9298e
6630fa9
0f74238
320aa65
8dff909
5078180
643d290
36a343c
a0c0928
3979665
f84b0dd
0e622eb
2ca2b76
1336eb3
7bee8ed
38c1d49
9abe3be
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -17,6 +17,8 @@ SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception | |
| - [Data races versus unsynchronized temporal safety](#data-races-versus-unsynchronized-temporal-safety) | ||
| - [Safe library ecosystem](#safe-library-ecosystem) | ||
| - [Build modes](#build-modes) | ||
| - [Constraints](#constraints) | ||
| - [Probabilistic techniques likely cannot stop attacks](#probabilistic-techniques-likely-cannot-stop-attacks) | ||
| - [References](#references) | ||
|
|
||
| <!-- tocstop --> | ||
|
|
@@ -34,6 +36,7 @@ of code safety such as avoiding undefined behavior in other forms. | |
| [code safety]: | ||
| /docs/design/safety/terminology.md#code-software-or-program-safety | ||
| [systems safety]: /docs/design/safety/terminology.md#safety | ||
| [memory safety]: /docs/design/safety/terminology.md#memory-safety | ||
|
|
||
| However, Carbon also has a goal of fine grained, smooth interop-with and | ||
| migration-from existing C++ code, and C++ is a fundamentally unsafe language. It | ||
|
|
@@ -42,19 +45,19 @@ guarantees. Our safety strategy has to address how C++ code fits into it, and | |
| provide an incremental path from where the code is at today towards increasing | ||
| levels of safety. | ||
|
|
||
| Ultimately, Carbon will both provide a [memory safe language], _and_ provide a | ||
| language that can interop with C++ and be targeted for mechanical migration from | ||
| C++. | ||
| Ultimately, Carbon will both provide a [memory-safe language], _and_ provide a language | ||
| that is a target for mechanical migration from C++ and optimizes even further for | ||
| interop with unsafe C++ with minimal friction. | ||
|
|
||
| [memory safe language]: /docs/design/safety/terminology.md#memory-safe-language | ||
| [memory-safe language]: /docs/design/safety/terminology.md#memory-safe-language | ||
|
|
||
| ## Safe and unsafe code | ||
|
|
||
| Carbon will have both _safe_ and _unsafe_ code. Safe code provides limits on the | ||
| potential behavior of the program even in the face of bugs in order to prevent | ||
| [safety bugs] from becoming [vulnerabilities]. Unsafe code is any code or operation | ||
| which may lack limits or guarantees on behavior, and as a result may result in undefined | ||
| behavior and a safety bug. | ||
| which lacks limits or guarantees on behavior, and as a consequence may have undefined | ||
| behavior or be a safety bug. | ||
|
|
||
| [safety bugs]: /docs/design/safety/terminology.md#safety-bugs | ||
| [vulnerabilities]: | ||
|
|
@@ -73,8 +76,19 @@ language: | |
| whatever syntax is used so that this keyword can be used as a common | ||
| annotation in audits and review. | ||
|
|
||
| The result is that Carbon does not have a "safe" mode and an "unsafe" mode, but | ||
| rather narrow and specific unsafe operations and constructs. | ||
| The result is that we don't model large regions or sections of Carbon code as | ||
| "unsafe" or have a parallel "unsafe" variant language. We instead focus on the | ||
| narrow and specific unsafe operations and constructs. | ||
|
|
||
| Note that when we're talking about the narrow semantics of an unsafe capability, | ||
| these are the semantics of the specific unsafe operation. For example, an unsafe | ||
| type conversion shouldn't also allow unsafe out-of-lifetime access. This is | ||
| separate from the _soundness_ implications of an unsafe operation, which may not | ||
| be as easily narrowed. | ||
|
|
||
| > **Future work**: More fully expand on the soundness principles and model for | ||
| > safe Carbon code. This is an interesting and important area of the design that | ||
| > isn't currently fleshed out in detail. | ||
|
|
||
| ## Safety modes | ||
|
|
||
|
|
@@ -86,11 +100,29 @@ keyword. This mode is designed to satisfy the requirements of a [memory | |
| safe language]. | ||
|
|
||
| _Permissive Carbon_ is a mode optimized for interop with C++ and automated | ||
| migration from C++. In this mode, some unsafe code to not require an `unsafe` | ||
| migration from C++. In this mode, some unsafe code does not require an `unsafe` | ||
| keyword: specific aspects of C++ interop or pervasive patterns that occur when | ||
| migrating from C++. However, not _all_ unsafe code will omit the keyword, the | ||
| permissive mode is designed to be minimal in the unsafe code allowed. | ||
chandlerc marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
|
||
| Modes can be configured on an individual file as part of the package | ||
| declaration, or an a function body as part of the function definition. More | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
"an" -> "on"? or "in"? |
||
| options such as regions of declarations or regions of statements can be explored | ||
| in the future based on demand in practice when working with mixed-strictness | ||
| code. More fine grained than statements is not expected given that the same core | ||
| expressivity is available at that finer granularity through explicitly marking | ||
| `unsafe` operations. | ||
|
|
||
| > **Future work**: Define the exact syntax for package declaration and function | ||
| > definition control of strictness. The syntax for enabling the permissive mode | ||
| > must include the `unsafe` keyword as it is standing in place of more granular | ||
| > use of the `unsafe` keywords and needs to still be discovered when auditing | ||
| > for safety. | ||
|
|
||
| > **Future work**: Define how to mark `import`s of permissive libraries in | ||
| > various contexts, balancing ergonomic burden against the potential for | ||
| > surprising amounts af unsafety in dependencies. | ||
|
|
||
| ## Memory safety model | ||
|
|
||
| Carbon will use a hybrid of different techniques to achieve memory safety in its | ||
|
|
@@ -123,10 +155,13 @@ how they approach both temporal and data-race safety. | |
|
|
||
| Carbon has the option of distinguishing between two similar but importantly | ||
| different classes of bugs: data races and unsynchronized temporal safety | ||
| violations. Specifically, there evidence from security teams that the second of | ||
| these has been a greater source of exploitation than first. As a consequence, | ||
| Carbon has some flexibility while still being a [memory safe language] according | ||
| to our definition: | ||
| violations. Specifically, there is no evidence from security teams that there is | ||
| any significant volume of vulnerabilities that involve a data race bug but don't | ||
| also involve a temporal memory safety violation. For example, despite both Go | ||
| and non-strict-concurrency Swift only providing temporal safety, the rate of | ||
| memory safety vulnerabilities in software written in both matches the expected | ||
| low rate for memory-safe languages. As a consequence, Carbon has some | ||
| flexibility while still being a [memory-safe language] according to our definition: | ||
|
|
||
| - Carbon might choose to _not_ prevent data race bugs that are not | ||
| _themselves_ also temporal safety bugs, even though the data race may lead | ||
|
|
@@ -137,25 +172,36 @@ to our definition: | |
| synchronization directly allows temporal safety bugs, such as use after | ||
| free. | ||
|
|
||
| However, preventing these data race bugs remains _highly valuable_ for | ||
| correctness, debugging, and achieving [fearless concurrency]. But it is one aspect | ||
| where Carbon can in theory afford a compromise based on the current security information. | ||
| Despite having this flexibility, preventing data race bugs remains _highly | ||
| valuable_ for correctness, debugging, and achieving [fearless concurrency]. If Carbon | ||
| can, it should work to prevent data races as well. | ||
|
|
||
| [fearless concurrency]: https://doc.rust-lang.org/book/ch16-00-concurrency.html | ||
|
|
||
| ## Safe library ecosystem | ||
chandlerc marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
|
||
| Carbon will need access to a memory safe library ecosystem. The industry is | ||
| currently investing a massive amount of effort to build out a sufficient | ||
| ecosystem of such software using Rust, and it is critical that Carbon does not | ||
| impede, slow down, or require duplicating that ecosystem. | ||
| Carbon will need access to memory-safe library ecosystems, both for | ||
| general-purpose, multi-platform functionality and for platform-specific | ||
| functionality. Currently, the industry is currently investing a massive amount | ||
| of effort to build out a sufficient ecosystem of general, multi-platform | ||
| software using Rust, and it is critical that Carbon does not impede, slow down, | ||
| or require duplicating that ecosystem. Similarly, if any other cross-platform | ||
| library ecosystems emerge in any viable memory-safe languages given our | ||
| performance constraints, we should work to reuse them and avoid duplication. | ||
|
|
||
| **Carbon's strategy for safe and generally reusable cross-platform libraries is | ||
| to leverage Rust libraries through interop.** This is a major motivating reason | ||
| for seamless and safe Rust interop. The Carbon project will work to avoid | ||
| creating duplication between the growing Rust library ecosystem and any future | ||
| Carbon library ecosystem. Carbon's ecosystem will be focused instead on | ||
| libraries and functionality that would either be missing or only available in | ||
| C++. | ||
|
|
||
| **Carbon's strategy for safe and generally reusable libraries is to leverage | ||
| Rust libraries through interop.** This is a major motivating reason for seamless | ||
| and safe Rust interop. The Carbon project will work to avoid creating | ||
| duplication between the growing Rust library ecosystem and any future Carbon | ||
| library ecosystem. Carbon's ecosystem will be focused instead on libraries and | ||
| functionality that would either be missing or only available in C++. | ||
| Platform-specific functionality is typically developed in that platform's native | ||
| language, whether that is Swift for Apple platforms or Kotlin for Android. | ||
| Again, the goal of Carbon should be to avoid duplicating functionality, and | ||
| instead to prioritize high quality interop with the existing platform libraries | ||
| in the relevant languages on those platforms. | ||
|
|
||
| ## Build modes | ||
|
|
||
|
|
@@ -180,8 +226,9 @@ that disables the run-time enforcement, enabling the control of any overhead inc | |
| [evidence]: https://chandlerc.blog/posts/2024/11/story-time-bounds-checking/ | ||
| [performance control]: /docs/project/goals.md#performance-critical-software | ||
|
|
||
| The debug build will work to cause bugs and any detectable undefined behavior to | ||
| have [fail-stop] behavior and even detailed diagnostics to enable better | ||
| The debug build will change the actual behavior of both safe and unsafe code | ||
| with detectable bugs or detectable undefined or erroneous behavior to have | ||
| [fail-stop] behavior and provide detailed diagnostics to enable better | ||
| debugging. This mode will at least provide similar bug [detection] capabilities | ||
| to [AddressSanitizer] and [MemorySanitizer]. | ||
|
|
||
|
|
@@ -209,9 +256,40 @@ release build for that specific purpose. For example: | |
| https://clang.llvm.org/docs/HardwareAssistedAddressSanitizerDesign.html | ||
|
|
||
| Carbon will provide a way to disable the default hardening in release builds, | ||
| but not in a supported way. It's use is expected to be strictly for benchmarking | ||
| but not in a supported way. Its use is expected to be strictly for benchmarking | ||
| purposes. | ||
|
|
||
| ## Constraints | ||
|
|
||
| ### Probabilistic techniques likely cannot stop attacks | ||
|
|
||
| It's expected that non-cryptographic probabilistic techniques that can be | ||
| applied at the language level are attackable through a variety of techniques: | ||
|
|
||
| - The attacker might be able to attack repeatedly until it gets through. | ||
| - The attacker may be able to determine when the attack would be detected and | ||
| only run the attack when it would not be. | ||
| - The attacker might be able control the test condition to make detection much | ||
| less likely or avoid detection completely. For example, if detection is | ||
| based on the last 4 bits of a memory address, an attacker may be able to | ||
| generate memory allocations, viewing the address and only attacking when | ||
| there's a collision. | ||
|
|
||
| Hardware vulnerabilities may make these attacks easier than they might otherwise | ||
| appear. Future hardware vulnerabilities are difficult to predict. In general, we | ||
| do not expect non-cryptographic probabilistic techniques to be an effective | ||
| approach to achieving safety. | ||
|
|
||
| While _cryptographic_ probabilistic techniques can, and typically do, work | ||
| carefully to not be subject to these weaknesses, they face a very different | ||
| challenge. The overhead of a cryptographically secure hash is generally | ||
| prohibitive for use in language level constructs. Further, some of the defenses | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. language level -> language-level |
||
| against hardware vulnerabilities and improvements further exacerbate these | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. delete "and improvements" |
||
| overheads. However, when these can be applied usefully such as with [PKeys], | ||
| they are robust. | ||
|
|
||
| [PKeys]: https://docs.kernel.org/core-api/protection-keys.html | ||
|
|
||
| ## References | ||
|
|
||
| - Proposal | ||
|
|
||
Uh oh!
There was an error while loading. Please reload this page.