From c3aec7499d1d132624dd2cb31e8d2d3da5157bb5 Mon Sep 17 00:00:00 2001 From: Johannes Hostert Date: Sun, 17 Aug 2025 17:33:49 +0200 Subject: [PATCH 1/4] Tree Borrows MD --- wip/tree-borrows.md | 63 +++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 63 insertions(+) create mode 100644 wip/tree-borrows.md diff --git a/wip/tree-borrows.md b/wip/tree-borrows.md new file mode 100644 index 0000000..70f19c5 --- /dev/null +++ b/wip/tree-borrows.md @@ -0,0 +1,63 @@ +# Tree Borrows + +**Note:** This document is not normative nor endorsed by the UCG WG. It is maintained by @RalfJung and @JoJoDeveloping to reflect what is currently implemented in Miri. + +This is not a guide! See the [Tree Borrows paper](https://plf.inf.ethz.ch/research/pldi25-tree-borrows.html) for more information. + +Changes since publication of the paper: + +* Interior-Mutable shared references are no longer treated like raw pointers, instead they use the new `Cell` permission. This permission allows all foreign and local accesses. +* Mirroring Stacked Borrows, structs which contain an UnsafeCell now have that UnsafeCell's position tracked more finely-grained. It is no longer sufficient to just have an UnsafeCell somewhere in a struct to mark this as being interior-mutable everyhwere. + +## MiniRust + +Tree Borrows is documented in [MiniRust](https://github.com/minirust/minirust/tree/master/spec/mem/tree_borrows). MiniRust is written as literate code and should be readable without further explanation. + +Instead of yet again defining Tree Borrows in prose here, we refer to this. + + +### High-level summary +Tree Borrows maintains a tree for each allocation. Each pointer has a tag, that identifies a node in this tree. +Each node, for each offset/byte in the allocation, tracks a permission. The permission is per-byte, i.e. each byte has its own independent permission. +The permission evolves according to a state machine, which depends on the access (read/write), the relation between accessed and affected node (local/foreign), the current state, and whether the current node is protected by a protector. + +There is also an "initialized" tracking which makes protectors behave different on offsets "out of bounds" of a retag, that have not yet been accessed. +These differences are not reflected in the state machines in the paper, we refer to the MiniRust implementation for the full details. + + +### Differences between MiniRust and Miri +MiniRust includes an idealized implementation of Tree Borrows. In particular, it models provenance/tags as tree addresses, which uniquely identify a node in the borrow tree. Miri however uses unique IDs, with the Tree being tracked more implicitly has relations between these IDs. This is an implementation detail and not relevant for understanding the semantics. +Besides this representation difference, Miri also includes a number of optimizations that make Tree Borrows have acceptable performance. These include +* skipping nodes based on past foreign accesses, exploiting idempotence properties in the state machine +* garbage collection of unused references, which allows shrinking trees +* skipping nodes based on the permissions found there + +## Concepts Inherited From Stacked Borrows + +### Retags +Tree Borrows has retags happen in the same place as Stacked Borrows. But note that Tree Borrows treats raw pointer retags as NOPs, i.e. it does not attempt to track these. + +### Protectors +Like Stacked Borrows, Tree Borrows has protectors. These serve to ensure that references remain live throughout a function. Like Stacked Borrows, Tree Borrows has strong and weak protectors, and it protects the same values the same way. + +### Implicit Reads and Writes +Like Stacked Borrows, Tree Borrows performs implicit accesses as part of retags. Unlike Stacked Borrows, these are always reads, even for `&mut` references. + +Also unlike Stacked Borrows, Tree Borrows has implicit accesses happen on protector end. These can be writes. See the section on "protector end semantics" in the paper for more info. + +### UnsafeCell tracking +Like Stacked Borrows, Tree Borrows tracks where there are UnsafeCells, and treats these offsets differently from other offsets. UnsafeCells are tracked in structs and tuple fields, but enums are not inspected further. + +### Accesses +Besides for the aforementioned differences in the handling of retags, what counted as a read or write in Stacked Borrows also counts as a read or write in Tree Borrows. These places are not surprising. + +## Imprecisions + +The following is a list of things that are _not_ UB in Tree Borrows. Some people want to make these things UB, so that more optimizations become possible. This is currently undecided and might just happen. In particular, all things listed here are already UB in Stacked Borrows. + +* Tree Borrows does _not_ have subobject provenance, meaning that retags do not shrink the set of offsets that a reference can be used to access. +* Tree Borrows does not initially consider `&mut` references writable, it only does so after the first write. In practice, this might mean that optimizations moving writes up above the first write are forbidden. + +## Other problems +* The interaction of protector end writes with the data race model is not fully thought out. +* Finding a good model of exposed provenance in Tree Borrows (that does not use angelic nondeterminism) is an open research question. Until then, Tree Borrows does not support `-Zmiri-permissive-provenance` From 332bc2eefb2d140da0d35f6c80c227a365d8c1d2 Mon Sep 17 00:00:00 2001 From: Johannes Hostert Date: Mon, 18 Aug 2025 22:28:36 +0200 Subject: [PATCH 2/4] refine tree-borrows.md, fixing typos etc. --- wip/tree-borrows.md | 20 +++++++++++--------- 1 file changed, 11 insertions(+), 9 deletions(-) diff --git a/wip/tree-borrows.md b/wip/tree-borrows.md index 70f19c5..61561d6 100644 --- a/wip/tree-borrows.md +++ b/wip/tree-borrows.md @@ -7,13 +7,13 @@ This is not a guide! See the [Tree Borrows paper](https://plf.inf.ethz.ch/resear Changes since publication of the paper: * Interior-Mutable shared references are no longer treated like raw pointers, instead they use the new `Cell` permission. This permission allows all foreign and local accesses. -* Mirroring Stacked Borrows, structs which contain an UnsafeCell now have that UnsafeCell's position tracked more finely-grained. It is no longer sufficient to just have an UnsafeCell somewhere in a struct to mark this as being interior-mutable everyhwere. +* Mirroring Stacked Borrows, structs which contain an UnsafeCell now have that UnsafeCell's position tracked more finely-grained. It is no longer sufficient to just have an UnsafeCell somewhere in a struct to mark this as being interior-mutable everywhere. ## MiniRust -Tree Borrows is documented in [MiniRust](https://github.com/minirust/minirust/tree/master/spec/mem/tree_borrows). MiniRust is written as literate code and should be readable without further explanation. +Tree Borrows is fully documented in [MiniRust](https://github.com/minirust/minirust/tree/master/spec/mem/tree_borrows). MiniRust is written as literate code and should be readable without further explanation. The MiniRust version of Tree Borrows is the authoritative version, and it will be updated to reflect future changes. MiniRust defines all of Tree Borrows, including the more obscure features. -Instead of yet again defining Tree Borrows in prose here, we refer to this. +Instead of yet again defining Tree Borrows in prose here, we refer to MiniRust. The information below is not normative and only a summary of what is already explained in MiniRust. ### High-level summary @@ -26,11 +26,13 @@ These differences are not reflected in the state machines in the paper, we refer ### Differences between MiniRust and Miri -MiniRust includes an idealized implementation of Tree Borrows. In particular, it models provenance/tags as tree addresses, which uniquely identify a node in the borrow tree. Miri however uses unique IDs, with the Tree being tracked more implicitly has relations between these IDs. This is an implementation detail and not relevant for understanding the semantics. +MiniRust includes an idealized implementation of Tree Borrows, intended for easy readability. + In particular, it models provenance/tags as tree addresses, which uniquely identify a node in the borrow tree. Miri however uses unique integer IDs, with the Tree being tracked more implicitly as maps/relations between these IDs. The precise implementation of the tree is an implementation detail and not relevant for the semantics. + Besides this representation difference, Miri also includes a number of optimizations that make Tree Borrows have acceptable performance. These include * skipping nodes based on past foreign accesses, exploiting idempotence properties in the state machine * garbage collection of unused references, which allows shrinking trees -* skipping nodes based on the permissions found there +* skipping nodes based on the permissions found therein ## Concepts Inherited From Stacked Borrows @@ -38,15 +40,15 @@ Besides this representation difference, Miri also includes a number of optimizat Tree Borrows has retags happen in the same place as Stacked Borrows. But note that Tree Borrows treats raw pointer retags as NOPs, i.e. it does not attempt to track these. ### Protectors -Like Stacked Borrows, Tree Borrows has protectors. These serve to ensure that references remain live throughout a function. Like Stacked Borrows, Tree Borrows has strong and weak protectors, and it protects the same values the same way. +Like Stacked Borrows, Tree Borrows has protectors. These serve to ensure that references remain live throughout a function. Protectors are strong and weak, as in SB, and they protect the same places in the same way. ### Implicit Reads and Writes Like Stacked Borrows, Tree Borrows performs implicit accesses as part of retags. Unlike Stacked Borrows, these are always reads, even for `&mut` references. -Also unlike Stacked Borrows, Tree Borrows has implicit accesses happen on protector end. These can be writes. See the section on "protector end semantics" in the paper for more info. +A new concept in TB are protector end accesses. These can be writes. See the section on "protector end semantics" in the paper for more info. ### UnsafeCell tracking -Like Stacked Borrows, Tree Borrows tracks where there are UnsafeCells, and treats these offsets differently from other offsets. UnsafeCells are tracked in structs and tuple fields, but enums are not inspected further. +Like Stacked Borrows, Tree Borrows tracks where there are UnsafeCells, and treats these bytes differently from other bytes. UnsafeCells are tracked in structs and tuple fields, but enums are not inspected further. ### Accesses Besides for the aforementioned differences in the handling of retags, what counted as a read or write in Stacked Borrows also counts as a read or write in Tree Borrows. These places are not surprising. @@ -60,4 +62,4 @@ The following is a list of things that are _not_ UB in Tree Borrows. Some people ## Other problems * The interaction of protector end writes with the data race model is not fully thought out. -* Finding a good model of exposed provenance in Tree Borrows (that does not use angelic nondeterminism) is an open research question. Until then, Tree Borrows does not support `-Zmiri-permissive-provenance` +* Finding a good model of exposed provenance in Tree Borrows (that does not use angelic nondeterminism) is an open research question. Until then, Tree Borrows does not support `-Zmiri-permissive-provenance`. From 593340c37c5a02774ee51ab08b548ffa97a77025 Mon Sep 17 00:00:00 2001 From: Johannes Hostert Date: Tue, 19 Aug 2025 14:19:36 +0200 Subject: [PATCH 3/4] Apply suggestions from code review Co-authored-by: Ralf Jung --- wip/tree-borrows.md | 15 ++++++++------- 1 file changed, 8 insertions(+), 7 deletions(-) diff --git a/wip/tree-borrows.md b/wip/tree-borrows.md index 61561d6..d15888c 100644 --- a/wip/tree-borrows.md +++ b/wip/tree-borrows.md @@ -21,15 +21,16 @@ Tree Borrows maintains a tree for each allocation. Each pointer has a tag, that Each node, for each offset/byte in the allocation, tracks a permission. The permission is per-byte, i.e. each byte has its own independent permission. The permission evolves according to a state machine, which depends on the access (read/write), the relation between accessed and affected node (local/foreign), the current state, and whether the current node is protected by a protector. -There is also an "initialized" tracking which makes protectors behave different on offsets "out of bounds" of a retag, that have not yet been accessed. +There is also an "accessed" bit in each node for each byte, tracking whether this byte has already been accessed by a pointer tagged with this node. +This is relevant for protectors, because only "accessed" nodes are being protected. These differences are not reflected in the state machines in the paper, we refer to the MiniRust implementation for the full details. ### Differences between MiniRust and Miri MiniRust includes an idealized implementation of Tree Borrows, intended for easy readability. - In particular, it models provenance/tags as tree addresses, which uniquely identify a node in the borrow tree. Miri however uses unique integer IDs, with the Tree being tracked more implicitly as maps/relations between these IDs. The precise implementation of the tree is an implementation detail and not relevant for the semantics. +In particular, it models provenance/tags as tree addresses, which uniquely identify a node in the borrow tree. Miri however uses unique integer IDs, with the Tree being tracked more implicitly as maps/relations between these IDs. The precise implementation of the tree is an implementation detail and not relevant for the semantics. -Besides this representation difference, Miri also includes a number of optimizations that make Tree Borrows have acceptable performance. These include +Besides this representation difference, Miri also includes a number of optimizations that make Tree Borrows have acceptable performance. These include: * skipping nodes based on past foreign accesses, exploiting idempotence properties in the state machine * garbage collection of unused references, which allows shrinking trees * skipping nodes based on the permissions found therein @@ -37,15 +38,15 @@ Besides this representation difference, Miri also includes a number of optimizat ## Concepts Inherited From Stacked Borrows ### Retags -Tree Borrows has retags happen in the same place as Stacked Borrows. But note that Tree Borrows treats raw pointer retags as NOPs, i.e. it does not attempt to track these. +Tree Borrows has retags happen in the same place as Stacked Borrows. But note that Tree Borrows treats raw pointer retags as NOPs, i.e. it does not distinguish a raw pointer from the references it is derived from. ### Protectors -Like Stacked Borrows, Tree Borrows has protectors. These serve to ensure that references remain live throughout a function. Protectors are strong and weak, as in SB, and they protect the same places in the same way. +Like Stacked Borrows, Tree Borrows has protectors. These serve to ensure that references remain live throughout a function. Protectors come in "strong" and "weak" forms, as in SB, and they protect the same places in the same way. ### Implicit Reads and Writes Like Stacked Borrows, Tree Borrows performs implicit accesses as part of retags. Unlike Stacked Borrows, these are always reads, even for `&mut` references. -A new concept in TB are protector end accesses. These can be writes. See the section on "protector end semantics" in the paper for more info. +A new concept in TB are implicit protector end accesses. These can be writes. See the section on "protector end semantics" in the paper for more info. ### UnsafeCell tracking Like Stacked Borrows, Tree Borrows tracks where there are UnsafeCells, and treats these bytes differently from other bytes. UnsafeCells are tracked in structs and tuple fields, but enums are not inspected further. @@ -61,5 +62,5 @@ The following is a list of things that are _not_ UB in Tree Borrows. Some people * Tree Borrows does not initially consider `&mut` references writable, it only does so after the first write. In practice, this might mean that optimizations moving writes up above the first write are forbidden. ## Other problems -* The interaction of protector end writes with the data race model is not fully thought out. +* The interaction of protector end writes with the data race model is not fully resolved. * Finding a good model of exposed provenance in Tree Borrows (that does not use angelic nondeterminism) is an open research question. Until then, Tree Borrows does not support `-Zmiri-permissive-provenance`. From f7061bb1372534c68ce3c8bd7f349a1934e80477 Mon Sep 17 00:00:00 2001 From: Johannes Hostert Date: Tue, 19 Aug 2025 14:20:29 +0200 Subject: [PATCH 4/4] line breaks --- wip/tree-borrows.md | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/wip/tree-borrows.md b/wip/tree-borrows.md index d15888c..afc86be 100644 --- a/wip/tree-borrows.md +++ b/wip/tree-borrows.md @@ -17,6 +17,7 @@ Instead of yet again defining Tree Borrows in prose here, we refer to MiniRust. ### High-level summary + Tree Borrows maintains a tree for each allocation. Each pointer has a tag, that identifies a node in this tree. Each node, for each offset/byte in the allocation, tracks a permission. The permission is per-byte, i.e. each byte has its own independent permission. The permission evolves according to a state machine, which depends on the access (read/write), the relation between accessed and affected node (local/foreign), the current state, and whether the current node is protected by a protector. @@ -27,6 +28,7 @@ These differences are not reflected in the state machines in the paper, we refer ### Differences between MiniRust and Miri + MiniRust includes an idealized implementation of Tree Borrows, intended for easy readability. In particular, it models provenance/tags as tree addresses, which uniquely identify a node in the borrow tree. Miri however uses unique integer IDs, with the Tree being tracked more implicitly as maps/relations between these IDs. The precise implementation of the tree is an implementation detail and not relevant for the semantics. @@ -38,20 +40,25 @@ Besides this representation difference, Miri also includes a number of optimizat ## Concepts Inherited From Stacked Borrows ### Retags + Tree Borrows has retags happen in the same place as Stacked Borrows. But note that Tree Borrows treats raw pointer retags as NOPs, i.e. it does not distinguish a raw pointer from the references it is derived from. ### Protectors + Like Stacked Borrows, Tree Borrows has protectors. These serve to ensure that references remain live throughout a function. Protectors come in "strong" and "weak" forms, as in SB, and they protect the same places in the same way. ### Implicit Reads and Writes + Like Stacked Borrows, Tree Borrows performs implicit accesses as part of retags. Unlike Stacked Borrows, these are always reads, even for `&mut` references. A new concept in TB are implicit protector end accesses. These can be writes. See the section on "protector end semantics" in the paper for more info. ### UnsafeCell tracking + Like Stacked Borrows, Tree Borrows tracks where there are UnsafeCells, and treats these bytes differently from other bytes. UnsafeCells are tracked in structs and tuple fields, but enums are not inspected further. ### Accesses + Besides for the aforementioned differences in the handling of retags, what counted as a read or write in Stacked Borrows also counts as a read or write in Tree Borrows. These places are not surprising. ## Imprecisions @@ -62,5 +69,6 @@ The following is a list of things that are _not_ UB in Tree Borrows. Some people * Tree Borrows does not initially consider `&mut` references writable, it only does so after the first write. In practice, this might mean that optimizations moving writes up above the first write are forbidden. ## Other problems + * The interaction of protector end writes with the data race model is not fully resolved. * Finding a good model of exposed provenance in Tree Borrows (that does not use angelic nondeterminism) is an open research question. Until then, Tree Borrows does not support `-Zmiri-permissive-provenance`.