Commit 900ee93

Attempt to define certain representations in less confusing ways.
1 parent 9349cd3 commit 900ee93

10 files changed, 54 additions and 25 deletions

src/memory-model.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -51,6 +51,7 @@ A sequence of bytes is said to represent a value of a type, if the decode operat
5151

5252
> [!NOTE]
5353
> Representation is related to, but is not the same property as, the layout of the type.
54+
> A type has a unique representation when each value is represented by exactly one byte sequence. Most primitive types have unique representations.
5455
5556
r[memory.encoding.symmetric]
5657
The result of encoding a given value of a type is a sequence of bytes that represents that value.

src/types/array.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -34,6 +34,7 @@ always bounds-checked in safe methods and operators.
3434
r[type.array.repr]
3535
An array value is represented by each element in ascending index order, placed immediately adjacent in memory.
3636

37+
3738
[_Expression_]: ../expressions.md
3839
[_Type_]: ../types.md#type-expressions
3940
[`usize`]: numeric.md#machine-dependent-integer-types
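
The `type.array.repr` rule in this file can be checked with a small sketch. The helper names `array_bytes` and `elementwise_bytes` are hypothetical and only serve the illustration; the element type `u16` is chosen because it has no padding.

```rust
/// Views a `[u16; 3]` as its raw bytes, in whatever byte order the platform uses.
fn array_bytes(a: [u16; 3]) -> [u8; 6] {
    // SAFETY: both types are exactly 6 bytes, `u16` has no padding bytes,
    // and every initialized byte is a valid `u8`.
    unsafe { std::mem::transmute::<[u16; 3], [u8; 6]>(a) }
}

/// Builds the same byte sequence by encoding each element in ascending
/// index order and placing the results immediately adjacent to each other.
fn elementwise_bytes(a: [u16; 3]) -> [u8; 6] {
    let mut out = [0u8; 6];
    for (i, elem) in a.iter().enumerate() {
        out[i * 2..i * 2 + 2].copy_from_slice(&elem.to_ne_bytes());
    }
    out
}
```

Under the quoted rule, both functions are expected to return the same bytes for any input array.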

src/types/boolean.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -21,7 +21,7 @@ r[type.bool.layout]
2121
An object with the boolean type has a [size and alignment] of 1 each.
2222

2323
r[type.bool.repr]
24-
A `bool` is represented as a single initialized byte with a value of `0x00` corresponding to `false` and a value of `0x01` corresponding to `true`. This byte does not have a pointer fragment.
24+
A `bool` is represented as a single initialized byte with a value of `0x00` corresponding to `false` and a value of `0x01` corresponding to `true`.
2525

2626
> [!NOTE]
2727
> No other representations are valid for `bool`. Undefined Behaviour occurs when any other byte is read as type `bool`.
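
As a minimal sketch of the `type.bool.repr` rule, the helpers below (hypothetical names `bool_byte` and `decode_bool`) read the byte behind a `bool` and decode a byte only when it is one of the two valid representations.

```rust
/// Returns the single byte that represents `b`.
fn bool_byte(b: bool) -> u8 {
    // SAFETY: `bool` and `u8` both have size 1, and the byte of a valid
    // `bool` is initialized, so it is also a valid `u8`.
    unsafe { std::mem::transmute::<bool, u8>(b) }
}

/// Decodes a byte as `bool`, refusing every byte other than 0x00 and 0x01.
fn decode_bool(byte: u8) -> Option<bool> {
    match byte {
        0x00 => Some(false),
        0x01 => Some(true),
        // Reading any other byte at type `bool` would be undefined behaviour,
        // so the checked decoder reports failure instead.
        _ => None,
    }
}
```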

src/types/enum.md

Lines changed: 4 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -28,14 +28,13 @@ An enum value corresponds to exactly one variant of the enum, and consists of th
2828
> [!NOTE]
2929
> An enum with no variants therefore has no values.
3030
31-
r[type.enum.value.variant-padding]
32-
A byte is a padding byte in a variant `V` if the byte is not used for computing the discriminant, and the byte would be a padding byte in a struct consisting of the fields of the variant at the same offsets.
33-
3431
r[type.enum.value.value-padding]
35-
A byte is a padding byte of an enum if it is a padding byte in each variant of the enum. A byte that is not a padding byte of an enum is a value byte.
32+
A byte is a [padding][type.union.value.padding] byte of an enum if that byte is not part of the representation of the discriminant of the enum, and in each variant it either:
33+
* Does not overlap with a field of the variant, or
34+
* Overlaps with a padding byte in a field of that variant.
3635

3736
r[type.enum.value.repr]
38-
The representation of a value of an enum type includes the representation of each field of the variant at the appropriate offsets. When encoding a value of an enum type, each byte which is a padding byte in the variant is set to uninit. In the case of a [`repr(C)`][layout.repr.c.adt] or a [primitive-repr][layout.repr.primitive.adt] enum, the discriminant of the variant is represented as though by the appropriate integer type stored at offset 0.
37+
The representation of a value of an enum type includes the representation of each field of the variant at the appropriate offsets. When encoding a value of an enum type, each byte which is not used to store a field of the variant or the discriminant is set to uninit. In the case of a [`repr(C)`][layout.repr.c.adt] or a [primitive-repr][layout.repr.primitive.adt] enum, the discriminant of the variant is represented as though by the appropriate integer type stored at offset 0.
3938

4039
> [!NOTE]
4140
> Most `repr(Rust)` enums will also store a discriminant in the representation of the enum, but the exact placement or type of the discriminant is unspecified, as is the value that represents each variant.
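
The offset-0 discriminant rule for primitive-repr enums can be sketched as follows. The enum `Shape` and the helper `tag_byte` are hypothetical and exist only for this illustration; the sketch does not apply to `repr(Rust)` enums, whose discriminant placement is unspecified.

```rust
/// A hypothetical primitive-repr enum with explicit discriminants.
#[repr(u8)]
#[allow(dead_code)]
enum Shape {
    Dot = 1,
    Line(u8) = 7,
}

/// Reads the discriminant of a `repr(u8)` enum from offset 0.
fn tag_byte(shape: &Shape) -> u8 {
    // SAFETY: for a primitive-repr enum the discriminant is stored as the
    // chosen integer type (`u8` here) at offset 0, and it is always initialized.
    unsafe { *(shape as *const Shape as *const u8) }
}
```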

src/types/numeric.md

Lines changed: 17 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -70,7 +70,8 @@ r[type.numeric.repr.integer-width]
7070
The range of values an integer type can represent depends on its signedness and its width, in bits. The width of type `uN` or `iN` is `N`. The width of type `usize` or `isize` is the value of the `target_pointer_width` property.
7171

7272
> [!NOTE]
73-
> There are exactly `1<<N` unique values of an integer type of width `N`.
73+
> There are exactly `1<<N` unique values of an integer type of width `N`.
74+
> In particular, for an unsigned type, these values are in the range `0..(1<<N)`, and for a signed type, they are in the range `-(1<<(N-1))..(1<<(N-1))`, using Rust range syntax.
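
The ranges in the note above can be written out directly. This is only a sketch: `value_count`, `unsigned_range`, and `signed_range` are hypothetical helpers, and they compute in 128-bit arithmetic, so they assume a width `n` between 1 and 127.

```rust
use std::ops::Range;

/// Number of distinct values of an integer type of width `n` bits.
fn value_count(n: u32) -> u128 {
    1u128 << n
}

/// Value range of an unsigned integer type of width `n`: `0..(1<<n)`.
fn unsigned_range(n: u32) -> Range<u128> {
    0..(1u128 << n)
}

/// Value range of a signed integer type of width `n`: `-(1<<(n-1))..(1<<(n-1))`.
fn signed_range(n: u32) -> Range<i128> {
    -(1i128 << (n - 1))..(1i128 << (n - 1))
}
```

Because Rust ranges written with `..` exclude their end, the largest value of each type is one less than the range's upper bound.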
7475
7576
r[type.numeric.repr.unsigned]
7677
A value `i` of an unsigned integer type `U` is represented by a sequence of initialized bytes, where the `m`th successive byte according to the byte order of the platform is `(i >> (m*8)) as u8`, where `m` is between `0` and the size of `U`. None of the bytes produced by encoding an unsigned integer has a pointer fragment.
@@ -82,13 +83,26 @@ A value `i` of an unsigned integer type `U` is represented by a sequence of init
8283
> [!WARN]
8384
> On `little` endian, the order of bytes used to decode an integer type is the same as the natural order of a `u8` array - that is, the `m` value corresponds with the `m` index into a same-sized `u8` array. On `big` endian, however, the order is the opposite of this order - that is, the `m` value corresponds with the `size_of::<T>() - 1 - m` index in that array.
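
The unsigned encoding rule can be sketched for `u32`. The function `encode_u32` is a hypothetical helper, and the big-endian index is taken as `size - 1 - m` so that it stays within the array for every `m` in `0..size`.

```rust
/// Encodes `i` so that the `m`th byte in platform byte order is `(i >> (m*8)) as u8`.
/// `big_endian` selects which array index the `m`th byte lands on.
fn encode_u32(i: u32, big_endian: bool) -> [u8; 4] {
    let mut out = [0u8; 4];
    for m in 0..4usize {
        let byte = (i >> (m * 8)) as u8;
        // Little endian: byte `m` sits at index `m`.
        // Big endian: the order is reversed, so byte `m` sits at index `3 - m`.
        let index = if big_endian { 3 - m } else { m };
        out[index] = byte;
    }
    out
}
```

The results can be compared against the standard library's `u32::to_le_bytes` and `u32::to_be_bytes`.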
8485
86+
8587
r[type.numeric.repr.signed]
8688
A value `i` of a signed integer type with width `N` is represented the same as the corresponding value of the unsigned counterpart type which is congruent modulo `2^N`.
8789

90+
> [!NOTE]
91+
> This encoding of signed integers is known as the two's complement encoding.
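
A short sketch of the signed rule for `i16`, using the hypothetical helpers `unsigned_counterpart` and `signed_bytes`: the `as` cast yields the `u16` value that is congruent to the input modulo `2^16`, and both values encode to the same bytes.

```rust
/// The `u16` value congruent to `i` modulo 2^16.
fn unsigned_counterpart(i: i16) -> u16 {
    // An `as` cast between same-width integers keeps the bit pattern.
    i as u16
}

/// The bytes that represent `i`, in platform byte order.
fn signed_bytes(i: i16) -> [u8; 2] {
    i.to_ne_bytes()
}
```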
92+
93+
r[type.numeric.repr.float-width]
94+
Each floating-point type has a width. The type `fN` has a width of `N`.
95+
8896
r[type.numeric.repr.float]
89-
A floating-point value is represented the same as a value of the unsigned integer type with the same width given by its [IEEE 754-2019] encoding.
97+
A floating-point value is represented by the following decoding:
98+
* The byte sequence is decoded as the unsigned integer type with the same width as the floating-point type,
99+
* The resulting integer is decoded according to [IEEE 754-2019] into the format used for the type.
100+
101+
> [!NOTE]
102+
> The representation of each finite number and infinity is unique as a result of this.
103+
> The exact behaviour of encoding and decoding NaNs is not yet decided.
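
The two decoding steps can be sketched for `f32` with standard library calls. `decode_f32` and `encode_f32` are hypothetical helper names; NaN inputs are deliberately left out of the discussion, since their behaviour is described as undecided.

```rust
/// Decodes four bytes as `f32` in two steps.
fn decode_f32(bytes: [u8; 4]) -> f32 {
    // Step 1: decode the bytes as the unsigned integer of the same width.
    let bits = u32::from_ne_bytes(bytes);
    // Step 2: interpret that integer as an IEEE 754 `binary32` value.
    f32::from_bits(bits)
}

/// The inverse direction: float to integer bit pattern to bytes.
fn encode_f32(x: f32) -> [u8; 4] {
    x.to_bits().to_ne_bytes()
}
```

Note that `0.0` and `-0.0` are distinct values with distinct representations, which is consistent with each finite number having exactly one byte sequence.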
90104
91105
r[type.numeric.repr.float-format]
92-
The [IEEE 754-2019] `binary32` format is used for `f32`, and the `binary64` format is used for `f64`.
106+
The [IEEE 754-2019] `binary32` format is used for `f32`, and the `binary64` format is used for `f64`. The set of values for each floating-point type is determined by the respective format.
93107

94108
[IEEE 754-2019]: https://ieeexplore.ieee.org/document/8766229

src/types/pointer.md

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -106,11 +106,12 @@ A wide pointer or reference consists of a data pointer or reference, and a point
106106
r[type.pointer.value.wide-reference]
107107
The data pointer of a wide reference has a non-null address, well aligned for `align_of_val(self)`, and with provenance for `size_of_val(self)` bytes.
108108

109-
r[type.pointer.value.wide-representation]
109+
r[type.pointer.value.wide-repr]
110110
A wide pointer or reference is represented the same as `struct WidePointer<M>{data: *mut (), metadata: M}` where `M` is the pointee metadata type, and the `data` and `metadata` fields are the corresponding parts of the pointer.
111111

112112
> [!NOTE]
113113
> The `WidePointer` struct has no guarantees about layout, and has the default representation.
114+
> In particular, it is not guaranteed that you can write a struct type with the same layout as `WidePointer<M>`.
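
Since the layout of `WidePointer<M>` is not guaranteed, code that needs the two parts of a wide reference can work with them separately instead of assuming a struct layout. The sketch below does this for a slice reference, whose metadata is its length; `round_trip` is a hypothetical helper.

```rust
/// Splits a slice reference into its data pointer and length metadata,
/// then rebuilds an equivalent reference from the two parts.
fn round_trip(s: &[u32]) -> &[u32] {
    let data: *const u32 = s.as_ptr();
    let len: usize = s.len();
    // SAFETY: `data` and `len` come from a live slice reference, so the
    // pointer is non-null, aligned, and valid for `len` elements for the
    // lifetime of `s`.
    unsafe { std::slice::from_raw_parts(data, len) }
}
```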
114115
115116
## Pointer Provenance
116117

src/types/struct.md

Lines changed: 0 additions & 14 deletions
Original file line numberDiff line numberDiff line change
@@ -36,20 +36,6 @@ r[type.struct.value]
3636
r[type.struct.value.intro]
3737
A value of a struct type consists of a list of values for each field.
3838

39-
r[type.struct.value.value-bytes]
40-
A byte `b` in the representation of an aggregate is a value byte if there exists a field of that aggregate such that:
41-
* The field has some type `T`,
42-
* The offset of that field `o` is such that `b` falls at an offset in `o..(o+size_of::<T>())`,
43-
* Either `T` is a primitive type or the offset of `b` within the field is a value byte in the representation of `T`.
44-
45-
> [!NOTE]
46-
> A byte in a union is a value byte if it is a value byte in *any* field.
47-
48-
r[type.struct.value.padding]
49-
Every byte in an aggregate which is not a value byte is a padding byte.
50-
51-
> [!NOTE]
52-
> Enum types can also have padding bytes.
5339

5440
r[type.struct.value.encode-decode]
5541
When a value of a struct type is encoded, each field of the struct is encoded at its corresponding offset and each byte that is not within a field of the struct is set to uninit.

src/types/textual.md

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -18,6 +18,9 @@ or 0xE000 to 0x10FFFF range.
1818
r[type.text.char-repr]
1919
A value of type `char` is represented as the value of type `u32` with value equal to the code point that it represents.
2020

21+
> [!NOTE]
22+
> The representation of `char` is unique.
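
A minimal sketch of the `char` rule, with hypothetical helpers `char_bytes` and `decode_char`: a `char` encodes to the bytes of the `u32` holding its code point, and only integers that are valid code points decode back.

```rust
/// The bytes that represent `c`: those of the `u32` equal to its code point.
fn char_bytes(c: char) -> [u8; 4] {
    (c as u32).to_ne_bytes()
}

/// Decodes four bytes as `char`, rejecting surrogates and out-of-range values.
fn decode_char(bytes: [u8; 4]) -> Option<char> {
    char::from_u32(u32::from_ne_bytes(bytes))
}
```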
23+
2124
r[type.text.str-value]
2225
A value of type `str` is represented the same way as `[u8]`, a slice of
2326
8-bit unsigned bytes. However, the Rust standard library makes extra assumptions

src/types/tuple.md

Lines changed: 5 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -52,6 +52,11 @@ Tuple fields can be accessed by either a [tuple index expression] or [pattern ma
5252
r[type.tuple.repr]
5353
The values and representation of a tuple type are the same as a [struct type][type.struct.value] with the same fields and layout.
5454

55+
> [!NOTE]
56+
> In general, it is not guaranteed that any particular struct type will match the layout of a given tuple type.
57+
58+
r[type.tuple.padding]
59+
A tuple has the same [padding bytes][type.union.value.padding] as a struct type with the same fields and layout.
5560

5661
[^1]: Structural types are always equivalent if their internal types are equivalent.
5762
For a nominal version of tuples, see [tuple structs].

src/types/union.md

Lines changed: 20 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -24,8 +24,27 @@ The memory layout of a `union` is undefined by default (in particular, fields do
2424
*not* have to be at offset 0), but the `#[repr(...)]` attribute can be used to
2525
fix a layout.
2626

27+
## Union Values
28+
2729
r[type.union.value]
28-
A value of a union type consists of a sequence of bytes, corresponding to each [value byte][type.struct.value.value-bytes]. The value bytes of a union are represented exactly. Each [padding byte][type.struct.value.padding] is set to uninit when encoded.
30+
31+
r[type.union.value.value-bytes]
32+
A byte `b` in the representation of a struct or union is a value byte if there exists a field of that aggregate such that:
33+
* The field has some type `T`,
34+
* The offset of that field `o` is such that `b` falls at an offset in `o..(o+size_of::<T>())`,
35+
* Either `T` is a primitive type or the offset of `b` within the field is not a padding byte in the representation of `T`.
36+
37+
> [!NOTE]
38+
> A byte in a union is a value byte if it is a value byte in *any* field.
39+
40+
r[type.union.value.padding]
41+
Every byte in a struct or union which is not a value byte is a padding byte. [Enum types][type.enum.value.value-padding], [tuple types][type.tuple.padding], and other types may also have padding bytes.
42+
43+
> [!NOTE]
44+
> Primitive types, such as integer types, do not have padding bytes.
45+
46+
r[type.union.value.encoding]
47+
A value of a union type consists of a sequence of bytes, corresponding to each [value byte][type.union.value.value-bytes]. The value bytes of a union are represented exactly. Each [padding byte][type.union.value.padding] is set to uninit when encoded.
2948

3049
> [!NOTE]
3150
> A given value byte is guaranteed to be allowed to be uninit if it is padding in any field, recursively expanding union fields. Whether a byte of a union is allowed to be uninit in any other case is not yet decided.
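
That the value bytes of a union are represented exactly can be sketched with a hypothetical `repr(C)` union `Bits`, in which both fields cover the same four bytes and neither has padding.

```rust
/// A hypothetical union whose two fields overlap completely.
#[repr(C)]
union Bits {
    int: u32,
    bytes: [u8; 4],
}

/// Writes through the `int` field and reads the same bytes through `bytes`.
fn int_bytes(i: u32) -> [u8; 4] {
    let u = Bits { int: i };
    // SAFETY: all four bytes were initialized by writing `int`, and every
    // initialized byte pattern is a valid `[u8; 4]`.
    unsafe { u.bytes }
}
```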
