Attempt to define certain representations in less confusing ways.

chorman0773 · chorman0773 · commit 900ee9309c02 · 2024-12-12T14:51:42.000-05:00
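
Reviewer aid (not part of the patch): the integer and floating-point rules touched below can be spot-checked against the standard library. This is a minimal sketch under stated assumptions: it fixes little-endian byte order for illustration, and the helper names `encode_unsigned_le`, `encode_signed_le`, and `decode_f32_le` are hypothetical, not taken from the Reference or from this change.

```rust
/// Hypothetical helper: encodes `i` as described by `type.numeric.repr.unsigned`,
/// assuming little-endian order, so byte `m` is `(i >> (m * 8)) as u8`.
fn encode_unsigned_le(i: u32) -> [u8; 4] {
    let mut bytes = [0u8; 4];
    for m in 0..4 {
        bytes[m] = (i >> (m * 8)) as u8;
    }
    bytes
}

/// Hypothetical helper: a signed value is encoded as the unsigned value
/// congruent to it modulo 2^32 (`type.numeric.repr.signed`).
fn encode_signed_le(i: i32) -> [u8; 4] {
    // `as u32` yields the unsigned value congruent to `i` modulo 2^32.
    encode_unsigned_le(i as u32)
}

/// Hypothetical helper: decodes bytes as `u32` first, then interprets that
/// integer as an IEEE 754 binary32 value (`type.numeric.repr.float`).
fn decode_f32_le(bytes: [u8; 4]) -> f32 {
    f32::from_bits(u32::from_le_bytes(bytes))
}

fn main() {
    // The manual encoding agrees with the standard library's little-endian bytes.
    assert_eq!(encode_unsigned_le(0x1234_5678), 0x1234_5678u32.to_le_bytes());

    // Signed integers share the byte sequence of their unsigned counterpart.
    assert_eq!(encode_signed_le(-2), (-2i32).to_le_bytes());

    // Decoding the bytes of a finite float round-trips to the same value.
    assert_eq!(decode_f32_le(1.5f32.to_le_bytes()), 1.5);

    // A `char` is represented as the `u32` holding its code point.
    assert_eq!(u32::from('A').to_le_bytes(), [0x41, 0, 0, 0]);
}
```

NaN values are deliberately left out of the sketch, since the patch states that their encoding and decoding behaviour is not yet decided.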
diff --git a/src/memory-model.md b/src/memory-model.md
@@ -51,6 +51,7 @@ A sequence of bytes is said to represent a value of a type, if the decode operat
 
 > [!NOTE]
 > Representation is related to, but is not the same property as, the layout of the type.
+> A type has a unique representation when each value is represented by exactly one byte sequence. Most primitive types have unique representations.
 
 r[memory.encoding.symmetric]
 The result of encoding a given value of a type is a sequence of bytes that represents that value.
diff --git a/src/types/array.md b/src/types/array.md
@@ -34,6 +34,7 @@ always bounds-checked in safe methods and operators.
 r[type.array.repr]
 An array value is represented by each element in ascending index order, placed immediately adjacent in memory.
 
+
 [_Expression_]: ../expressions.md
 [_Type_]: ../types.md#type-expressions
 [`usize`]: numeric.md#machine-dependent-integer-types
diff --git a/src/types/boolean.md b/src/types/boolean.md
@@ -21,7 +21,7 @@ r[type.bool.layout]
 An object with the boolean type has a [size and alignment] of 1 each.
 
 r[type.bool.repr]
-A `bool` is represented as a single initialized byte with a value of `0x00` corresponding to `false` and a value of `0x01` corresponding to `true`. This byte does not have a pointer fragment.
+A `bool` is represented as a single initialized byte with a value of `0x00` corresponding to `false` and a value of `0x01` corresponding to `true`.
 
 > [!NOTE]
 > No other representations are valid for `bool`. Undefined Behaviour occurs when any other byte is read as type `bool`.
diff --git a/src/types/enum.md b/src/types/enum.md
@@ -28,14 +28,13 @@ An enum value corresponds to exactly one variant of the enum, and consists of th
 > [!NOTE]
 > An enum with no variants therefore has no values.
 
-r[type.enum.value.variant-padding]
-A byte is a padding byte in a variant `V` if the byte is not used for computing the discriminant, and the byte would be a padding byte in a struct consisting of the fields of the variant at the same offsets.
-
 r[type.enum.value.value-padding]
-A byte is a padding byte of an enum if it is a padding byte in each variant of the enum. A byte that is not a padding byte of an enum is a value byte.
+A byte is a [padding][type.union.value.padding] byte of an enum if that byte is not part of the representation of the discriminant of the enum, and in each variant it either:
+* Does not overlap with a field of the variant, or
+* Overlaps with a padding byte in a field of that variant.
 
 r[type.enum.value.repr]
-The representation of a value of an enum type includes the representation of each field of the variant at the appropriate offsets. When encoding a value of an enum type, each byte which is a padding byte in the variant is set to uninit. In the case of a [`repr(C)`][layout.repr.c.adt] or a [primitive-repr][layout.repr.primitive.adt] enum, the discriminant of the variant is represented as though by the appropriate integer type stored at offset 0.
+The representation of a value of an enum type includes the representation of each field of the variant at the appropriate offsets. When encoding a value of an enum type, each byte which is not used to store a field of the variant or the discriminant is set to uninit. In the case of a [`repr(C)`][layout.repr.c.adt] or a [primitive-repr][layout.repr.primitive.adt] enum, the discriminant of the variant is represented as though by the appropriate integer type stored at offset 0.
 
 > [!NOTE]
 > Most `repr(Rust)` enums will also store a discriminant in the representation of the enum, but the exact placement or type of the discriminant is unspecified, as is the value that represents each variant.
diff --git a/src/types/numeric.md b/src/types/numeric.md
@@ -70,7 +70,8 @@ r[type.numeric.repr.integer-width]
 The range of values an integer type can represent depends on its signedness and its width, in bits. The width of type `uN` or `iN` is `N`. The width of type `usize` or `isize` is the value of the `target_pointer_width` property.
 
 > [!NOTE]
-> There are exactly `1<<N` unique values of an integer type of width `N`.
+> There are exactly `1<<N` unique values of an integer type of width `N`. 
+> In particular, for an unsigned type these values are in the range `0..(1<<N)`, and for a signed type they are in the range `-(1<<(N-1))..(1<<(N-1))`, using Rust range syntax.
 
 r[type.numeric.repr.unsigned]
 A value `i` of an unsigned integer type `U` is represented by a sequence of initialized bytes, where the `m`th successive byte according to the byte order of the platform is `(i >> (m*8)) as u8`, where `m` is between `0` and the size of `U`. None of the bytes produced by encoding an unsigned integer has a pointer fragment.
@@ -82,13 +83,26 @@ A value `i` of an unsigned integer type `U` is represented by a sequence of init
 > [!WARN]
 > On `little` endian, the order of bytes used to decode an integer type is the same as the natural order of a `u8` array - that is, the `m` value corresponds with the `m` index into a same-sized `u8` array. On `big` endian, however, the order is the opposite of this order - that is, the `m` value corresponds with the `size_of::<T>() - m` index in that array.
 
+
 r[type.numeric.repr.signed]
 A value `i` of a signed integer type with width `N` is represented the same as the corresponding value of the unsigned counterpart type which is congruent modulo `2^N`.
 
+> [!NOTE]
+> This encoding of signed integers is known as the two's complement encoding.
+
+r[type.numeric.repr.float-width]
+Each floating-point type has a width. The type `fN` has a width of `N`.
+
 r[type.numeric.repr.float]
-A floating-point value is represented the same as a value of the unsigned integer type with the same width given by its [IEEE 754-2019] encoding.
+A floating-point value is represented by a byte sequence that decodes to it in two steps:
+* The byte sequence is decoded as a value of the unsigned integer type with the same width as the floating-point type,
+* The resulting integer is decoded according to [IEEE 754-2019] into the format used for the type.
+
+> [!NOTE]
+> As a result, the representation of each finite number and of each infinity is unique.
+> The exact behaviour of encoding and decoding NaNs is not yet decided.
 
 r[type.numeric.repr.float-format]
-The [IEEE 754-2019] `binary32` format is used for `f32`, and the `binary64` format is used for `f64`.
+The [IEEE 754-2019] `binary32` format is used for `f32`, and the `binary64` format is used for `f64`. The set of values for each floating-point type is determined by the respective format.
 
 [IEEE 754-2019]: https://ieeexplore.ieee.org/document/8766229
diff --git a/src/types/pointer.md b/src/types/pointer.md
@@ -106,11 +106,12 @@ A wide pointer or reference consists of a data pointer or reference, and a point
 r[type.pointer.value.wide-reference]
 The data pointer of a wide reference has a non-null address, well aligned for `align_of_val(self)`, and with provenance for `size_of_val(self)` bytes.
 
-r[type.pointer.value.wide-representation]
+r[type.pointer.value.wide-repr]
 A wide pointer or reference is represented the same as `struct WidePointer<M>{data: *mut (), metadata: M}` where `M` is the pointee metadata type, and the `data` and `metadata` fields are the corresponding parts of the pointer.
 
 > [!NOTE]
 > The `WidePointer` struct has no guarantees about layout, and has the default representation.
+> In particular, it is not guaranteed that you can write a struct type with the same layout as `WidePointer<M>`. 
 
 ## Pointer Provenance
 
diff --git a/src/types/struct.md b/src/types/struct.md
@@ -36,20 +36,6 @@ r[type.struct.value]
 r[type.struct.value.intro]
 A value of a struct type consists of a list of values for each field.
 
-r[type.struct.value.value-bytes]
-A byte `b` in the representation of an aggregate is a value byte if there exists a field of that aggregate such that:
-* The field has some type `T`,
-* The offset of that field `o` is such that `b` falls at an offset in `o..(o+size_of::<T>())`,
-* Either `T` is a primitive type or the offset of `b` within the field is a value byte in the representation of `T`.
-
-> [!NOTE]
-> A byte in a union is a value byte if it is a value byte in *any* field.
-
-r[type.struct.value.padding]
-Every byte in an aggregate which is not a value byte is a padding byte.
-
-> [!NOTE]
-> Enum types can also have padding bytes.
 
 r[type.struct.value.encode-decode]
 When a value of a struct type is encoded, each field of the struct is encoded at its corresponding offset and each byte that is not within a field of the struct is set to uninit.
diff --git a/src/types/textual.md b/src/types/textual.md
@@ -18,6 +18,9 @@ or 0xE000 to 0x10FFFF range.
 r[type.text.char-repr]
 A value of type `char` is represented as the value of type `u32` with value equal to the code point that it represents.
 
+> [!NOTE]
+> The representation of `char` is unique.
+
 r[type.text.str-value]
 A value of type `str` is represented the same way as `[u8]`, a slice of
 8-bit unsigned bytes. However, the Rust standard library makes extra assumptions
diff --git a/src/types/tuple.md b/src/types/tuple.md
@@ -52,6 +52,11 @@ Tuple fields can be accessed by either a [tuple index expression] or [pattern ma
 r[type.tuple.repr]
 The values and representation of a tuple type are the same as a [struct type][type.struct.value] with the same fields and layout.
 
+> [!NOTE]
+> In general, it is not guaranteed that any particular struct type will match the layout of a given tuple type.
+
+r[type.tuple.padding]
+A tuple has the same [padding bytes][type.union.value.padding] as a struct type with the same fields and layout.
 
 [^1]: Structural types are always equivalent if their internal types are equivalent.
       For a nominal version of tuples, see [tuple structs].
diff --git a/src/types/union.md b/src/types/union.md
@@ -24,8 +24,27 @@ The memory layout of a `union` is undefined by default (in particular, fields do
 *not* have to be at offset 0), but the `#[repr(...)]` attribute can be used to
 fix a layout.
 
+## Union Values 
+
 r[type.union.value]
-A value of a union type consists of a sequence of bytes, corresponding to each [value byte][type.struct.value.value-bytes]. The value bytes of a union are represented exactly. Each [padding byte][type.struct.value.padding] is set to uninit when encoded.
+
+r[type.union.value.value-bytes]
+A byte `b` in the representation of a struct or union is a value byte if there exists a field of that aggregate such that:
+* The field has some type `T`,
+* The offset of that field `o` is such that `b` falls at an offset in `o..(o+size_of::<T>())`,
+* Either `T` is a primitive type or the offset of `b` within the field is not a padding byte in the representation of `T`.
+
+> [!NOTE]
+> A byte in a union is a value byte if it is a value byte in *any* field.
+
+r[type.union.value.padding]
+Every byte in a struct or union which is not a value byte is a padding byte. [Enum types][type.enum.value.value-padding], [tuple types][type.tuple.padding], and other types may also have padding bytes.
+
+> [!NOTE]
+> Primitive types, such as integer types, do not have padding bytes.
+
+r[type.union.value.encoding]
+A value of a union type consists of a sequence of bytes, corresponding to each [value byte][type.union.value.value-bytes]. The value bytes of a union are represented exactly. Each [padding byte][type.union.value.padding] is set to uninit when encoded.
 
 > [!NOTE]
 > A given value byte is guaranteed allowed to be uninit if it is padding in any field, recursively expanding union fields. Whether a byte of a union is allowed to be uninit in any other case is not yet decided.