Interest Group: ABI compatible datatypes #1516
Replies: 5 comments 3 replies
-
Notes from the meeting on 2025-07-28:
-
Here are some notes I had collected on the topic of padding bytes last week. This is just for reference; I believe we covered pretty much everything in our meeting last Monday.

Context

Composite types (structs, tuples, enums with fields) can contain internal padding, which the compiler inserts into the type layout to ensure proper alignment of all fields. Even types without internal padding can have external (trailing) padding to ensure alignment within a slice. Example:

#[repr(C)]
struct TypeWithExternalPadding {
    a: u32,
    b: u8,
    // 3 invisible bytes of padding at this point, to ensure alignment of the next TypeWithExternalPadding in a slice
}

Problem

The content of these padding bytes is uninitialized. Reading them as a typed value (for example as u8) constitutes undefined behavior (UB) and must be avoided under all circumstances. Example:

#[repr(C)]
struct MyType {
    x: u8,
    // Bytes 1...3 are internal padding
    y: u32,
}
let value = MyType { x: 5, y: 6 };
// Fully initialize buffer with zeros:
let mut buffer = [0_u8; size_of::<MyType>()];
// Write a value which contains uninitialized padding bytes to the buffer:
let src = &value as *const MyType as *const u8;
unsafe {
    buffer
        .as_mut_ptr()
        .copy_from_nonoverlapping(src, size_of::<MyType>());
}
// The bytes in buffer[1..4] are now also uninitialized, so the following is immediate UB:
let byte = buffer[1];

As a consequence, when a type which contains padding is transmitted over shared memory IPC, the underlying memory area contains uninitialized bytes. Reading those bytes would be UB, strictly speaking. Rust's memory-copying functions, like

If the shared memory has been written by another process, the compiler doesn't know that those bytes are uninitialized, so it can't assume UB. However, this scenario might be considered too fragile to be a reliable solution.

Uninitialized padding bytes are no problem when the ABI compatible types are used in regular code. Usually, the memory for the IPC doesn't need to be accessed in an untyped way, but there are a few situations where a raw byte slice must be read:
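For reference, the padding in the two example types above can be located from layout information alone, without ever reading a padding byte. The following is a minimal sketch, assuming Rust 1.77 or later (for `offset_of!`); the helper names `internal_padding` and `external_padding` are made up for illustration.

```rust
use std::mem::{offset_of, size_of};
use std::ops::Range;

#[repr(C)]
#[allow(dead_code)]
struct MyType {
    x: u8,
    y: u32,
}

#[repr(C)]
#[allow(dead_code)]
struct TypeWithExternalPadding {
    a: u32,
    b: u8,
}

/// Internal padding of `MyType`: the gap between the end of `x` and the start of `y`.
/// (Hypothetical helper, not part of any existing API.)
fn internal_padding() -> Range<usize> {
    (offset_of!(MyType, x) + size_of::<u8>())..offset_of!(MyType, y)
}

/// External padding of `TypeWithExternalPadding`: from the end of `b` to the end of the type.
/// (Hypothetical helper, not part of any existing API.)
fn external_padding() -> Range<usize> {
    (offset_of!(TypeWithExternalPadding, b) + size_of::<u8>())
        ..size_of::<TypeWithExternalPadding>()
}
```

On targets where `u32` is 4-byte aligned, both types occupy 8 bytes, with padding at bytes 1..4 and 5..8 respectively.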
Solutions

Solution 1: explicit padding bytes

When the type declarations are automatically generated from a definition, the generator can insert explicit padding bytes. Example:

#[repr(C)]
struct MyType {
    x: u8,
    _pad1: u8,
    _pad2: u8,
    _pad3: u8,
    y: u32,
}

Benefits:
Drawbacks:
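To make Solution 1 concrete, here is a minimal sketch of what generated code could look like. It is not the output of any existing generator: the `new` and `as_bytes` helpers are assumptions for illustration. The compile-time assertion checks that the size of the type equals the sum of its field sizes, which means the compiler has not added any implicit padding on top of the explicit one.

```rust
use std::mem::size_of;

#[repr(C)]
#[derive(Clone, Copy)]
#[allow(dead_code)]
struct MyType {
    x: u8,
    _pad1: u8,
    _pad2: u8,
    _pad3: u8,
    y: u32,
}

// Fails to compile if the layout contains implicit padding,
// e.g. on a target with a different alignment for u32.
const _: () = assert!(size_of::<MyType>() == 3 * size_of::<u8>() + size_of::<u8>() + size_of::<u32>());

impl MyType {
    /// Hypothetical constructor which always zeroes the explicit padding fields.
    fn new(x: u8, y: u32) -> Self {
        Self { x, _pad1: 0, _pad2: 0, _pad3: 0, y }
    }

    /// Hypothetical helper: view the value as raw bytes.
    fn as_bytes(&self) -> &[u8] {
        // SAFETY: the type is repr(C), all fields are plain integers, and the
        // assertion above shows that there is no implicit padding, so every
        // byte of the value is initialized.
        unsafe { std::slice::from_raw_parts(self as *const Self as *const u8, size_of::<Self>()) }
    }
}
```

Because the padding bytes are ordinary fields here, they are copied along with everything else in typed copies and moves.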
Solution 2: initialize padding bytes on demand

Information about the location of padding bytes can be added to the type definition. At runtime, this information can be used to initialize the padding bytes.

Benefits:
Drawbacks:
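A rough sketch of how Solution 2 might look in Rust follows. The `PaddingInfo` trait and the `to_initialized_bytes` function are hypothetical names, not existing APIs, and `offset_of!` requires Rust 1.77 or later. The value is first written to its final location with a typed write, and only then are the listed padding ranges zeroed through a raw pointer, so that no later typed copy can leave them uninitialized again.

```rust
use std::mem::{offset_of, size_of, MaybeUninit};
use std::ops::Range;

/// Hypothetical trait carrying the padding locations of a type.
///
/// # Safety
/// `PADDING` must list every padding byte of `Self`, and every range
/// must lie within `size_of::<Self>()`.
unsafe trait PaddingInfo: Copy {
    const PADDING: &'static [Range<usize>];
}

#[repr(C)]
#[derive(Clone, Copy)]
struct MyType {
    x: u8,
    y: u32,
}

unsafe impl PaddingInfo for MyType {
    // The gap between the end of `x` and the start of `y`.
    const PADDING: &'static [Range<usize>] =
        &[(offset_of!(MyType, x) + size_of::<u8>())..offset_of!(MyType, y)];
}

/// Hypothetical helper: returns the bytes of `value` with all padding zeroed.
fn to_initialized_bytes<T: PaddingInfo>(value: T) -> Vec<u8> {
    let mut slot = MaybeUninit::<T>::uninit();
    // Typed write: the fields are initialized, the padding may not be.
    slot.write(value);
    let base = slot.as_mut_ptr() as *mut u8;
    for range in T::PADDING {
        // SAFETY: the trait contract guarantees the range is inside the slot.
        unsafe { base.add(range.start).write_bytes(0, range.len()) };
    }
    // SAFETY: field bytes were initialized by the typed write and all
    // padding bytes were zeroed above, so every byte is initialized.
    unsafe { std::slice::from_raw_parts(base as *const u8, size_of::<T>()) }.to_vec()
}
```

In a real setup the padding table would presumably be generated together with the type definition rather than written by hand.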
-
And here are my notes on how to provide the definitions for ABI compatible types:

Context

ABI compatible types require consistent layouts across different programming languages (primarily Rust and C++).
Approach 1: definitions in a domain-specific language

Type definitions are written in a special domain-specific language (DSL), from which Rust and C++ code is generated, as well as documentation and type definition information for runtime usage.

Benefits:
Drawbacks:
Approach 2: definitions in regular code

Type definitions are written in idiomatic Rust or C++ code, from where they can be extracted for processing.

Benefits:
Drawbacks:
Approach 3: compromise between DSL and regular code

Both Rust and C++ already provide everything needed to precisely define all allowed ABI compatible types

Selecting Rust would probably be preferable to C++, because the

Benefits:
Drawbacks:
Existing DSLs

Existing DSLs for interface definitions (IDLs) typically don't support all types we want to support (now or in the future).
-
I had a look at the requirements which iceoryx2 imposes on types used for IPC, and the most important constraint is the
-
Bad news on the padding front. Apparently neither Rust nor C++ preserves padding bytes when copying or moving in a typed way (i.e., when not using memcpy). This means that when you have an instance of a type with padding, and you initialize those padding bytes, and then assign this instance to another variable, that variable will have uninitialized padding bytes.

In C++ you can probably get around this by overriding the copy and move constructors and assignment operators to use memcpy, but in Rust there's no way to do this. And even then you'd have the problem of unused vector slots: every time you move or copy a vector, you'd have to memset all unused slots in the target, or copy the full capacity of the vector (and not just the length).

An alternative would be to sanitize the padding bytes after construction, before transmission. Something like this:

let publisher = IpcService::<SampleType>().create_publisher();
...
let mut sample: SampleType = generate_sample();
sample.sanitize_padding();
publisher.publish(sample);

where

impl<T> Publisher<T> {
    pub fn publish(&self, mut sample: T) {
        sample.sanitize_padding();
        self.send_to_subscribers(sample);
    }
}
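Since a typed move can discard initialized padding, one way to read the idea above is that the sanitizing step should run on the sample after it has reached its final slot, not on the local variable. The following is only a sketch under that assumption: `SanitizePadding`, `sanitize_padding_at`, and this `Publisher` are hypothetical and not the iceoryx2 API, and the boxed `slot` merely stands in for a shared-memory slot. `offset_of!` requires Rust 1.77 or later.

```rust
use std::mem::{offset_of, size_of, MaybeUninit};

/// Hypothetical trait for types which know how to zero their own padding.
///
/// # Safety
/// Implementations must zero every padding byte of `Self` and nothing else.
unsafe trait SanitizePadding: Copy {
    /// # Safety
    /// `slot` must be valid for writes of `size_of::<Self>()` bytes.
    unsafe fn sanitize_padding_at(slot: *mut Self);
}

#[repr(C)]
#[derive(Clone, Copy)]
struct SampleType {
    x: u8,
    y: u32,
}

unsafe impl SanitizePadding for SampleType {
    unsafe fn sanitize_padding_at(slot: *mut Self) {
        // Zero the gap between the end of `x` and the start of `y`.
        let start = offset_of!(SampleType, x) + size_of::<u8>();
        let end = offset_of!(SampleType, y);
        unsafe { (slot as *mut u8).add(start).write_bytes(0, end - start) };
    }
}

/// Hypothetical stand-in for a publisher with a single transmit slot.
struct Publisher<T: SanitizePadding> {
    slot: Box<MaybeUninit<T>>,
}

impl<T: SanitizePadding> Publisher<T> {
    fn new() -> Self {
        Self { slot: Box::new(MaybeUninit::uninit()) }
    }

    /// Moves the sample into the slot, sanitizes it there, and returns the
    /// raw bytes a receiver would see.
    fn publish(&mut self, sample: T) -> &[u8] {
        let ptr = self.slot.as_mut_ptr();
        unsafe {
            // Last typed move of the sample; padding may be uninitialized after it.
            ptr.write(sample);
            // Zero the padding in place, after the final move.
            T::sanitize_padding_at(ptr);
            // SAFETY: fields and padding are now all initialized.
            std::slice::from_raw_parts(ptr as *const u8, size_of::<T>())
        }
    }
}
```

With this shape the caller does not need to call a sanitize method at all; the publisher does it in the one place where no further typed copy follows.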