-
Notifications
You must be signed in to change notification settings - Fork 1.2k
Consolidate CMSG_* implementations #4908
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
b15d3d5 to
e9e4e18
Compare
This comment has been minimized.
This comment has been minimized.
tgross35
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for going through the work here! I need to do a more thorough read-through, but have a few initial comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this one actually getting picked up? The module is defined here
Lines 9 to 12 in ab195eb
| mod libc { | |
| pub(crate) mod signal; | |
| pub(crate) mod unistd; | |
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a pub(crate) use libc::*; a bit further down :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hah, see what you mean. I seem to have made it into a mod.rs whilst keeping the mod libc {} somehow. I probably did when adding a libc/sys/socket.rs, realizing later on that it should be under xnu/, (hence the out-of-place freebsd copy-pasta).
Anyhow, I've now removed the libc/sys/socket.rs, making the mod.rs creation irrelevant to this PR... Do you want med to keep it anyway?
| // HACK: AIX does not use any alignment/padding for ancillary data. Setting | ||
| // this to 1 it makes possible to reuse the CMSG_* implementatinos as the extra | ||
| // CMSG_ALIGN calls become no-op. | ||
| #[cfg(target_os = "aix")] | ||
| pub(crate) type __ALIGN_BOUNDARY = u8; | ||
|
|
||
| #[cfg(target_os = "android")] | ||
| pub(crate) type __ALIGN_BOUNDARY = usize; | ||
|
|
||
| #[cfg(target_vendor = "apple")] | ||
| pub(crate) type __ALIGN_BOUNDARY = u32; | ||
|
|
||
| #[cfg(target_os = "hurd")] | ||
| pub(crate) type __ALIGN_BOUNDARY = usize; | ||
|
|
||
| #[cfg(target_os = "cygwin")] | ||
| pub(crate) type __ALIGN_BOUNDARY = usize; | ||
|
|
||
| #[cfg(target_os = "dragonfly")] | ||
| pub(crate) type __ALIGN_BOUNDARY = c_long; | ||
|
|
||
| #[cfg(target_os = "emscripten")] | ||
| pub(crate) type __ALIGN_BOUNDARY = usize; | ||
|
|
||
| #[cfg(target_os = "fuchsia")] | ||
| pub(crate) type __ALIGN_BOUNDARY = c_long; | ||
|
|
||
| #[cfg(target_os = "haiku")] | ||
| pub(crate) type __ALIGN_BOUNDARY = usize; | ||
|
|
||
| #[cfg(target_os = "l4re")] | ||
| pub(crate) type __ALIGN_BOUNDARY = usize; | ||
|
|
||
| #[cfg(target_os = "linux")] | ||
| pub(crate) type __ALIGN_BOUNDARY = usize; | ||
|
|
||
| #[cfg(target_os = "nto")] | ||
| pub(crate) type __ALIGN_BOUNDARY = usize; | ||
|
|
||
| #[cfg(target_os = "redox")] | ||
| pub(crate) type __ALIGN_BOUNDARY = size_t; | ||
|
|
||
| #[cfg(target_os = "vxworks")] | ||
| pub(crate) type __ALIGN_BOUNDARY = usize; | ||
|
|
||
| cfg_if! { | ||
| if #[cfg(any(target_os = "illumos", target_os = "solaris"))] { | ||
| #[cfg(target_arch = "sparc64")] | ||
| pub(crate) type __ALIGN_BOUNDARY = u64; | ||
| #[cfg(not(target_arch = "sparc64"))] | ||
| pub(crate) type __ALIGN_BOUNDARY = u32; | ||
| } | ||
| } | ||
|
|
||
| cfg_if! { | ||
| if #[cfg(target_os = "netbsd")] { | ||
| cfg_if! { | ||
| if #[cfg(target_arch = "x86")] { | ||
| pub(crate) type __ALIGN_BOUNDARY = c_int; | ||
| } else if #[cfg(target_arch = "x86_64")] { | ||
| pub(crate) type __ALIGN_BOUNDARY = c_long; | ||
| } else if #[cfg(target_arch = "aarch64")] { | ||
| pub(crate) type __ALIGN_BOUNDARY = c_int; | ||
| } else if #[cfg(target_arch = "arm")] { | ||
| pub(crate) type __ALIGN_BOUNDARY = c_longlong; | ||
| } else if #[cfg(target_arch = "mips")] { | ||
| pub(crate) type __ALIGN_BOUNDARY = c_longlong; | ||
| } else if #[cfg(target_arch = "powerpc")] { | ||
| pub(crate) type __ALIGN_BOUNDARY = c_double; | ||
| } else if #[cfg(target_arch = "riscv64")] { | ||
| pub(crate) type __ALIGN_BOUNDARY = c_long; | ||
| } else if #[cfg(target_arch = "sparc64")] { | ||
| pub(crate) type __ALIGN_BOUNDARY = u128; | ||
| } | ||
| } | ||
| } | ||
| } | ||
|
|
||
| cfg_if! { | ||
| if #[cfg(target_os = "freebsd")] { | ||
| cfg_if! { | ||
| if #[cfg(target_arch = "x86")] { | ||
| pub(crate) type __ALIGN_BOUNDARY = c_long; | ||
| } else if #[cfg(target_arch = "x86_64")] { | ||
| pub(crate) type __ALIGN_BOUNDARY = c_long; | ||
| } else if #[cfg(target_arch = "aarch64")] { | ||
| pub(crate) type __ALIGN_BOUNDARY = c_longlong; | ||
| } else if #[cfg(target_arch = "arm")] { | ||
| pub(crate) type __ALIGN_BOUNDARY = c_int; | ||
| } else if #[cfg(target_arch = "powerpc")] { | ||
| pub(crate) type __ALIGN_BOUNDARY = c_int; | ||
| } else if #[cfg(target_arch = "powerpc64")] { | ||
| pub(crate) type __ALIGN_BOUNDARY = c_long; | ||
| } else if #[cfg(target_arch = "riscv64")] { | ||
| pub(crate) type __ALIGN_BOUNDARY = c_longlong; | ||
| } | ||
| } | ||
| } | ||
| } | ||
|
|
||
| cfg_if! { | ||
| if #[cfg(target_os = "openbsd")] { | ||
| cfg_if! { | ||
| if #[cfg(target_arch = "x86")] { | ||
| pub(crate) type __ALIGN_BOUNDARY = c_int; | ||
| } else if #[cfg(target_arch = "x86_64")] { | ||
| pub(crate) type __ALIGN_BOUNDARY = c_long; | ||
| } else if #[cfg(target_arch = "aarch64")] { | ||
| pub(crate) type __ALIGN_BOUNDARY = c_long; | ||
| } else if #[cfg(target_arch = "arm")] { | ||
| pub(crate) type __ALIGN_BOUNDARY = c_double; | ||
| } else if #[cfg(target_arch = "mips64")] { | ||
| pub(crate) type __ALIGN_BOUNDARY = u64; | ||
| } else if #[cfg(target_arch = "powerpc")] { | ||
| pub(crate) type __ALIGN_BOUNDARY = c_double; | ||
| } else if #[cfg(target_arch = "powerpc64")] { | ||
| pub(crate) type __ALIGN_BOUNDARY = c_long; | ||
| } else if #[cfg(target_arch = "riscv64")] { | ||
| pub(crate) type __ALIGN_BOUNDARY = c_long; | ||
| } else if #[cfg(target_arch = "sparc64")] { | ||
| pub(crate) type __ALIGN_BOUNDARY = u128; | ||
| } | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're trying to move away from this kind of nested config in the new structure. Could you move them to the respective socket modules instead?
Also, this is something used so the shared impl is possible rather than actually being in source, right? A comment to that effect would be helpful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where do these definitions come from by the way? Looking at e.g. NetBSD I see https://github.com/NetBSD/src/blob/e46385ea0f041fae85cfb2a10b8b4910124827b8/sys/sys/socket.h#L544-L548 with __ALIGNBYTES defined in a few different places https://github.com/search?q=repo%3ANetBSD%2Fsrc+%2F%23define+__ALIGNBYTES%2F&type=code, but nothing using types like double.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I know :/ They were in their respective modules at first, (not the new/ ones), but then I started to get some lint warnings. Something about "declaration in module after use", can't remember which one exactly.
I left them here, expecting some discussion on what best to do. Should have been a bit clearer on that part.
They're technically all used solely for CMSG_ALIGN. Went ahead and removed pub(crate) to make that clearer.
Where do these definitions come from by the way?
Yeah, I also noticed a lot of inconsistencies with upstream, notably in what type is used for the size_of. Most are pretty harmless, Ex. Linux-like CMSG_ALIGN implementations normally use size_t or long, but this crate will often use usize.
Decided on not "correct" these. Wasn't keen on creating any unintended breaking changes. The one's I did change though were those that where previously defined as the result of size_of::<something>(), opting for the corresponding uN type alias. Solarish for example, had its alignment constant changed from const _CMSG_HDR_ALIGNMENT: usize = 8; to type __ALIGN_BOUNDARY = u64;.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So what do you think? Make an exception here to using nested config in new and add a source comment on why? I may otherwise be a bit confusing that common/posix/sys/socket.rs is importing a single use pub(crate) __ALIGN_BOUNDARY defined in a <target>/sys/socket.rs.
src/new/common/bsd.rs
Outdated
| pub(crate) mod sys { | ||
| pub(crate) mod socket { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you make sys a new directory and socket a new module in it? I've been trying to avoid >1 layer of nesting in the new modules, and sys tends to have enough contents that it's worth its own directory.
(Applies to a few places)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done ️✌️ Anything still missing?
src/new/common/posix/sys/socket.rs
Outdated
| // TODO: why not take mhdr by ref? | ||
| let msghdr = unsafe { *mhdr }; | ||
|
|
||
| // SAFETY: | ||
| // - Caller ensured pointer is not null. | ||
| // - Pointer retrieved from either CMSG_FIRSTHDR or CMSG_NXTDR, | ||
| // thus ensuring that is lies between mhdr_addr and mhdr_addr + | ||
| // msg_controllen. | ||
| let cmsghdr = unsafe { *cmsg }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did these come from the original impls?
I would slightly prefer reading the fields only as-needed, so as to avoid creating UB if the struct is partially initialized. IMO it's a terrible idea to ever use non-zeroed libc types but unfortunately people do it, so we should avoid giving them more unsoundness then they'd get from the same macro calls in C.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. Changed it to:
let cmsg_len = unsafe { *cmsg }.cmsg_len;
// ..
let mut max_addr = {
// IMPROVEMENT: why not take mhdr by ref?
let msghdr = unsafe { *mhdr };
msghdr.msg_control as usize + msghdr.msg_controllen as usize
};| #[cfg(not(any( | ||
| target_env = "musl", | ||
| target_env = "ohos", | ||
| target_os = "emscripten", | ||
| target_os = "fuchsia" | ||
| )))] | ||
| pub unsafe fn CMSG_NXTHDR(mhdr: *const msghdr, cmsg: *const cmsghdr) -> *mut cmsghdr { | ||
| unsafe { next_impl(mhdr, cmsg, true) } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The BSDs define a CMSG_NXTHDR too right? Doesn't this wind up duplicated?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wrote it in such a way that the BSDs reuse the CMSG_NXTHDR, something like;
// common/bsd
pub unsafe fn CMSG_NXTHDR(mhdr: *const msghdr, cmsg: *const cmsghdr) -> *mut cmsghdr {
if cmsg.is_null() {
CMSG_FIRSTHDR(mhdr)
} else {
CMSG_NXTHDR(mhdr, cmsg)
}
}It's actually the more POSIX compliant way:
If the first argument is a pointer to a msghdr structure and the second argument is a null pointer, this macro shall be equivalent to CMSG_FIRSTHDR(mhdr).
https://pubs.opengroup.org/onlinepubs/9799919799/basedefs/sys_socket.h.html
| pub(crate) unsafe fn next_impl( | ||
| mhdr: *const msghdr, | ||
| cmsg: *const cmsghdr, | ||
| allow_zero_sized_payload: bool, | ||
| ) -> *mut cmsghdr { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Anything here that couldn't be made const at our MSRV? I think you may as well constify CMSG_NXTHDR if so, I could imagine it's occasionally useful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's the usize pointer casts, tried to look into it as well.
Transmuting pointers to integers in a const context is undefined behavior, unless the pointer was originally created from an integer.
https://doc.rust-lang.org/std/mem/fn.transmute.html#transmutation-between-pointers-and-integers
|
Reminder, once the PR becomes ready for a review, use |
The original intent was to simplify the reasoning for how they differ across platforms. It should as a consequence also make it easier to extend target support. Changes: - CMSG_* functions which can be const are all marked const. - Removes unsafe from `CMSG_ALIGN`, `CMSG_SPACE`, `CMSG_LEN`. Fixes: Custom CMSG_FIRSTHDR implementation for VxWorks had missed to check that `mhdr.msg_controllen >= size_of::<cmsghdr>()`, as per POSIX 1003.1-2024. Documents: - Usage and safety requirements. - Adds missing references to upstream headers.
e9e4e18 to
ab33767
Compare
|
This PR was rebased onto a different main commit. Here's a range-diff highlighting what actually changed. Rebasing is a normal part of keeping PRs up to date, so no action is needed—this note is just to help reviewers. |
|
@rustbot ready |
Builds upon #4903MergedThe original intent was to simplify the reasoning for how they differ across platforms. It should as a consequence also make it easier to extend target support.
Changes:
CMSG_ALIGN,CMSG_SPACE,CMSG_LEN.Fixes:
mhdr.msg_controllen >= size_of::<cmsghdr>(), as per POSIX 1003.1-2024.Documents:
Breaking API change Suggestions
Perhaps CMSG_FIRSTHDR can take mhdr by reference? First thing being done is otherwise a raw pointer deref under practically no safety guarantees. Doing so would also remove the need to mark it unsafe.
Same reasoning can be applied to mhdr in CMSG_NXTHDR, difference being that the function must remain unsafe for dereferencing cmsg.
Footnotes
Except
CMSG_DATAon solarish. It doesalign(cmsg_addr + size_of(cmsg))rather thancmsg_addr + align(size_of(cmsg>)). The others can get away with not casting the cmsg pointer to usize by instead casting cmsg to a byte pointer and doing the addition usingbyte_ptr.offset(). ↩