
Conversation

@andyross
Contributor

Zync?! What is this thing?

Zync is a replacement for Zephyr's bottom layer of kernel synchronization primitives. It aims at best-in-class features, performance and size characteristics and should be preferable for existing users in all measurable ways.

Why should I want to use this API?

You probably don't. If you're happy with existing tools like k_mutex, k_sem and k_condvar, you should continue to use those interfaces. They represent industry standard idioms that Zephyr will support forever. They're just being implemented in terms of tiny wrappers over Zync and getting smaller and much faster in the process.

In essence, Zync fits into the Zephyr ecosystem in exactly the way that the futex system call does in Linux. It's the base layer for synchronization, but not something application (or even subsystem) code will tend to concern itself with.

What advantages does it bring?

Code size

(Actually this bit is a little spun. As of right now, the extra code needed to support the weirder edges of the older APIs dominates, and k_zync builds are actually coming out a few hundred bytes larger. But a few optimization passes before merge should get it under control.)

Zephyr's synchronization paradigms are too complicated. We have k_mutex, but that is a priority-inheriting lock with a somewhat heavyweight implementation. Thus, much code in the tree uses k_sem as a "lock" instead. But now we have condition variables, which force the use of k_mutex anyway. And none of these work well in userspace (see below), so we have an additional sys_futex primitive, built on top of another k_futex kernel interface. And on top of futex, we have sys_sem and sys_mutex, designed to provide faster implementations for userspace processes (except that the sys_mutex implementation was never finished and is just a stub around k_mutex). So we have frustrations: for example, k_condvar still requires a k_mutex for its paired lock, and won't accept a semaphore (more commonly used) or a sys_mutex (supposed to be faster).

Now, k_zync() is a single system call, used to implement (literally!) every kernel synchronization call from any of those users. And it's only barely more complicated than any one of them individually.
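To make that concrete, here's roughly how the classic entry points collapse onto the one call. The k_zync() signature below is a loose paraphrase for this sketch, not the final API; the zync types come from this PR's headers, positive mod gives count (waking waiters), and negative mod takes it (blocking up to the timeout):

/* Sketch only: assumed shape of the core call, paraphrased for
 * illustration. Positive mod releases count, negative mod acquires.
 */
int k_zync(struct k_zync *zync, k_zync_atom_t *atom,
	   bool reset_atom, int32_t mod, k_timeout_t timeout);

static inline void sketch_sem_give(struct k_zync *z, k_zync_atom_t *a)
{
	k_zync(z, a, false, 1, K_NO_WAIT);       /* count++, wake a waiter */
}

static inline int sketch_sem_take(struct k_zync *z, k_zync_atom_t *a,
				  k_timeout_t timeout)
{
	return k_zync(z, a, false, -1, timeout); /* count--, block at zero */
}

A mutex is the same pair of calls with the count clamped at one, and a condition variable adds the "reset" behavior described under "Obtuse interface" below.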

Performance

Zync, like sys_futex, is implemented naturally as an unlocked atomic CAS operation for the common/uncontended situations where the kernel does not need to modify thread state (e.g. taking a lock that is available, giving a semaphore with no waiters...). It falls through into the system call only when needed.

This is faster for all code everywhere, and much faster for userspace code. And it works for everything. Now mutexes and semaphores and condition variables get the same performance boost that futexes had.
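A sketch of that fast path, using Zephyr's real atomic_cas() and the atom fields (.atomic, .val) visible elsewhere in this PR; the helper name is made up:

static inline bool sketch_try_take(k_zync_atom_t *a)
{
	k_zync_atom_t old, new;

	do {
		old.atomic = atomic_get(&a->atomic);
		if (old.val == 0) {
			return false;	/* contended: fall into k_zync() */
		}
		new = old;
		new.val--;
	} while (!atomic_cas(&a->atomic, old.atomic, new.atomic));

	return true;	/* took the count with no syscall at all */
}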

It does require a mild API change though. Userspace code that defines a Zync must specify which memory domain will contain the lockless "atom". There is a legacy path that allows entirely-in-kernel-space Zync objects, which can be used to preserve API compatibility at the expense of performance.

Simplicity

As mentioned, Zephyr futexes can do almost everything that Zyncs can, but futexes are an extremely low level operation (cribbed mostly unchanged from Linux) not well suited to application use. In Linux, the expectation is that the C library then builds its synchronization primitives on top of the futex facility. But Zephyr never really got that far (there is a futex-based sys_sem, but sys_mutex is a stub that still requires syscalls for every operation, and our futex isn't quite capable of representing a condition variable).

Extensibility

The ability to get at the underlying representation allows app code with complicated requirements to extend the synchronization paradigms in interesting ways. One new feature is already implemented: Zyncs have a "fair" setting, toggleable by apps, that controls whether or not the operation is a scheduling point (fair-vs-fast is a persistent argument in synchronization; now we have both!). Long term we'll need to decide on an API for this, but advanced application authors can do it with the existing wrappers right now by accessing the internal Zync pointer.

Another is coming in the immediate term (i.e. I need to get it done before merge): the fact that all the synchronization tools share the same backend means that abstractions based on the backend are shared too. Now you can wait on mutexes and condition variables in k_poll()!
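Semaphores can already be polled with the stable API; the shared backend is what lets mutexes and condvars join in. The pattern, in today's terms:

#include <zephyr/kernel.h>

K_SEM_DEFINE(data_ready, 0, 1);

void consumer(void)
{
	struct k_poll_event ev = K_POLL_EVENT_INITIALIZER(
		K_POLL_TYPE_SEM_AVAILABLE, K_POLL_MODE_NOTIFY_ONLY,
		&data_ready);

	k_poll(&ev, 1, K_FOREVER);		/* wait, possibly on many objects */
	k_sem_take(&data_ready, K_NO_WAIT);	/* then consume the signal */
}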

Other easy API additions include things like "wake all" for semaphores (or "wake N"), or a "barrier" primitive (we actually already have pthread barriers, but now they can be a 1-line wrapper around zync). Middle-tier IPC primitives like k_queue, k_pipe or k_msgq can likewise be fit into the zync paradigm for even more code savings. There are lots of places in the IPC layer where code is solving the same "semaphore-style" thread blocking problem repeatedly.

What are the warts?

Bad name

I like it. But I'm weird.

Obtuse interface

It's not that bad, really. But it's a weird API with funny rules. Fundamentally a Zync is just a counting semaphore with an extra "reset" step that works in one of two ways, allowing it to emulate both the "block unconditionally" and "atomically release lock" behavior of condition variables on the same code path and with minimal overhead.
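For reference, this is the guarantee the reset step exists to provide, stated in terms of today's stable API:

#include <zephyr/kernel.h>

K_MUTEX_DEFINE(lock);
K_CONDVAR_DEFINE(cond);
static bool ready;

void waiter(void)
{
	k_mutex_lock(&lock, K_FOREVER);
	while (!ready) {
		/* Releasing `lock` and blocking must be one atomic step:
		 * done as two separate calls, a k_condvar_signal() landing
		 * between them would be lost. The zync "reset" is what
		 * lets a single k_zync() call provide that guarantee.
		 */
		k_condvar_wait(&cond, &lock, K_FOREVER);
	}
	k_mutex_unlock(&lock);
}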

(And as mentioned farther down: if you think this is weird go look at what Linux/glibc does to implement condition variables.)

Why not just use Futexes?

That's what Linux does, and Linux does pretty well for itself. And that's how this idea started. But Zyncs are better. The Linux FUTEX_WAIT/WAKE interface that we implement turned out to be insufficient for condition variables too, so it had to be extended with FUTEX_WAKE_OP. That operation is even more obtuse than k_zync(), and in any case was never implemented in Zephyr.

Priority inheritance

Traditionally k_mutex implements priority inheritance, boosting the priority of the thread that owns the lock. This is a great checklist feature, but in practice most code doesn't care about it (mostly because in real apps threads spend most of their time blocking on queues or semaphores or sleeps, not waiting on locks). And as it requires access to kernel state (maintaining the "owner" pointer), it can't use the fast/lockless/atomic trick.

So this is now configurable. By default, Zephyr builds without support for priority inheritance and k_mutex is fast. But you can turn it on with a kconfig, at the cost of making userspace mutex operations significantly slower (though no slower than they are today).
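In prj.conf terms (the symbol name here is illustrative; see this PR's Kconfig for the real one):

# Opt back into priority inheritance, giving up the userspace fast path
# (hypothetical symbol name)
CONFIG_ZYNC_PRIO_BOOST=y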

This seems like a good compromise to me, but it does complicate the API a bit. In the longer term we might consider leaving k_mutex as the non-PI/fast lock, and adding a k_pimutex type instead.

Recursive locking

There's another similar annoyance with k_mutex: by default it will allow threads to lock the same mutex multiple times, keeping a count of total calls such that the lock only releases when a matching number of unlock calls has occurred. This is a real feature offered by locks in various systems (our own irq_lock() has always worked that way too). But it's now mostly felt to have been a mistake, and a bad code smell in new work: code that knows it can do this tends to abuse it, and you end up with very large, hard-to-understand critical sections spread across multiple subsystems.
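For concreteness, recursive locking means this sequence is legal with today's k_mutex, and the lock is only actually released at the end:

void nested(struct k_mutex *m)
{
	k_mutex_lock(m, K_FOREVER);	/* lock count = 1 */
	k_mutex_lock(m, K_FOREVER);	/* same owner: count = 2, no block */
	k_mutex_unlock(m);		/* count = 1, still held */
	k_mutex_unlock(m);		/* count = 0, now released */
}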

And we handle it mostly the same way. The core k_zync() call can support recursive locks, but it's hidden behind an off-by-default kconfig and incurs runtime overhead and forces all calls to be syscalls.

(Actually that doesn't have to be the case, the only problems that result from broken lock nesting are incorrect blocking and wakeups, which is something untrusted user code can do anyway. It's theoretically possible to put the nesting count into user memory and protect it with some lock in the atom, but that seemed hard to get right and IMHO had little justification since it's mostly a compatibility feature anyway.)

Semaphore Limits

Similar deal here: Zephyr has historically allowed semaphores to be created with an app-defined limit above which the k_sem_give() operation will saturate. The math for this needs to be present always, of course (because rollover would be a bug). But allowing the app to specify the limit means (as above) that the code to increment the count needs to have access to kernel state, and so can't be implemented with a lockless protocol from userspace.

This is sort of a silly complication, as AFAICT all the existing production usage of this limit is done solely to make semaphores act like locks, with a limit of 1. And there's no point to that when semaphores and mutexes share the same backend.

The solution is the same as for PI: semaphore limits are normally set to the field maximum and can't be changed. But when configured, you have access to the limit at the expense of performance. My guess is this feature will continue to be unused and we can drop it after a deprecation period.
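With the limit pinned at the field maximum, the saturation check stays on the lockless path. A sketch using Zephyr's real atomic_cas(), with the atom field names seen elsewhere in this PR; ATOM_VAL_MAX is a stand-in for the real field maximum:

#define ATOM_VAL_MAX 0x7fffffff		/* stand-in for the field maximum */

static bool sketch_saturating_give(k_zync_atom_t *a)
{
	k_zync_atom_t old, new;

	do {
		old.atomic = atomic_get(&a->atomic);
		if (old.val >= ATOM_VAL_MAX) {
			return false;	/* saturated: count unchanged */
		}
		new = old;
		new.val++;
	} while (!atomic_cas(&a->atomic, old.atomic, new.atomic));

	return true;
}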

k_poll

Again similarly, the existing k_poll API is hard to support with the fast userspace zync interface. So when it's enabled, semaphores need to be kernel-only. I think the right answer here is to rework poll to be zync-only. But for now it's in the same "it works, but without the performance advantages" category.

Futexes are being deprecated

The Futex API could (with some overhead -- a Zync atom isn't a bare uint32 for callers to mutate at will, which is what Futex wants) be implemented in terms of Zync, but I don't think it's worth the trouble. As mentioned above, the Zephyr futex implementation is incomplete (lacking support for WAKE_OP), uptake has been very poor, and Zync sits in exactly the spot in the API stack where Futex was always intended to be.

Likewise the futex-based sys_mutex and sys_sem are on the chopping block; they don't do anything that zyncs can't, and had even less uptake in the ecosystem.

(Again there's a little spin here. In fact these got broken in the current version of this PR and need to be fixed before merge.)

... so the defaults are less capable now?

Yes, pretty much. The default configurations of k_sem and k_mutex are now trimmer, faster, more easily specified utilities, better suited to typical users. Most apps don't need priority inheritance, recursive locks are almost always a bad idea, and IMHO semaphore value clamping was always just a mistake.

Really this whole project can be viewed as my sneaky way of side-channel-deprecating a bunch of features I don't think we should have put where we did in the first place. They're all still there, but you have to ask for them and pay the cost at the app level now.

@andyross andyross added the DNM This PR should not be merged (Do Not Merge) label Jul 27, 2022
@andyross
Contributor Author

This is what I was playing with over my vacation.

The PR isn't quite ready yet, being somewhere between RFC and beta right now. But all the core features are there and working and ready for review. There's lots of work left, but it's mostly low-design-risk at this point:

  • There remain a few glitches with kobj lifecycle management when USERSPACE=y

  • I broke sys_sem/sys_mutex and need to fix them (even if they're being deprecated)

  • Doxygen comments are present and IMHO reasonably clear, but I need to write an overview

  • As mentioned, it needs some optimization still.

  • I don't love the "zync_pair" trick I'm using for the wrappers, it's ended up being too big and doing too much of its work at runtime. I'm thinking it might be better to back it out and just have a single kconfig for "ALL_ZYNCS_IN_KERNEL" or whatever, given that most users would be expected to move to the new scheme rapidly.

  • I'm still banging test compliance into shape. I think all the hard bugs are behind me, but there are a surprising number of subsystems out there that wanted to whitebox semaphores, etc...

Basically, I'd expect to see this as a 3.3 feature rather than 3.2, as it may take a bit to settle.

Anyway, please review. I added all the normal suspects already, but please share far and wide as needed.

*
* @param atom_ptr Pointer to a k_zync_atom to modify
*/
#define K_ZYNC_ATOM_SET(atom_ptr) \
Contributor Author

It's worth pointing out that this API looks a little like a spinlock because that's basically what it is. Which opens the question of whether we should bring spinlocks into the fold and implement them in terms of a k_zync_atom_t, thus unifying ZYNC_VALIDATE and SPIN_VALIDATE. That seemed a bit too far for me in a first version, but something to consider.
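A sketch of the resemblance, with the usual caveats that the helper name is made up and the real k_spin_lock() also handles validation and arch specifics:

static void sketch_spin_lock(k_zync_atom_t *a)
{
	k_zync_atom_t unlocked = { .val = 0 };
	k_zync_atom_t locked = { .val = 1 };

	while (!atomic_cas(&a->atomic, unlocked.atomic, locked.atomic)) {
		/* spin until the atom flips from 0 to 1 */
	}
}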

@npitre

npitre commented Jul 27, 2022

Looks nice from a casual first read.

Maybe there could be a couple of k_zync() variants so as to remove those runtime conditions that are going to remain constant for the lifetime of a zync object. It still needs to be optimized, as you said.

> Bad name
>
> I like it. But I'm weird.

What about k_zusp? Looks goofy but that directly expands to "Zephyr
Universal Synchronization Primitive". And it should give pause to anyone
wanting to use this new interface just because it looks shiny otherwise.

@cfriedt
Member

cfriedt commented Jul 27, 2022

This looks great, and I'm looking forward to seeing some numbers.

One question: why use the zync name if they're basically futexes? Ok 2 questions: Wouldn't it be better to just call them futexes?

Looks fine to me, in any case.

@sagarshah-leg
Contributor

sagarshah-leg commented Nov 18, 2022

This might be too late, but instead of making these configurable features, would it be possible to have different interfaces/definitions for the different types of mutexes, semaphores, etc., so the user is not restricted to using only one kind throughout the whole code?

Also, won't using a mutex-without-priority-inheritance cause deadlock when used to control shared data access?

@desowin left a comment

Please don't allow mutexes in ISRs. The mutex API already carries legacy burden (dedicated return values for essentially flawed code, i.e. unlocking a mutex you don't own), and allowing mutexes to be used in ISRs will complicate things further.

#define TYPE 0xc

- static ZTEST_BMEM SYS_MUTEX_DEFINE(mutex);
+ K_MUTEX_DEFINE(mutex);

Why are you dropping the static and ZTEST_BMEM qualifiers?

* lock count is increased by 1.
*
- * Mutexes may not be locked in ISRs.
+ * Mutexes may be used in ISRs, though blocking is not possible and

I don't think mutexes should ever be allowed in ISRs, regardless of whether the underlying implementation can make it work, due to the very nature of mutexes. The big difference between a semaphore and a mutex is ownership (a semaphore has no ownership, a mutex does). Only the owner is ever allowed to release (give) a mutex (the current Zephyr behavior of returning an error code when unlocking a mutex you don't own is a really, really bad choice; it is an outright programmer error that should be caught with an assert). It is impossible to set an owner when in an ISR (and if the "special case" of locking in an ISR were supported, it would be an error for the ISR not to unlock the mutex before returning). Allowing mutexes in ISRs sounds like an unnecessary complication that encourages messy designs.

When you have proper code, then for sure you can optimize away the owner parameter, but that should really be an optimization, not the default behavior. Asserts for things that should never happen are really helpful not only during initial development but also during any troubleshooting (proper asserts can quickly reveal issues, saving a lot of time).

* the calling thread as many times as it was previously locked by that
* thread.
*
- * Mutexes may not be unlocked in ISRs, as mutexes must only be manipulated

I think the note about ISRs should remain here.

__ASSERT(Z_PAIR_ATOM(&mutex->zp)->val == 0, "mutex not locked");
#endif
#ifdef Z_ZYNC_ALWAYS_KERNEL
/* Synthesize "soft failure" return codes. Needed by current

In my opinion the "soft failures" are a very big mistake in the first place. Unfortunately they cannot be easily removed (because they are in the stable API), but I would really like them to go away. That is, unlocking a mutex you don't own is a very serious bug. The code that does so is essentially flawed. In my opinion unlocking a mutex you don't own is the same category of error as dereferencing a NULL pointer.

* body and must not have side effects.
*
* @param atom Pointer to a k_zync_atom to modify
*/

It might be just me, but I only understood the description after understanding what the macro does (it definitely took me quite a few minutes).

The key elements for me to understand how this works were:

  • done is of type k_zync_atom_t because the C for loop construct only allows initializing variables of one type. As far as I can tell, atomic_cas returns bool while done.atomic is atomic_t, but done.atomic is really used as a bool.
  • the code inside K_ZYNC_ATOM_SET(&atom) { /* loop body */ } should set the local new_atom variable to the value it would like to assign to the passed-in atom. While computing the new_atom value, the code should use old_atom, which holds what was read from atom.
  • the ternary conditional operator (?) is essentially used to update old_atom if someone else modified atom between the previous atomic_get and the atomic_cas, and to do nothing if the atomic_cas succeeded
  • the loop body can leave atom unchanged with break;

I hope the above description clears it up for anyone that is trying to make sense out of it. Unfortunately, I have no idea how to make the description clearer. I am not sure if posting equivalent code would make things easier to understand, i.e.:

K_ZYNC_ATOM_SET(&atom) {
    /* code that sets new_atom or breaks */
}

is equivalent to

/* declarations hoisted so the while condition can see them */
k_zync_atom_t old_atom, new_atom, done = { 0 }; /* done: syntactic sugar */

do {
    old_atom.atomic = atomic_get(&(atom)->atomic);
    new_atom = old_atom;
    /* code that sets new_atom or breaks */
} while (!(done.atomic = atomic_cas(&(atom)->atomic,
                                    old_atom.atomic, new_atom.atomic)));

};

- static K_MUTEX_DEFINE(fdtable_lock);
+ K_MUTEX_DEFINE(fdtable_lock);

Should this be K_MUTEX_STATIC_DEFINE instead?

K_CONDVAR_DEFINE(condvar);

- static int done;
+ static volatile int done;

Shouldn't this go to fixes PR?

K_KERNEL_STACK_ARRAY_DEFINE(cpuhold_stacks, MAX_NUM_CPUHOLD, CPUHOLD_STACK_SZ);

- static struct k_sem cpuhold_sem;
+ K_SEM_DEFINE(cpuhold_sem, 0, K_SEM_MAX_LIMIT);

Probably missed in last force-push, i.e. this should be K_SEM_STATIC_DEFINE()

CONFIG_ZTEST_NEW_API=y

# This exercises pathological k_mutex state deliberately, don't panic the kernel
CONFIG_ZYNC_VALIDATE=n

Missing newline

suspended/restarted, or having their intended atom count
"stolen" by a higher priority thread) where k_zync would
naturally return early with -EAGAIN. Optional; most code
should be prepared to handle -EAGAIN on its own, where the

Why? Are there other RTOSes where the caller must be prepared to handle something like -EAGAIN?

@galak
Contributor

galak commented Dec 13, 2022

Dev-Review (Dec 13th, 2022): @andyross still has some cleanups to finish. He thinks most of the issues have been or will be addressed this week, and will look at how to support the old methods so we can try to get this in for this release cycle.

@laurenmurphyx64
Contributor

Hi @andyross! :) Do you think this'll make it in for feature freeze next Friday (1/27)?

@github-actions

This pull request has been marked as stale because it has been open (more than) 60 days with no activity. Remove the stale label or add a comment saying that you would like to have the label removed, otherwise this pull request will automatically be closed in 14 days. Note that you can always re-open a closed pull request at any time.

@nordicjm
Contributor

@andyross any chance this'll make it in for 3.6? :)
