Replies: 2 comments
-
This is a great ask, and I appreciate the time you've put into explaining the details. Priority levels are definitely useful; the question is whether they belong in something as low-level as mirai. Most modern async runtimes (Tokio, Go, Node) don't support task prioritisation, favouring fairness and simplicity instead. I'd need strong motivation from a concrete, current use case (one that can be shown to affect a whole class of users) to consider this.
-
Keeping mirai simple is a commendable goal, and I understand the need for more than just one person to find it useful. If you believe an extension to be the best path, is there documentation on how to write one? Unless I'm mistaken, it would require compiled code and interfacing at a low level. To me, that seems like a reimplementation rather than just an extension. I'm hoping there could be callbacks out of the default dispatcher, or, if writing a dispatcher from scratch, a list of required calls/steps that the new dispatcher must support. For example, the DBI package has a set of methods that can/should be reimplemented (and have default functionality if they are not re-classed).
-
As an alternative to FIFO, is there utility in a "prioritized" FIFO? (Or ... add mirai(.priority=0, ...) as a default argument, and if never changed then behavior is identical to non-prioritized FIFO.)
Use Case
Some of my work in the past has been on an HPC system where I used 1000s of nodes. While I was not using mirai at the time, it would have been very useful there. One example where I find I need everywhere(.) is to load the local definition of a package using devtools::load_all(). While it would obviously be much easier if the package were stable/unchanging, there are frequent-enough times where a bug needs to be fixed quickly without having to bring down and restart the network.
Usually this is necessary in the middle of a run because a flaw has been discovered that needs to be patched asap; ideally it would be patched before the tasks indicated by info()["awaiting"] are dispatched for evaluation. I think there are a few ways this can be done:
Approach 1: stop_mirai()
Keeping track of all dispatched tasks, I can stop_mirai(...) all of them, call everywhere(load_all(..)), and then redefine the tasks. This can work, though it requires a little more overhead to remember how all of those tasks were originally defined. This is not unreasonable.
Approach 2: centralized sentinel variable
I might use redis or some other central store of "should reload" indicators that every task checks before doing its real work.
This doesn't implement a prioritized queue per se; it only allows for a rather simple "always check this first" pre-loading of one specific expression.
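To make the "always check this first" idea concrete, here is a minimal base-R sketch. It is only an illustration under stated assumptions: a temporary file stands in for redis or any other central store, and make_checked_task(), read_stamp(), and the reload callback are hypothetical names, not part of mirai.

```r
# Hypothetical sketch: a file stands in for redis or another central store.
# None of these helper names come from mirai.

# Read the current "should reload" stamp, or NA if none has been published.
read_stamp <- function(path) {
  if (!file.exists(path)) return(NA_character_)
  x <- readLines(path, n = 1L, warn = FALSE)
  if (length(x) == 0L) NA_character_ else x
}

# Build a task runner that checks the stamp before every task and calls
# `reload` once per new stamp value.
make_checked_task <- function(stamp_file, reload) {
  seen <- NA_character_              # last stamp this worker acted on
  function(task) {
    stamp <- read_stamp(stamp_file)
    if (!is.na(stamp) && !identical(stamp, seen)) {
      reload()                       # e.g. devtools::load_all(...) on a real worker
      seen <<- stamp
    }
    task()
  }
}

# Demonstration with a counter in place of a real reload.
stamp_file <- tempfile()
reloads <- 0L
run <- make_checked_task(stamp_file, reload = function() reloads <<- reloads + 1L)

r1 <- run(function() "a")            # no stamp published yet: no reload
writeLines("v2", stamp_file)         # a patch is flagged in the central store
r2 <- run(function() "b")            # reloads once, then runs the task
r3 <- run(function() "c")            # stamp unchanged: no further reload
```

Each worker would need its own runner so that every one of them sees the new stamp once; how that state is kept per worker is left open here.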
Approach 3: prioritized FIFO
If the fifo dispatcher had a simple prioritization, then it would be possible to preempt the awaiting tasks. For instance, while not working code, a priority-fifo might work as follows: tasks 1:10 are submitted at the default priority, and load_all(.) tasks are then submitted at a higher priority.
In this example, the first 5 tasks are running on the old code. As nodes free up from the first five tasks, the higher-priority load_all(.) tasks are pushed next. Once those are done, the remainder of the 1:10 tasks are pushed.
This notion is distinct from compute clusters, since I need the higher-priority tasks to have a side-effect (updated global environment) on all nodes.
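The ordering described above can be replayed with a small base-R simulation. This is a sketch of the queueing idea only, not mirai's dispatcher: new_queue(), push(), and pop() are hypothetical helpers, the task labels are placeholders, and treating a larger number as higher priority is an arbitrary assumption.

```r
# Hypothetical prioritized FIFO: NOT mirai's dispatcher and not a mirai API.
# Larger `priority` is served first (an arbitrary choice); ties are served
# in submission order.
new_queue <- function() {
  list(label = character(), priority = numeric(), seq = integer(), counter = 0L)
}

push <- function(q, label, priority = 0) {
  q$counter <- q$counter + 1L
  q$label <- c(q$label, label)
  q$priority <- c(q$priority, priority)
  q$seq <- c(q$seq, q$counter)
  q
}

pop <- function(q) {
  stopifnot(length(q$label) > 0L)
  # Sort by descending priority, then by ascending submission number.
  i <- order(-q$priority, q$seq)[1L]
  task <- q$label[i]
  q$label <- q$label[-i]
  q$priority <- q$priority[-i]
  q$seq <- q$seq[-i]
  list(queue = q, task = task)
}

# Replay the scenario: 10 tasks, 5 nodes, then 5 higher-priority reloads.
q <- new_queue()
for (i in 1:10) q <- push(q, paste0("task", i))

running <- character()
for (i in 1:5) {                     # five nodes each take one task immediately
  res <- pop(q)
  q <- res$queue
  running <- c(running, res$task)
}

for (i in 1:5) q <- push(q, paste0("reload", i), priority = 1)

remaining <- character()
while (length(q$label) > 0L) {       # order in which freed nodes receive work
  res <- pop(q)
  q <- res$queue
  remaining <- c(remaining, res$task)
}

print(running)
print(remaining)
```

The reload tasks jump ahead of task6 through task10 but do not interrupt the five tasks already running. Note that this only orders the queue; it does not by itself ensure that every node runs a reload exactly once.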
Design
Generalized, this supports the notion of more-important long-running tasks, whether it be for meta-tasks for side-effect (load_all("...")) or for other tasks that need to be finished sooner than everything currently waiting to be dispatched.
Within a given .priority level, everything is fifo.
(I have no strong opinion on whether "0" is the lowest or highest priority, if negatives are allowed, if priorities should be bounded, etc.)
Thoughts?