Commit 8f75f99
[monarch] The root client is just a PythonActor
This diff makes the root client actor just another `PythonActor`.
# Why?
Right now the monarch codebase is peppered with special handling to distinguish normal python actors from the root client "actor". The root client has type `()` and is actually just a detached `Instance` with no actor loop; it therefore has no message handlers and can't even process supervision events. As a result, we have to wrap the current context's instance in a special `ContextInstance` enum, and everywhere we want to use it, we either have to use the `instance_dispatch!` macro or insert code that looks like:
```rust
match instance {
    ContextInstance::PythonActor(ins) => { /* do something */ }
    ContextInstance::Client(ins) => { /* do something else */ }
}
```
This makes the code more error-prone and harder to understand. On top of that, the client handling is often not idiomatic w.r.t. hyperactor, because the client has no message handlers or actor loop. Some examples:
- [Confusing supervision handling where `owner` might not be defined but `is_owned` is still true and so we need to call into a special `unhandled` function instead of continuing to propagate up the hierarchy](https://fburl.com/code/andy3ggr)
- [The root client can't have child actors due to no supervision event handling, so they have to be spawned directly on the root client proc, and even then, there is no way for the supervision event to reach `monarch.actor.unhandled_fault_hook`](https://fburl.com/code/kqd2iwvc)
- [The root client handles undeliverable messages via a bespoke tokio task/thread](https://fburl.com/code/jjgfy5d5)
Making the root client a normal python actor solves these problems, because:
- We don't need a `ContextInstance` enum anymore -- `PyInstance` *always* contains `Instance<PythonActor>`.
- Supervision events follow a unified path as they bubble up through the hierarchy, and *every* unhandled event reaches `RootClientActor.__supervise__`, defined in python, without special handling.
- The root client can handle undeliverable messages using `RootClientActor._handle_undeliverable_message`, defined in python, without special handling.
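
To make the first point concrete, here is a minimal, self-contained sketch of the before/after shape. The types below are toy stand-ins that only borrow the names `Instance`, `PythonActor`, `ContextInstance` and `PyInstance` from this diff; the `name` field and the `describe_*` functions are hypothetical and exist only for illustration.

```rust
use std::marker::PhantomData;

// Toy stand-ins for the real hyperactor/monarch types; only the shapes matter.
struct PythonActor;

struct Instance<A> {
    name: String, // hypothetical field, for illustration only
    _actor: PhantomData<A>,
}

impl<A> Instance<A> {
    fn new(name: &str) -> Self {
        Instance { name: name.to_string(), _actor: PhantomData }
    }
}

// Before: the root client is an `Instance<()>`, so every use site must branch.
enum ContextInstance {
    PythonActor(Instance<PythonActor>),
    Client(Instance<()>),
}

fn describe_before(instance: &ContextInstance) -> String {
    match instance {
        ContextInstance::PythonActor(ins) => format!("actor {}", ins.name),
        ContextInstance::Client(ins) => format!("client {}", ins.name),
    }
}

// After: the root client is a `PythonActor` too, so there is one type and one code path.
struct PyInstance {
    inner: Instance<PythonActor>,
}

fn describe_after(instance: &PyInstance) -> String {
    format!("actor {}", instance.inner.name)
}
```

In the "after" shape there is nothing left to match on, which is why the `instance_dispatch!` call sites can be cleaned up mechanically.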
# Navigating the code changes (guide for reviewers)
There are a lot of file changes here but only some of them are important. I would recommend reviewing them in the following order:
- `monarch/_src/actor/actor_mesh.py`
  - Defines the `RootClientActor` python class and its behavior.
- `hyperactor/src/proc.rs`
  - Introduces `Proc::actor_instance::<A>(...)`, which returns a detached `A`-typed actor instance/handle, along with its supervision receiver, signal receiver and message receiver.
- `monarch_hyperactor/src/actor.rs`
  - Introduces `PythonActor::bootstrap_client()`, which replaces `global_root_client()` in the root client context. This function starts the root client proc, spawns the `RootClientActor`, starts its actor loop and returns the `Instance<PythonActor>`.
  - The root client actor can now handle `SupervisionFailureMessage` just like every other actor in the hierarchy.
  - Implements `PythonActor::handle_supervision_event` to pass the event to the actor's `SupervisionFailureMessage` handler. This way, **every unhandled supervision event in the system makes its way to `RootClientActor.__supervise__` eventually**.
- `monarch_hyperactor/src/v1/actor_mesh.rs`
  - Deletes the special handling from the actor states monitor like `is_owned` and the explicit `unhandled_fault_hook` call. If `owner` is defined, it forwards the `SupervisionFailureMessage`, or else it does nothing.
  - Fixes (what I think was) a bug in `send_state_change`. A supervision event should only be forwarded as `SupervisionFailureMessage` to `owner` if it represents a failure. With the logic before this diff, stopping an actor mesh from inside an actor endpoint would generate a supervision event that reaches `unhandled_fault_hook` and crashes the root process even if it was a healthy stop.
- `monarch_hyperactor/src/context.rs`
  - Deletes `ContextInstance` and replaces it in `PyInstance` with `Instance<PythonActor>`.
- The rest of the changes are pretty much just cleaning up `instance_dispatch!` calls.
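
The forwarding rule described for `send_state_change` and the "everything unhandled ends up at the root" behavior can be sketched with a toy model. This is not the monarch implementation: `ToyActor`, `Hierarchy`, `handles_failures`, `root_received` and the convention that index 0 is the root client are all assumptions made for this example.

```rust
// Toy model of an actor hierarchy; index 0 plays the role of the root client.
#[derive(Clone, Debug, PartialEq)]
struct SupervisionEvent {
    actor: String,
    is_failure: bool,
}

struct ToyActor {
    owner: Option<usize>,   // index of the owning actor, if any
    handles_failures: bool, // stand-in for a user-defined handler that consumes the event
}

struct Hierarchy {
    actors: Vec<ToyActor>,
    root_received: Vec<SupervisionEvent>, // what the root's supervise handler has seen
}

impl Hierarchy {
    // Forward an event to `owner` only if it is a failure and an owner is defined,
    // then keep bubbling up until some actor handles it or the root is reached.
    fn send_state_change(&mut self, owner: Option<usize>, event: SupervisionEvent) {
        if !event.is_failure {
            return; // a healthy stop is not a failure, so nothing is forwarded
        }
        let mut current = owner;
        while let Some(idx) = current {
            if idx == 0 {
                // The root client is an ordinary actor in this model; it just records the event.
                self.root_received.push(event);
                return;
            }
            if self.actors[idx].handles_failures {
                return; // handled lower in the hierarchy
            }
            current = self.actors[idx].owner;
        }
        // No owner defined: do nothing.
    }
}
```

With a three-level hierarchy (root, child, grandchild) in which nobody handles failures, a healthy stop reported to the grandchild never reaches the root, while a failure reported to the same actor does.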
Differential Revision: [D87296357](https://our.internmc.facebook.com/intern/diff/D87296357/)
**NOTE FOR REVIEWERS**: This PR has internal Meta-specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D87296357/)!
ghstack-source-id: 325421344
Pull Request resolved: #19851

parent 240ebec, commit 8f75f99
23 files changed (+557, -638 lines), in:
- hyperactor_mesh/src
- hyperactor/src
- monarch_extension/src
- monarch_hyperactor/src
  - code_sync
  - v1
- monarch_rdma/extension
- python
  - monarch/_src/actor
  - tests