@mariusae mariusae commented Oct 3, 2025

Stack from ghstack (oldest at bottom):

Make proc and actor spawning subject to a liveness parameter: we expect the spawn to either complete or else yield updates regularly (configured by the new parameters `PROC_SPAWN_MAX_IDLE` and `ACTOR_SPAWN_MAX_IDLE`). This is a bridge solution to provide good error messaging and to prevent applications from halting. Once we have owned host meshes with their own comm actors, we can use the general-purpose liveness mechanism for this.

We also clean up the error messages, refactor the status fence, and print the status of each individual proc on failure.

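Differential Revision: [D83882826](https://our.internmc.facebook.com/intern/diff/D83882826/)

**NOTE FOR REVIEWERS**: This PR has internal Meta-specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D83882826/)!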
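
For readers who want a concrete picture of the idle-based liveness check described above, here is a minimal sketch using only the Rust standard library. It is not the code from this PR: `Status`, `SpawnStalled`, and `await_spawn` are hypothetical names, and the `max_idle` argument merely stands in for a value such as `PROC_SPAWN_MAX_IDLE`. The point it illustrates is that the deadline is an idle deadline (each status update restarts the window) rather than a total spawn timeout, and that the failure carries the last known status of every proc.

```rust
use std::collections::BTreeMap;
use std::sync::mpsc::{self, Receiver, RecvTimeoutError};
use std::time::Duration;

/// Hypothetical per-proc status; not the actual type used in this PR.
#[derive(Debug, Clone, PartialEq)]
enum Status {
    Pending,
    Starting,
    Running,
}

/// Hypothetical error carrying the last known status of every proc.
#[derive(Debug)]
struct SpawnStalled {
    max_idle: Duration,
    statuses: BTreeMap<usize, Status>,
}

/// Wait until all `n` procs report `Running`.
/// Every received update restarts the `max_idle` window, so a slow spawn
/// that keeps reporting progress is not treated as a failure.
fn await_spawn(
    updates: &Receiver<(usize, Status)>,
    n: usize,
    max_idle: Duration,
) -> Result<BTreeMap<usize, Status>, SpawnStalled> {
    let mut statuses: BTreeMap<usize, Status> =
        (0..n).map(|rank| (rank, Status::Pending)).collect();
    while statuses.values().any(|s| *s != Status::Running) {
        match updates.recv_timeout(max_idle) {
            Ok((rank, status)) => {
                statuses.insert(rank, status);
            }
            // In this sketch a closed channel is treated like a stall.
            Err(RecvTimeoutError::Timeout) | Err(RecvTimeoutError::Disconnected) => {
                return Err(SpawnStalled { max_idle, statuses });
            }
        }
    }
    Ok(statuses)
}

fn main() {
    let (tx, rx) = mpsc::channel();
    // Proc 0 comes up; proc 1 only reaches Starting and then goes quiet.
    tx.send((0, Status::Running)).unwrap();
    tx.send((1, Status::Starting)).unwrap();
    match await_spawn(&rx, 2, Duration::from_millis(50)) {
        Ok(_) => println!("all procs running"),
        Err(e) => {
            eprintln!("spawn made no progress for {:?}", e.max_idle);
            // Report the status of each individual proc on failure.
            for (rank, status) in &e.statuses {
                eprintln!("  proc {rank}: {status:?}");
            }
        }
    }
}
```

In this sketch the caller sees which procs were still pending or starting when the window expired, which is the kind of per-proc reporting the description mentions.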

mariusae added a commit that referenced this pull request Oct 3, 2025
ghstack-source-id: 314019055
Pull Request resolved: #1426
meta-cla bot added the CLA Signed label Oct 3, 2025
mariusae added a commit that referenced this pull request Oct 3, 2025
Pull Request resolved: #1426

ghstack-source-id: 314021969
@exported-using-ghexport

mariusae added a commit that referenced this pull request Oct 5, 2025
Pull Request resolved: #1426

ghstack-source-id: 314156438
@exported-using-ghexport

meta-codesync bot commented Oct 8, 2025

This pull request has been merged in 90c6202.

Labels

CLA Signed, fb-exported, Merged, meta-exported

2 participants