feat: New Omni Booter #36

Open
bgroupe wants to merge 7 commits into main from omni/reorg

@bgroupe bgroupe commented May 16, 2025

This PR adds:

  1. A tweaked omni_boot.py designed to be run with the Ray Jobs CLI, e.g. ray job submit --runtime-env omni_env.yaml --submission-id omni01 -- python omni_boot.py
  2. A dataclass that wraps the jetstream data to conform to the shape the processors expect
  3. New actor config to make the actors immortal
  4. New settings classes to control the behavior of the booter, the SQS consumer and the streamer
  5. New utils for reconciling actors across a Ray Cluster
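
As an illustration of item 3, "immortal" actors could be expressed with standard Ray actor options. This is a minimal sketch, not the PR's actual config: the helper name `immortal_actor_options` and the actor name `dispatcher-0` are hypothetical, while `name`, `namespace`, `lifetime`, `max_restarts`, `max_task_retries`, and `get_if_exists` are existing Ray actor options. The default namespace of `main` follows the review comment further down.

```python
from typing import Any, Dict


def immortal_actor_options(name: str, namespace: str = "main") -> Dict[str, Any]:
    """Build keyword arguments for a Ray actor's .options() call (hypothetical helper)."""
    return {
        "name": name,
        "namespace": namespace,
        "lifetime": "detached",   # actor outlives the job that created it
        "max_restarts": -1,       # restart indefinitely if the actor process dies
        "max_task_retries": -1,   # keep retrying actor tasks that fail on a crash
        "get_if_exists": True,    # reuse an existing actor with this name
    }


opts = immortal_actor_options("dispatcher-0")
# With Ray installed and a cluster running, this would be applied as:
#   handle = SomeActor.options(**opts).remote()
```

A detached actor is not cleaned up when the submitting job exits, so it has to be killed explicitly when it is no longer wanted.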

app/ray/utils.py Outdated
if matching_actors:
    return matching_actors
elif fail_hard:
    raise Exception(f"No matching actors matching prefix {prefix} found in any namespace")

@DGaffney I noticed there is a strict actor dependency tree that is not currently enforced, i.e. you can start the actors out of order and they will boot, but they take empty lists in their constructors and will be DOA the minute they try to use the dispatcher. I added this flag to throw an exception when the list comes back empty. I also expanded the search to include all namespaces, since you can now set the namespace as an env var (defaults to main). This would be useful if you wanted to run heterogeneous actors, e.g. for A/B experiments.
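
The behavior described above can be sketched as a pure filtering step. This is an assumption-laden illustration rather than the code in app/ray/utils.py: the function name `select_actors` and the sample actor names are hypothetical, and the input is assumed to have the shape that `ray.util.list_named_actors(all_namespaces=True)` returns (a list of dicts with `name` and `namespace` keys).

```python
from typing import Dict, List


def select_actors(
    listing: List[Dict[str, str]], prefix: str, fail_hard: bool = False
) -> List[Dict[str, str]]:
    """Return entries whose name starts with prefix; optionally raise if none match."""
    matching = [a for a in listing if a["name"].startswith(prefix)]
    if not matching and fail_hard:
        # Fail at boot instead of handing an empty list to a dependent actor.
        raise RuntimeError(f"No actors matching prefix {prefix!r} found in any namespace")
    return matching


# Sample data standing in for ray.util.list_named_actors(all_namespaces=True).
listing = [
    {"name": "dispatcher-0", "namespace": "main"},
    {"name": "dispatcher-1", "namespace": "experiment-b"},
    {"name": "streamer-0", "namespace": "main"},
]
found = select_actors(listing, "dispatcher", fail_hard=True)
```

Each match keeps its namespace, so a caller could resolve a handle with `ray.get_actor(name, namespace=namespace)` for actors running outside the default namespace.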

logger.error("Failed to process message: %s", e)
sentry_sdk.capture_exception(e)
if settings.stream_debug:
    logger.error(traceback.print_exc())

This is pretty helpful since it is the parent rescue in a long chain of async child tasks, but I'm not married to leaving it here if we want to remove it.
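
One detail worth checking in the debug branch: `traceback.print_exc()` writes the traceback to stderr and returns `None`, so `logger.error(traceback.print_exc())` logs the string `None` rather than the traceback. A minimal sketch of one alternative, using `traceback.format_exc()`, follows. The logger name, the in-memory handler, and the `stream_debug` variable are stand-ins for illustration, not the PR's settings object.

```python
import io
import logging
import traceback

logger = logging.getLogger("stream_debug_demo")
buffer = io.StringIO()                      # captures log output for inspection
logger.addHandler(logging.StreamHandler(buffer))
logger.setLevel(logging.ERROR)
logger.propagate = False

stream_debug = True  # stands in for settings.stream_debug

try:
    raise ValueError("bad message")
except Exception as e:
    logger.error("Failed to process message: %s", e)
    if stream_debug:
        # format_exc() returns the traceback as a string, so it reaches the logger.
        logger.error(traceback.format_exc())

logged = buffer.getvalue()
```

`logger.exception(...)` or `logger.error(..., exc_info=True)` would attach the same traceback in a single call.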

@bgroupe bgroupe requested a review from DGaffney May 16, 2025 04:27
@bgroupe bgroupe May 16, 2025

@DGaffney I actually have a submodule with just this jetstream and a slimmed down build, since it doesn't need any of the heavy grazer dependencies. But I haven't pushed it yet because I was curious if this would get replaced at some point by jetstream-turbo?
