-
Notifications
You must be signed in to change notification settings - Fork 16
Adds support for custom replicas, updates policy to spawn correctly within spawn_service
#96
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR adds support for custom replicas by moving process and actor lifecycle management from Replica
into ForgeActor
as @classmethod
methods. It also updates Policy
to use this new pattern and implement correct spawning within the service framework.
- Moves replica management functionality from
Replica
class toForgeActor
aslaunch()
andshutdown()
class methods - Updates
Policy
to implement custom launch/shutdown logic that manages multiple process meshes and actor types - Modifies service spawn/shutdown to use
ForgeActor
pattern and removes positional arguments support
Reviewed Changes
Copilot reviewed 14 out of 14 changed files in this pull request and generated 3 comments.
Show a summary per file
File | Description |
---|---|
src/forge/controller/actor.py | Adds launch() and shutdown() class methods to ForgeActor for managing actor lifecycle |
src/forge/controller/service/replica.py | Removes proc mesh management and delegates to ForgeActor.launch() /shutdown() |
src/forge/controller/service/spawn.py | Adds type validation and shutdown_service() function, removes positional args |
src/forge/actors/policy.py | Implements custom launch() /shutdown() to manage multiple proc meshes |
tests/unit_tests/test_service.py | Updates test class and shutdown calls to use new service patterns |
apps/vllm/main.py | Updates to use new service context manager and shutdown function |
apps/grpo/main.py | Updates DatasetActor constructor and uses concurrent service spawning/shutdown |
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
Co-authored-by: Copilot <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Few questions, but overall looks good.
return policy_config, service_config | ||
|
||
|
||
async def run_vllm(service_config: ServiceConfig, config: PolicyConfig, prompt: str): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't be delete this now that we have the vllm app?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm I'm not sure I'm following, we still need a ServiceConfig for spawning a service here regardless?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it out of scope to move the vllm processing loop within the policy instead of having the start that in main.py?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
? yeah it should have been moved in this PR
|
||
|
||
class GpuManager(ForgeActor): | ||
class GpuManager(Actor): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why don't we want this to be a ForgeActor?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Circular import since proc_mesh.py
wants to use GpuManager
, and ForgeActor
uses proc_mesh
:/ can be re-arranged later
|
||
async def spawn_service( | ||
service_cfg: ServiceConfig, actor_def: Type[Actor], *actor_args, **actor_kwargs | ||
service_cfg: ServiceConfig, actor_def: Type[ForgeActor], **actor_kwargs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm there will be things that likely need *args, not just **kwargs - is there a long term plan to make that possible?
self._run_task: asyncio.Task | None = None | ||
self._policy_proc: ProcMesh | None = None | ||
self._worker_procs: ProcMesh | None = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice
dataloader, | ||
policy, | ||
trainer, | ||
replay_buffer, | ||
compute_advantages, | ||
ref_model, | ||
reward_actor, | ||
) = await asyncio.gather( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice optimization
Will note that this is harder to read and more error prone if the order gets changed or services list mutated. Not sure if there's a way to get our cake and eat it too
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm we could do something like:
dataloader_task = spawn_service(...) # don't await yet
policy_task = spawn_service(...)
then do the bulk await at the end:
dataloader, policy = await asyncio.gather(...)
but I'm not sure if it fully solves the problem
This PR does a few things:
Replica
'screate_proc_mesh
,spawn_actors
, andstop
functionality intoForgeActor
as@classmethod
forlaunch
andshutdown
. Why?Policy
, which spawns multiple proc meshes and actor typesForgeActor
for everything we expect to be spawned as a servicePolicy
to pick up these changes**kwargs
,launch()
doesn't play well with*args
and I couldn't figure out the exact right way to make args work correctly. Thereforespawn_service
really only acceptskwargs
at the moment.with policy.session
context manager and generally QoL updates