-
Notifications
You must be signed in to change notification settings - Fork 16
Change services to actors except Policy and drop apps/rl #231
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Would love some clarification on service vs policy. |
I am going to assume you mean service vs actor! It's a good question. For context, the main reason we have services in the first place is to exactly handle load balancing and fault tolerance. These are things which Monarch doesn't give you out of the box, but gives you the capabilities to implement. We surely want this for something like vLLM, or if we want the ability to spin up and load balance across multiple execution environments in-band. We initially had everything just be services for simplicity, but in reality not everything needs to be a service. For replay buffer, trainer, etc., you don't need the routing capabilities. Additionally, the fault tolerance story for those are not as well defined as that of the policy/environments/reference model. Since the world will also learn about Monarch, I think the layering is clearer this way - Actors are the base capabilities from Monarch, Service is a distinct abstraction that's built on top of it. |
procs: 1 | ||
num_replicas: 1 | ||
with_gpus: false | ||
ref_model: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
actually I was wrong - I think we only have trainer and replay buffer be actors, the rest are ok to keep as services
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It makes sense. Why are dataset
, compute_advantages
, and reward_actor
also services? Do they need replicas?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If num_replica=1
, what's the difference between actor and service?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry Danning, Dataset is an actor, compute_advantages is an actor, reward_actor is a service
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm let's just use call_one()
wherever appropriate
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm let's just use call_one()
wherever appropriate
import torch | ||
import torch.nn.functional as F | ||
import torchstore as ts | ||
from forge.actors._torchstore_utils import get_param_key |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed the typo here. CC @casteryh
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks!
Within all of our applications, everything is currently a Service. For example, in apps/grpo/main, trainer and replay_buffer should be actors.
Summary:
Changes:
apps/grpo
,apps/toy_rl
(with dcp off)Policy
to takeuse_dcp
from config.call_one
instead ofchoose
. The difference is thatcall_one
makes sure the caller is a singleton.Test
I didn't run
since the config is outdated. Looks like this one is already deprecated.
cc @Ritesh1905