-
Notifications
You must be signed in to change notification settings - Fork 18
Refactor service spawning: add ForgeActor.options().as_service() API #153
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@allenwang28 I removed And the way how |
src/forge/controller/actor.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we remove the ConfiguredService
piece altogether? options()
should return a type["ForgeActor"]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The main reason we have the ConfiguredService
wrapper is to allow direct access to actor endpoints via instance attributes (e.g., service.value.choose()) while still carrying the ServiceConfig
on the class.
If we remove the wrapper and make options()
return a subclass of ForgeActor
directly, the endpoints (like counter.value
) are only accessible through the underlying ServiceInterface. So we need to change the statement
await service.value.choose()
to
await service._service_interface.value.choose()
I attempted to delegate EndpointProperty
access via __getattr__
like this
def __getattr__(self, item):
if self._service_interface is None:
raise AttributeError(f"Service not started yet; cannot access '{item}'")
attr = getattr(self._service_interface, item)
from monarch._src.actor.endpoint import EndpointProperty
if isinstance(attr, EndpointProperty):
# Call the descriptor's __get__ to bind it
return attr.__get__(self._service_interface, type(self._service_interface))
return attr
However, it didn’t fully work: Python descriptors like EndpointProperty
only bind correctly when accessed through the class or a properly initialized ServiceInterface instance. The __get__
call in __getattr__
does not fully replicate the descriptor binding behavior.
I’m still getting familiar with the internals of Monarch and some of the Python descriptor mechanics, so I'd love to hear any suggestions if there’s a cleaner way to handle this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm, EndpointProperty
binding is only needed after we do as_service()
right?
this is how I imagine the options
piece:
class ForgeActor(Actor):
_service_config: ServiceConfig | None = None
@classmethod
def options(
cls,
*,
service_config: ServiceConfig | None = None,
num_replicas: int | None = None,
procs_per_replica: int | None = None,
**service_kwargs,
) -> Type["ForgeActor"]:
if service_config:
config = service_config
else:
config = ServiceConfig(num_replicas=num_replicas, procs_per_replica=procs_per_replica)
return type(
f"{cls.__name__}",
(cls,),
{"_service_config": config}
)```
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
EndpointProperty binding is only needed after we do as_service() right?
Yes. It is needed whenever we call an endpoint function.
Your proposed implementation seems reasonable. However, it does not handle endpoint binding. After as_service()
, you still need to go through service._service_interface
to access endpoints.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As we discussed offline, if as_service
returns a ServiceInterface
directly, we cannot terminate the service with
service.shutdown()
because the returned object (service
) is just a ServiceInterface
. In that case, we’d have to fall back to
shutdown_service(service)
Personally, I prefer the service.shutdown()
style since it feels more natural and object-oriented, but I’m okay with either one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can add def shutdown(self)
directly to ServiceInterface
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done!
Co-authored-by: Allen Wang <[email protected]>
yeah please leave it for now, I'm gonna clean this all up once HostMesh lands 😅 |
src/forge/controller/actor.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is fine, just curious if this is a circular dependency issue?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. There would be a circular dependency issue
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
beautiful, thank you! Can you message the Forge devs group about the updates? There are a few PRs in flight that will have to rebase, please let them know what changed and how to rebase
This PR refactors how services are spawned for
ForgeActor
instances, providing a more actor-centric and user-friendly API. It is based on the discussion in #133Before:
After:
Option 1: pass service configs
Option 2: pass ServiceConfig object directly
Option 3: use the default configuration
To shutdown:
Changes introduced:
ForgeActor.options()
class method, which returns a pre-configuredForgeActor
subclass.ForgeActor.as_service()
to spawn the service asynchronously and return aServiceInterface
.ServiceInterface.shutdown()
to allow shutting down the underlying service directly.tests/unit_tests/test_service.py
andtests/integration_tests/test_policy_update.py
.tests/unit_tests/test_service.py
to verify all three construction options (explicitServiceConfig
, implicit kwargs, and defaults) work correctly.spawn_service()
andshutdown_service()
.Test: