[Debugging] Push PolicyWorker into Router and leverage Services #61
src/forge/actors/policy.py
Outdated
log_stats=None,
)

async def setupWorker(self, config, guided_decoding, num_samples):
Bad Javascript developer
"MASTER_ADDR": str(get_loopback_ip()),
"MASTER_PORT": str(get_open_port()),
},
)
Hmm, we might want to keep the worker mesh's procs distinctly separate.
The reason is that downstream we will want something that can monitor proc health and re-spawn a proc if necessary: https://github.com/meta-pytorch/forge/blob/main/src/forge/controller/service.py#L948-L977
I think having the vLLM router handle this would be unnecessarily complex.
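To make the concern concrete, here is a minimal sketch of the kind of "check health, re-spawn if needed" pass that is easier to write when the procs are owned separately from the router. It is not the logic in the linked `service.py`; `FakeProc`, `monitor_once`, and `respawn_proc` are hypothetical names invented for this illustration, and a real implementation would poll actual process state rather than a boolean flag.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, List


@dataclass
class FakeProc:
    """Hypothetical stand-in for a worker proc; not a forge or vLLM type."""

    name: str
    healthy: bool = True


async def monitor_once(
    procs: List[FakeProc],
    respawn: Callable[[FakeProc], Awaitable[FakeProc]],
) -> List[str]:
    """Check every proc once and replace unhealthy ones in place.

    Returns the names of the procs that were re-spawned.
    """
    respawned = []
    for i, proc in enumerate(procs):
        if not proc.healthy:
            procs[i] = await respawn(proc)
            respawned.append(proc.name)
    return respawned


async def respawn_proc(old: FakeProc) -> FakeProc:
    """Assumed respawn hook: builds a fresh, healthy proc with the same name."""
    return FakeProc(name=old.name, healthy=True)


async def demo():
    procs = [FakeProc("worker-0"), FakeProc("worker-1", healthy=False)]
    respawned = await monitor_once(procs, respawn_proc)
    return respawned, [p.healthy for p in procs]


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Because the monitor only needs the list of procs and a respawn hook, it can live outside the router, which is the separation argued for above.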
Will wait to tackle this on top of the planned Service change.
4df9f8e to d4c83d3
* make dataset configurable
* add validation loop
* update config
* fix infinite loop, current_num_tokens
* add single pass to the parameters
* add single pass to param
* fix the hang issue
* minor: update error message
* move batch_to_device to utils and add support to blockmask
* remove comment
* clean
* fix validation backward thing for pp
* remove self.model
* add max_steps for validation to avoid hang
* remove infinite

* Add reward interface, math reward, unit tests
* move test files to rl folder
* add thinking reward

* initial commit for replica
* clean up
* phase out service for service v2
* remove v2
* remove v2 from spawn
* more minor cleanups
* remove comment
* remove comment
* simplify and unify replica initialization
* address comments
* address comments
* add capacity semaphore
* f-strings
* remove redundant health set
---------
Co-authored-by: Allen Wang <[email protected]>

* Add reward interface, math reward, unit tests
* refactor rewards: merge into one file
* remove file accidentally had

… files (#69)
* initial commit for replica
* clean up
* phase out service for service v2
* remove v2
* remove v2 from spawn
* more minor cleanups
* remove comment
* remove comment
* initial commit of ServiceEndpoint
* tests work
* simplify and unify replica initialization
* stop the underlying service proc
* split out components into their own files
* address comments
* address comments
* add capacity semaphore
* rebasing changes
* fix test
* logger changes
* fix sess_id kwarg
* makes _call its own implementation
* docstring fix
* add comment on serviceinterface
---------
Co-authored-by: Allen Wang <[email protected]>
…e into refactor-policy-router
History on this PR is borked. See #70.
Refactors:
* PolicyRouter -> Policy
* Policy -> PolicyWorker
Pushed Mesh and Worker setup into Policy (formerly PolicyRouter).
Note: Follow-up PRs will be needed to refactor this to use @allenwang28's updated Service when it is ready.