[Debugging] Push PolicyWorker into Router and leverage Services #61

Jack-Khuu · 2025-08-20T18:41:54Z

Refactors

PolicyRouter -> Policy
Policy -> PolicyWorker

Pushed Mesh and Worker setup into Policy (formerly the Router)

Note: Follow-up PRs will be needed to refactor this to use @allenwang28's updated Service once it is ready

python src/forge/actors/policy.py

joecummings · 2025-08-20T18:42:53Z

src/forge/actors/policy.py

            log_stats=None,
        )

+    async def setupWorker(self, config, guided_decoding, num_samples):


Bad Javascript developer
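The remark above appears to target the camelCase method name; Python convention (PEP 8) uses snake_case for functions and methods. A minimal sketch of the rename, assuming the signature stays as in the diff. The class body is a hypothetical stub for illustration, not the PR's code:

```python
import asyncio


class Policy:
    """Hypothetical stub; only the method name is the point here."""

    async def setup_worker(self, config, guided_decoding, num_samples):
        # Placeholder body (assumption): echo the arguments back so the
        # stub is runnable. The PR's method presumably sets up the worker.
        return {
            "config": config,
            "guided_decoding": guided_decoding,
            "num_samples": num_samples,
        }


result = asyncio.run(Policy().setup_worker({}, None, 1))
```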

allenwang28 · 2025-08-20T19:20:19Z

src/forge/actors/policy.py

+                "MASTER_ADDR": str(get_loopback_ip()),
+                "MASTER_PORT": str(get_open_port()),
+            },
+        )


hmm, we might want to keep the worker mesh's procs distinctly separate...

The reason is that downstream, we will want something that can monitor the proc health and re-spawn it if necessary: https://github.com/meta-pytorch/forge/blob/main/src/forge/controller/service.py#L948-L977

I think having the vLLM router handle this would be unnecessarily complex
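To make the monitor-and-respawn idea concrete, here is a rough sketch of one health-check pass over a set of worker procs. It is not the code in the linked `service.py`; `Replica`, `monitor_once`, and `fake_respawn` are hypothetical names, and the respawn step is a placeholder for tearing down and re-creating a proc:

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, List


@dataclass
class Replica:
    """Hypothetical stand-in for one worker proc tracked by a service."""

    idx: int
    healthy: bool = True
    restarts: int = 0


async def monitor_once(
    replicas: List[Replica],
    respawn: Callable[[Replica], Awaitable[None]],
) -> List[int]:
    """Check every replica once and respawn the unhealthy ones.

    Returns the indices of the replicas that were respawned.
    """
    respawned = []
    for replica in replicas:
        if not replica.healthy:
            await respawn(replica)
            respawned.append(replica.idx)
    return respawned


async def fake_respawn(replica: Replica) -> None:
    # Placeholder (assumption): a real implementation would stop the old
    # proc and spawn a new one before marking the replica healthy again.
    await asyncio.sleep(0)
    replica.healthy = True
    replica.restarts += 1


async def demo():
    replicas = [Replica(0), Replica(1, healthy=False), Replica(2)]
    respawned = await monitor_once(replicas, fake_respawn)
    return respawned, replicas


respawned, replicas = asyncio.run(demo())
```

Keeping each worker's procs separate means a loop like this can restart one failed proc without the router needing any recovery logic of its own.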

Will wait to tackle this on top of the planned Service change

* make dataset configurable
* add validation loop
* update config
* fix infinite loop, current_num_tokens
* add single pass to the parameters
* add single pass to param
* fix the hang issue
* minor: update error message
* move batch_to_device to utils and add support to blockmask
* remove comment
* clean
* fix validation backward thing for pp
* remove self.model
* add max_steps for validation to avoid hang
* remove infinite


* initial commit for replica
* clean up
* phase out service for service v2
* remove v2
* remove v2 from spawn
* more minor cleanups
* remove comment
* remove comment
* simplify and unify replica initialization
* address comments
* address comments
* add capacity semaphore
* f-strings
* remove redundant health set

---------

Co-authored-by: Allen Wang <[email protected]>


… files (#69)

* initial commit for replica
* clean up
* phase out service for service v2
* remove v2
* remove v2 from spawn
* more minor cleanups
* remove comment
* remove comment
* initial commit of ServiceEndpoint
* tests work
* simplify and unify replica initialization
* stop the underlying service proc
* split out components into their own files
* address comments
* address comments
* add capacity semaphore
* rebasing changes
* fix test
* logger changes
* fix sess_id kwarg
* makes _call its own implementation
* docstring fix
* add comment on serviceinterface

---------

Co-authored-by: Allen Wang <[email protected]>


Jack-Khuu · 2025-08-25T23:53:43Z

History on this PR is borked

See #70

Jack-Khuu requested review from allenwang28, joecummings and pbontrager August 20, 2025 18:41

meta-cla bot added the CLA Signed This label is managed by the Meta Open Source bot. label Aug 20, 2025

joecummings reviewed Aug 20, 2025


allenwang28 reviewed Aug 20, 2025


Jack-Khuu added 2 commits August 21, 2025 11:01

Pushing Policy Worker:rough

88358a9

Partial paths for sapwn_actor and spawn service

d4c83d3

Jack-Khuu force-pushed the refactor-policy-router branch from 4df9f8e to d4c83d3 Compare August 21, 2025 18:03

Commenting sections to update post Service Refactor

8d67786

Jack-Khuu changed the title ~~[WIP] Push Policy Worker into Router~~ Push Policy Worker into Router Aug 21, 2025

Jack-Khuu requested a review from ebsmothers August 21, 2025 21:02

DNXie and others added 13 commits August 21, 2025 18:08

Add math/thinking reward (#64)

db60329

* Add reward interface, math reward, unit tests
* move test files to rl folder
* add thinking reward

Use positional actor_def to avoid actor_args being considered also actor_def (#67)

38bcad5

Merge rewards files (#68)

1a7dd9a

* Add reward interface, math reward, unit tests
* refactor rewards: merge into one file
* remove file accidentally had

Pushing Policy Worker:rough

922b492

Partial paths for sapwn_actor and spawn service

ca9ba05

Commenting sections to update post Service Refactor

cf520aa

Pushing Policy Worker:rough

47f3a1d

Partial paths for sapwn_actor and spawn service

b96cbfe

Merge branch 'refactor-policy-router' of github.com:meta-pytorch/forge into refactor-policy-router

696785b

Debugging Service Path

4d37007

Jack-Khuu changed the title ~~Push Policy Worker into Router~~ [Debugging] Push PolicyWorker into Router and leverage Services Aug 25, 2025

Jack-Khuu mentioned this pull request Aug 25, 2025

Leverage Services for Policy + Rename PolicyRouter #70

Merged

Jack-Khuu closed this Aug 25, 2025

Jack-Khuu deleted the refactor-policy-router branch August 25, 2025 23:53
