@DNXie DNXie commented Sep 2, 2025

  • Support vllm/main taking configs from a YAML file
  • Support grpo/main taking configs from a YAML file (removed from this PR; moved to "Add YAML config file for grpo.main", #141)
  • Remove PolicyConfig
  • Make WorkerConfig inherit from EngineArgs and add from_dict to WorkerConfig
  • Rename WorkerConfig to EngineConfig
  • Rename SamplingOverrides to SamplingConfig
  • Update Policy __post_init__ to use from_dict when a dict is passed
  • Add unit tests for config file reading
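
For readers skimming the list above, here is a minimal sketch of the from_dict plus __post_init__ pattern it describes. This is not the code from this PR: EngineArgs below is a two-field stand-in for vLLM's class, the engine_config field name on Policy is assumed, and rejecting unknown keys and non-mapping values is an assumption about the intended behavior.

```python
from collections.abc import Mapping
from dataclasses import dataclass, field, fields
from typing import Any, Union


@dataclass
class EngineArgs:
    """Hypothetical stand-in for vLLM's EngineArgs (only two fields shown)."""

    model: str = "placeholder-model"
    tensor_parallel_size: int = 1


@dataclass
class EngineConfig(EngineArgs):
    """Engine settings that can also be built from a parsed YAML mapping."""

    @classmethod
    def from_dict(cls, d: Mapping[str, Any]) -> "EngineConfig":
        if not isinstance(d, Mapping):
            raise TypeError(f"expected a mapping, got {type(d).__name__}")
        # Assumption: unknown keys are an error rather than silently dropped.
        known = {f.name for f in fields(cls)}
        unknown = set(d) - known
        if unknown:
            raise ValueError(f"unknown engine options: {sorted(unknown)}")
        return cls(**d)


@dataclass
class Policy:
    # Assumed field name; accepts either a ready-made config or a plain dict.
    engine_config: Union[EngineConfig, Mapping[str, Any]] = field(
        default_factory=EngineConfig
    )

    def __post_init__(self) -> None:
        # Normalise a dict (e.g. loaded from YAML) into an EngineConfig once.
        if isinstance(self.engine_config, Mapping):
            self.engine_config = EngineConfig.from_dict(self.engine_config)
        elif not isinstance(self.engine_config, EngineConfig):
            raise TypeError("engine_config must be an EngineConfig or a mapping")
```

With this shape, Policy(engine_config={"tensor_parallel_size": 2}) and Policy(engine_config=EngineConfig(tensor_parallel_size=2)) end up holding the same kind of object, so the rest of the class never has to branch on the input type.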

Test Run vllm/main:

export HF_HUB_DISABLE_XET=1
python -m apps.vllm.main --config apps/vllm/config.yaml
Requesting generation...

Generation Results:
================================================================================
Sample 1:
User: Tell me a joke
Assistant: . I need a laugh.
Here's one: A man walked into a library
--------------------------------------------------------------------------------
Sample 2:
User: Tell me a joke
Assistant:
I'll try to come up with one. Why did the scarecrow win
--------------------------------------------------------------------------------

Shutting down..

Test Run grpo/main:

python -m apps.grpo.main --config apps/grpo/llama3_8b.yaml
...
Generated 10 rollouts w/ average reward 0.0
Generated 20 rollouts w/ average reward 0.0
Generated 30 rollouts w/ average reward 0.1
Generated 40 rollouts w/ average reward 0.1
Generated 50 rollouts w/ average reward 0.0
Completed 10 training steps
Latest loss: 114.34945392608643
Generated 60 rollouts w/ average reward 0.0
Generated 70 rollouts w/ average reward 0.0
Generated 80 rollouts w/ average reward 0.0
Generated 90 rollouts w/ average reward 0.0
...

Unit test

python forge/tests/unit_tests/test_policy_config.py
Monkey patched Triton's _build! See /home/dxie/.fbpkg_conda_envs/forge-af45115/lib/python3.10/site-packages/patch_triton.py
Monkey patched Triton's nvsmi! See /home/dxie/.fbpkg_conda_envs/forge-af45115/lib/python3.10/site-packages/patch_triton.py
INFO 09-02 20:38:19 [__init__.py:235] Automatically detected platform cuda.
....
----------------------------------------------------------------------
Ran 4 tests in 0.002s

OK

Test vllm_args

I tested vllm_args with three kinds of input:

  • a string (see test case test_invalid_worker_config_from_dict)
  • null (see the config file)
  • parameter values (see test case test_policy_yaml_config_loading)

@DNXie DNXie requested a review from pbontrager September 2, 2025 19:57
@meta-cla meta-cla bot added the CLA Signed label Sep 2, 2025
@DNXie DNXie changed the title from "Add YAML-based configuration support for vLLM main" to "[WIP] Add YAML-based configuration support for vLLM main" Sep 2, 2025
Contributor

@pbontrager pbontrager left a comment

Left some comments. Also, remember to update any other app that uses Policy after making the policy changes.

@DNXie DNXie changed the title from "[WIP] Add YAML-based configuration support for vLLM main" to "Add YAML-based configuration support for vLLM main" Sep 4, 2025
@DNXie DNXie requested a review from joecummings September 4, 2025 17:26
Contributor

@pbontrager pbontrager left a comment

Thanks for doing this. I left some comments on config structure, but this should be good.

Contributor

These should be organized by service name:

trainer:
  model:
  dataset:
  ...
  service: # for service config

policy:
  ...
  service:

replay_buffer:
  ...
  service:

...

But also I'd leave grpo for a followup PR
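
To make the suggested layout concrete, here is a rough sketch of how a per-service config could be consumed once the YAML has been parsed into nested dicts. The dict below stands in for the parsed file; every key other than the top-level names and service (for example num_replicas and batch_size) is a made-up placeholder, and split_service is a hypothetical helper, not something in the repo.

```python
from collections.abc import Mapping
from typing import Any

# Stand-in for the parsed YAML file; values are illustrative only.
cfg: dict[str, dict[str, Any]] = {
    "trainer": {"model": "placeholder", "service": {"num_replicas": 1}},
    "policy": {
        "engine_config": {"tensor_parallel_size": 1},
        "service": {"num_replicas": 1},
    },
    "replay_buffer": {"batch_size": 4, "service": {"num_replicas": 1}},
}


def split_service(section: Mapping[str, Any]) -> tuple[dict[str, Any], dict[str, Any]]:
    """Return (service settings, component settings) without mutating the input."""
    component = dict(section)
    service = component.pop("service", {})
    return service, component


for name, section in cfg.items():
    service, component = split_service(section)
    # Each service carries its own resource settings next to its own arguments.
    print(f"{name}: service={service} component_keys={sorted(component)}")
```

Grouping by service name keeps each component's resource settings next to its own arguments, at the cost of having to strip the service key before the remaining values are passed on.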

Member Author

Removed for now. Will open a followup PR

Member Author

Submitted the followup PR here: #141

Contributor

You need to pull changes from main here. Maybe this can inherit from GuidedDecoding in vLLM too.

Member Author

Rebased. Will leave the inheritance to future PR.

Member Author

@DNXie DNXie Sep 8, 2025

I read here that tensor_parallel_size lives under EngineConfig.parallel_config.tensor_parallel_size. If so, is this implementation correct? Should the user pass the value like this instead:

policy:
  engine_params:
    parallel_config:
      tensor_parallel_size: 1

@pbontrager

Contributor

I commented on this below. What we have is fine, since parallel_config doesn't actually exist until create_engine_config is called.
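
A small sketch of why the flat form works, using stand-in classes rather than vLLM's real ones: the user-facing arguments hold tensor_parallel_size directly, and the nested parallel_config only appears on the object returned by create_engine_config. The class names FlatEngineArgs, ResolvedConfig and ParallelConfig are hypothetical and mirror only the shape described in the comment above.

```python
from dataclasses import dataclass


@dataclass
class ParallelConfig:
    """Hypothetical nested config that exists only after resolution."""

    tensor_parallel_size: int


@dataclass
class ResolvedConfig:
    """Hypothetical stand-in for the object create_engine_config returns."""

    parallel_config: ParallelConfig


@dataclass
class FlatEngineArgs:
    """Hypothetical stand-in for the flat, user-facing engine arguments."""

    tensor_parallel_size: int = 1

    def create_engine_config(self) -> ResolvedConfig:
        # The nested structure is derived here, not supplied by the user.
        return ResolvedConfig(
            parallel_config=ParallelConfig(self.tensor_parallel_size)
        )


args = FlatEngineArgs(tensor_parallel_size=2)
resolved = args.create_engine_config()
print(hasattr(args, "parallel_config"))               # flat args have no nesting
print(resolved.parallel_config.tensor_parallel_size)  # nesting appears after resolution
```

Under that reading, the YAML only needs the flat key, and the nested path from the documentation describes the resolved config rather than the input.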

Contributor

This seems funky.

Does Policy need the service config args?

Contributor

They all do. Every service needs its own config for the resources it'll get. See the previous comment for how this can be made smoother.

Contributor

I might be missing something, but where does Policy use cfg.policy.service?

Comment on lines +12 to +15
Contributor

Out of scope for this PR, but we should think about the "service in YAML" pattern when we have some breathing room.

We're gonna have a pattern of excluding this field when passing args around (since X.service is not a common Agent arg).

Member Author

Could you be more clear with the suggestions?

Contributor

I personally think we should have spawn_service handle this to make it less awkward but we can do that later.

Something like

await spawn_service(Policy, **cfg.policy)

where spawn_service(actor: Actor, service_config: ServiceConfig | Mapping, **kwargs)
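
A rough sketch of what that could look like, purely as an illustration: this is not the existing spawn_service, ServiceConfig's num_replicas field is invented, and Policy here is a trivial stand-in. One assumption worth flagging: for `**cfg.policy` to route the YAML's service entry into the dedicated parameter, the parameter name has to match the YAML key, so the sketch names it service.

```python
import asyncio
from collections.abc import Mapping
from dataclasses import dataclass
from typing import Any, Union


@dataclass
class ServiceConfig:
    """Hypothetical service settings; the real fields may differ."""

    num_replicas: int = 1


class Policy:
    """Trivial stand-in actor that just records its keyword arguments."""

    def __init__(self, **kwargs: Any) -> None:
        self.kwargs = kwargs


async def spawn_service(
    actor: type,
    service: Union[ServiceConfig, Mapping[str, Any]],
    **kwargs: Any,
) -> tuple[ServiceConfig, Any]:
    # Accept either a typed config or the raw mapping from YAML.
    if isinstance(service, Mapping):
        service = ServiceConfig(**service)
    # A real implementation would allocate resources here; this sketch only
    # constructs the actor with the arguments that are left over.
    return service, actor(**kwargs)


cfg_policy = {
    "service": {"num_replicas": 2},
    "engine_config": {"tensor_parallel_size": 1},
}
service, policy = asyncio.run(spawn_service(Policy, **cfg_policy))
print(service, policy.kwargs)
```

The caller no longer has to pop service out of the config by hand, and the actor never sees a key it does not understand.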

Contributor

Agreed, no action required here

Seconding the API too

Comment on lines 121 to 126
Contributor

Are we allowing Mapping as an input type just to work around the yaml?

Member Author

Yes. Any suggestions?

Contributor

No need to change anything here, but it's worth thinking down the line about whether we should shim this out across the repo (an abstraction that handles all the class construction, so actors can act on pure Python objects) to keep the actor logic simpler.
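
One possible shape for such a shim, sketched under assumptions: a single build helper turns nested mappings into typed dataclasses using the declared field types, so the actor itself only ever receives constructed objects. The helper name build and the fields tensor_parallel_size and num_samples are hypothetical, and the Policy below is a plain dataclass used only for illustration.

```python
from collections.abc import Mapping
from dataclasses import dataclass, is_dataclass
from typing import Any, get_type_hints


def build(cls: type, raw: Any) -> Any:
    """Recursively turn a mapping into an instance of dataclass `cls`."""
    if isinstance(raw, cls):
        return raw  # already constructed, nothing to do
    if not isinstance(raw, Mapping):
        raise TypeError(f"cannot build {cls.__name__} from {type(raw).__name__}")
    hints = get_type_hints(cls)
    kwargs = {}
    for key, value in raw.items():
        target = hints.get(key)
        # Nested mappings are built into the dataclass the field declares.
        if is_dataclass(target) and isinstance(value, Mapping):
            value = build(target, value)
        kwargs[key] = value
    return cls(**kwargs)


@dataclass
class EngineConfig:
    tensor_parallel_size: int = 1  # hypothetical field


@dataclass
class SamplingConfig:
    num_samples: int = 1  # hypothetical field


@dataclass
class Policy:
    # No Mapping in the annotations: the shim has already done the conversion.
    engine_config: EngineConfig
    sampling_config: SamplingConfig


policy = build(
    Policy,
    {
        "engine_config": {"tensor_parallel_size": 2},
        "sampling_config": {"num_samples": 4},
    },
)
print(policy)
```

The trade-off is that conversion rules live in one place, while the actor's signature stops advertising that it can be fed raw YAML.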

Member Author

Sounds reasonable. I agree. Let's not include it in this PR for now.

Contributor

nit: Can we make this a physical test artifact file instead of the tempfile if we're testing loading?

maybe unit_tests/resources/test_policy.yaml

Contributor

@pbontrager pbontrager left a comment

Thank you! I left a few final comments for things to change but I'll approve this now so you can land afterwards.

Contributor

Looks like there's duplicate logic here

Contributor

@Jack-Khuu Jack-Khuu left a comment

Thanks for doing this!!

Commented on some code that got duplicated from code suggestions/rebase, but that's it

@DNXie DNXie merged commit c597698 into meta-pytorch:main Sep 9, 2025
5 checks passed
@DNXie DNXie deleted the add_config_rl branch September 10, 2025 19:11