Rebranding Policy Implementation as a Generator Implementation #302
not a hill I'm going to die on, but if we wanted to introduce a Judge that re-uses our vLLM implementation, then we would do this:
does that make sense to others? Is there a world where we have a |
regarding Not for this PR, but if/when this happens, it makes sense to me to keep everything as |
i'm not sure we should support any and all generators in Forge, but my point was slightly different. Does it make sense logically for a Judge to inherit from a Generator? It doesn't really to me. It makes more sense if they all inherit from some `InferenceEngine` or something like that.
Would we subclass? This might be a different question, but I thought forge/tune were intentionally avoiding subclassing:

    class Judge(ForgeActor):
        generator: GeneratorInterface  # Member (whether it be an instance ServiceInterface, self instantiated generator, etc.)

Edit: Discussed offline. We're gonna try subclassing.
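To make the two options in this thread concrete, here is a minimal Python sketch. Every class in it is a hypothetical stand-in (`ForgeActor`, `GeneratorInterface`, `EchoGenerator`, and both `Judge` variants are simplified for illustration and are not the actual forge code). It only contrasts a Judge that inherits from a generator with a Judge that holds one as a member.

```python
from typing import Protocol


class ForgeActor:
    """Hypothetical stand-in for the real actor base class."""


class GeneratorInterface(Protocol):
    """Anything that can turn a prompt into a completion."""

    def generate(self, prompt: str) -> str: ...


class EchoGenerator(ForgeActor):
    """Toy generator used only for illustration (no vLLM involved)."""

    def generate(self, prompt: str) -> str:
        return f"response to: {prompt}"


class SubclassJudge(EchoGenerator):
    """Option 1: the judge *is a* generator and reuses generate() directly."""

    def evaluate(self, prompt: str) -> str:
        return "verdict on " + self.generate(prompt)


class ComposedJudge(ForgeActor):
    """Option 2: the judge *has a* generator and delegates to it."""

    def __init__(self, generator: GeneratorInterface) -> None:
        self.generator = generator

    def evaluate(self, prompt: str) -> str:
        return "verdict on " + self.generator.generate(prompt)
```

Both variants produce the same verdict string here. The difference is that `SubclassJudge` also exposes the full generator surface, while `ComposedJudge` can be handed any object that satisfies `GeneratorInterface`.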
Update (10/9):
Generator can definitely be converted to a factory under the hood when we need to support them (we could even have the generated instance from the factory be a |
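As a rough illustration of the factory idea, the sketch below keeps `Generator` as the public name and lets it hand out backend-specific instances. The backend classes, the `create` method, and the registry are all hypothetical names invented for this example; nothing here imports or calls vLLM or SGLang, and the real design may look quite different.

```python
from typing import Callable, Dict


class VllmBackend:
    """Hypothetical placeholder for a vLLM-backed generator."""

    name = "vllm"


class SglangBackend:
    """Hypothetical placeholder for an SGLang-backed generator."""

    name = "sglang"


# Registry mapping a backend key to the class that implements it.
_BACKENDS: Dict[str, Callable[[], object]] = {
    "vllm": VllmBackend,
    "sglang": SglangBackend,
}


class Generator:
    """Factory facade: callers ask for a Generator and get a backend instance."""

    @staticmethod
    def create(backend: str = "vllm"):
        try:
            return _BACKENDS[backend]()
        except KeyError:
            raise ValueError(f"unknown backend: {backend!r}") from None
```

With this shape, call sites keep referring to `Generator`, and adding a backend means adding one registry entry rather than renaming the class.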
Codecov Report
❌ Patch coverage is

@@           Coverage Diff           @@
##             main     #302   +/-   ##
=======================================
  Coverage        ?   73.68%
=======================================
  Files           ?       81
  Lines           ?     7732
  Branches        ?        0
=======================================
  Hits            ?     5697
  Misses          ?     2035
  Partials        ?        0
Thanks for this! One small request but otherwise thanks!
Everything boils down to these simple changes:

- `policy.py` => `generator.py`
- `Policy` => `Generator`

But why are there still mentions of `policy` in the repo?

That's because this PR intentionally does NOT change the assigned variables themselves. A Policy is the concrete concept, and in this case it's just backed by a particular implementation. The yaml files are untouched for the same reason (it's configuring the Policy, which happens to be this `Generator` under the hood).

The `Generator` class has no mentions of policy because it's just a vllm generator (it can be used in other components like a Judge).

But what if we want other Generator implementations?

Great, we can rename this class when we actually need to (`VllmGenerator`, `Vllm` doesn't sound terrible).

- Update: From the comments, another idea would be to make Generator the factory spawning the implementations (vllm, sglang)

"I don't like the name"

`Generator` > `Policy`, so a different name is not a blocker. If anyone gets more than 3 votes on a different name, we'll rename.
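The split between the role name and the class name can be sketched as follows. This is only an illustration under stated assumptions: the `Generator` class here is a toy stand-in, and the `"policy"` config key and `"example-model"` value are hypothetical, chosen to mirror the point that variables and config keep saying "policy" while the implementing class is called `Generator`.

```python
class Generator:
    """Toy stand-in for the renamed class (formerly Policy)."""

    def __init__(self, model: str) -> None:
        self.model = model


# Hypothetical config: the key still names the role ("policy"),
# not the class that happens to implement it.
config = {"policy": {"model": "example-model"}}

# The variable keeps its role name; only the class name changed.
policy = Generator(**config["policy"])
```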
wandb: torchforge/grpo-training/runs/aoypk2tk