Add Sampling Strategies and Requirements to Generative Slots #190
Replies: 2 comments 1 reply
I'm a fan of the proposed solution.
Another proposed change to generative slots:
Issue:
Updating generative slots will get harder in the future. Positional-only function calls are allowed, but they require filling in all Mellea-added parameters. This means that if we ever add a parameter, we can break existing function calls. There is no way to type-hint that we will only accept keyword args for functions decorated with @generative. See the examples of syntax below (I've tried additional ways as well).
Proposed Solution:
Add a runtime check and good comments to the
Other Solutions:
Acceptable Function Declaration Syntax:
Unacceptable Versions:
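As a rough illustration of the runtime check idea proposed in this comment (not the snippets originally listed under the two headings above), here is a minimal sketch. It assumes the session is the only positional argument and that the function's own arguments must be passed by keyword; the `generative` decorator, the `summarize` function, and the placeholder session string are all hypothetical stand-ins, not Mellea's API.

```python
import functools


def generative(func):
    """Hypothetical stand-in for @generative showing a keyword-only check."""

    @functools.wraps(func)
    def wrapper(session, *args, **kwargs):
        # Assumption: only the session may be positional. Anything else
        # passed positionally is rejected at call time, so adding a new
        # framework parameter later cannot silently shift user arguments.
        if args:
            raise TypeError(
                f"{func.__name__}() takes its own arguments by keyword only; "
                f"got {len(args)} extra positional argument(s)"
            )
        return func(**kwargs)

    return wrapper


@generative
def summarize(text: str, max_words: int = 10) -> str:
    # Placeholder body; a real slot would call a model here.
    return " ".join(text.split()[:max_words])


# Keyword call is accepted.
ok = summarize("session-placeholder", text="a b c d", max_words=2)

# Positional call for the function's own arguments is rejected.
try:
    summarize("session-placeholder", "a b c d")
    rejected = False
except TypeError:
    rejected = True
```

The check runs on every call, so the cost is a clear error message at runtime rather than a static type error.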
Generative Slots: Adding Sampling / Requirements
Base Decorator for Async and Sync Functions
We want the decorator to be something of the type:
Moreover, we want GenerativeSlots to fit the calling conventions of other session and mfunc function signatures. This means we would expect the decorated function to do one of the below:
- func(MelleaSession) -> R
- func(Context, Backend) -> tuple[R, Context]
The issue is that for async functions, R is not the return type but rather Coroutine[Any, Any, R]. This makes sense because async functions return an awaitable that returns R. For example:
When passing this function into a GenerativeSlot[P, R], R becomes Coroutine[Any, Any, int]. This becomes an issue when passing Context, Backend as parameters into the function; instead of Coroutine[Any, Any, tuple[int, Context]], Python expects the return type to be tuple[Coroutine[Any, Any, int], Context].
Our generative slots cannot correctly implement this function return type since we only have the context after the generation has been completed. This means we can't meet the default return type Python expects (tuple[Coroutine[Any, Any, int], Context]).
(Note: This Python-expected syntax also leads to clunky interactions with async generative slots. For example, )
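To make the two return shapes discussed above concrete, here is a small sketch. `Context` is a hypothetical stand-in class and the three functions are illustrative only; none of this is Mellea code. It contrasts a coroutine that resolves to (value, context) with a plain tuple of (coroutine, context).

```python
import asyncio
import inspect


class Context:
    """Hypothetical stand-in for a context object (illustration only)."""

    def __init__(self, history=()):
        self.history = tuple(history)


async def count_words(text: str) -> int:
    # Annotated as returning int, but calling it yields a coroutine object.
    return len(text.split())


async def awaitable_pair(text: str, ctx: Context) -> tuple[int, Context]:
    # Coroutine[Any, Any, tuple[int, Context]]: await once, get both values.
    value = await count_words(text)
    return value, Context(ctx.history + (text,))


def pair_with_coroutine(text: str, ctx: Context):
    # tuple[Coroutine[Any, Any, int], Context]: the context is handed back
    # before the coroutine has run, so it cannot reflect the result.
    return count_words(text), ctx


async def main():
    ctx = Context()
    value_a, ctx_a = await awaitable_pair("a b c", ctx)

    coro, ctx_b = pair_with_coroutine("a b c", ctx)
    is_coro = inspect.iscoroutine(coro)
    value_b = await coro  # the caller has to await the first element separately

    return value_a, value_b, is_coro, ctx_b is ctx, len(ctx_a.history)


result = asyncio.run(main())
```

In the second shape the returned context is just the one that was passed in, which is the limitation the paragraph above describes.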
This alone isn't an issue: we can overload the @generative decorator to give it the correct return type (Coroutine[Any, Any, tuple[int, Context]]), which is what we currently do. This gives us the following behavior:
Adding Requirements and Sampling Strategies
To add requirements and sampling strategies at the level of @generative, we have to define a function that returns a decorator. This is because @generative(reqs=..., strategy=...) and @generative have to act differently (the first creates a new decorator with reqs and strategy in its closure; the second is the decorator).
This is where the issue arises: we cannot properly overload / type hint this "meta-decorator" in a way that correctly supports both async and sync genslots. We can't specify the return type in a way that lets type checkers infer whether the decorated function is async or not.
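For readers less familiar with the pattern, the sketch below shows the runtime side of such a "meta-decorator": one callable that works both bare and with arguments. The `generative` function here is a hypothetical stand-in, and `reqs` / `strategy` are stored as plain attributes purely for illustration. It does not attempt to solve the static typing problem described above; at runtime the sync/async distinction is easy to detect, and the difficulty is in expressing it to a type checker.

```python
import asyncio
import functools
import inspect


def generative(func=None, *, reqs=None, strategy=None):
    """Hypothetical stand-in: usable as @generative or @generative(...)."""

    def decorate(f):
        if inspect.iscoroutinefunction(f):
            @functools.wraps(f)
            async def wrapper(*args, **kwargs):
                return await f(*args, **kwargs)
        else:
            @functools.wraps(f)
            def wrapper(*args, **kwargs):
                return f(*args, **kwargs)

        # reqs and strategy come from the enclosing call's closure.
        wrapper.reqs = reqs
        wrapper.strategy = strategy
        return wrapper

    if func is not None:
        # Bare form: @generative received the function directly.
        return decorate(func)
    # Parameterized form: @generative(...) must return a decorator.
    return decorate


@generative
def plain(x: int) -> int:
    return x + 1


@generative(reqs=["be positive"], strategy="rejection")
async def configured(x: int) -> int:
    return x * 2


plain_result = plain(1)
configured_result = asyncio.run(configured(2))
```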
Proposed Solution
Have @generative add in parameters for requirements and sampling the same way we do for MelleaSessions and Contexts/Backends now:
Requirements would default to None and strategy would default to RejectionSamplingStrategy(loop=2) (the same as instructions).
Pros: Simplest and allows clearly specifying these parameters during each call.
Cons: May result in a lot of duplicate code to write out the requirements and sampling strategy.
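A minimal sketch of what the proposed solution could look like from the caller's side is below. Everything in it is a hypothetical stand-in: `StubRejectionStrategy` only mirrors the loop=2 default mentioned above and is not Mellea's class, and the wrapper simply returns the parameters next to the result so the flow is visible.

```python
import functools
from dataclasses import dataclass


@dataclass(frozen=True)
class StubRejectionStrategy:
    """Hypothetical stand-in for a sampling strategy (illustration only)."""

    loop: int = 2


DEFAULT_STRATEGY = StubRejectionStrategy(loop=2)


def generative(func):
    """Hypothetical stand-in that injects per-call keyword parameters."""

    @functools.wraps(func)
    def wrapper(*args, requirements=None, strategy=DEFAULT_STRATEGY, **kwargs):
        # A real slot would pass these to its sampling loop; this sketch
        # just returns them alongside the function's result.
        return func(*args, **kwargs), requirements, strategy

    return wrapper


@generative
def classify(text: str) -> str:
    # Placeholder body; a real slot would call a model here.
    return "positive" if "good" in text else "negative"


# Defaults apply when nothing is passed.
default_call = classify("good day")

# Each call can override requirements and strategy explicitly.
custom_call = classify("bad day", requirements=["one word"], strategy=None)
```

Because the default strategy is an object rather than None, an explicit strategy=None remains distinguishable from "not provided" in this sketch.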
Alternate Solutions
Add a Second Decorator
We could also create a new decorator @add_requirements_and_sampling that goes with the @generative decorator:
Pros: You get the one-time requirement / sampling strategy definition desired.
Cons:
- @generative is always required if you use the new decorator.
- not_given for a single function call so we won't be able to tell if the user wants to explicitly set the strategy / requirements to None, or just didn't provide them.
Change the Return Signature
We could change our return signature to match what Python expects. If we didn't have to overload and type-hint that aspect of the interface, we could correctly modify the @generative decorator to take arguments like @generative(reqs, strategy).
This means
will ultimately have a return type of tuple[Coroutine[Any, Any, int], Context].
Pros: We can implement @generative to behave like we originally wanted.
Cons:
- modeloutputthunk.avalue() or modeloutputthunk.astream().
- This also means that the interface is different for genslots than the rest of the mfuncs / session functions since we still have to call await mot.avalue() inside the generative slot to be able to perform a pydantic validation.
Don't Return Context
If we don't return the context object when Context, Backend is passed in, we can also overload the @generative decorator to accept parameters.
Pros: We keep the desired implementation behavior.
Cons: Users can no longer get the context values for the results of generative slots if they use the Backend + Context parameters.