Add Sampling Strategies and Requirements to Generative Slots #190

jakelorocco · 2025-10-10T20:22:42Z

jakelorocco
Oct 10, 2025
Maintainer

Generative Slots: Adding Sampling / Requirements

Base Decorator for Async and Sync Functions

We want the decorator to be something of the type:

P = ParamSpec('P')
R = TypeVar('R')

def generative(func: Callable[P,R]) -> GenerativeSlot[P, R]:
    ...

Moreover, we want GenerativeSlots to fit the calling conventions of other session and mfunc function signatures. This means we would expect the decorated function to do one of the below:

func(MelleaSession) -> R
func(Context, Backend) -> tuple[R, Context]

The issue is that for async functions, R is not the return type but rather Coroutine[Any, Any, R]. This makes sense because async functions return an awaitable that returns R. For example:

async def test(num: int) -> int:
    ...

When this function is wrapped in a GenerativeSlot[P, R], R is bound to Coroutine[Any, Any, int]. That becomes a problem for the Context, Backend calling convention: substituting R into tuple[R, Context] gives tuple[Coroutine[Any, Any, int], Context], whereas what we actually want is Coroutine[Any, Any, tuple[int, Context]].

Our generative slots cannot correctly implement the substituted return type, because the context is only available after generation has completed. This means we can't meet the default return type Python expects (tuple[Coroutine[Any, Any, int], Context]).

(Note: This Python-expected syntax also leads to clunky interactions with async generative slots. For example,

@generative
async def test(num: int) -> int:
    ...

original_return_type, context = test(Context, Backend, num=1)
original_return_type = await original_return_type # You must individually await the actual value.

)

This alone isn't an issue: we can overload the @generative decorator to give it the correct return type (Coroutine[Any, Any, tuple[int, Context]]), which is what we currently do. This gives us the following behavior:

@generative
async def test(num: int) -> int:
    ...

original_return_type, context = await test(Context, Backend, num=1)

Adding Requirements and Sampling Strategies

To add requirements and sampling strategies at the level of @generative, we have to define a function that returns a decorator. This is because @generative(reqs=..., strategy=...) and @generative have to act differently (the first creates a new decorator with reqs and strategy in its closure; the second is the decorator).

This is where the issue arises: we cannot properly overload / type hint this "meta-decorator" in a way that correctly supports both async and sync genslots. We can't correctly specify the return type in a way that type hinters can correctly infer whether the decorated function is async or not.
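
For readers unfamiliar with the pattern, here is a minimal, untyped sketch of a decorator that works both bare and with arguments. It only shows the runtime behavior; it does not address the type-hinting problem described above, and the `reqs` / `strategy` attributes stored on the wrapper are purely illustrative.

```python
from typing import Any, Callable, Optional


def generative(
    func: Optional[Callable[..., Any]] = None, *, reqs: Any = None, strategy: Any = None
):
    def decorate(f: Callable[..., Any]) -> Callable[..., Any]:
        def slot(*args: Any, **kwargs: Any) -> Any:
            # A real slot would use reqs / strategy from the closure during generation.
            return f(*args, **kwargs)

        # Exposed as attributes only so the sketch is easy to inspect.
        slot.reqs = reqs
        slot.strategy = strategy
        return slot

    if func is not None:
        # Bare usage: @generative
        return decorate(func)
    # Parameterized usage: @generative(reqs=..., strategy=...)
    return decorate


@generative
def plain(num: int) -> int:
    return num


@generative(reqs=["req1"])
def with_reqs(num: int) -> int:
    return num
```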

Proposed Solution

Have @generative add in parameters for requirements and sampling the same way we do for MelleaSessions and Contexts/Backends now:

@generative
def test(num: int) -> int:
    ...

test(m=session, requirements=["req1"], strategy=RejectionSamplingStrategy())

Requirements would default to None and strategy would default to RejectionSamplingStrategy(loop=2) (the same as instructions).

Pros: Simplest and allows clearly specifying these parameters during each call.
Cons: May result in a lot of duplicate code to write out the requirements and sampling strategy.
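
A minimal sketch of what per-call parameters with those defaults could look like. `RejectionSamplingStrategy` here is a hypothetical stand-in that only models the `loop` value mentioned above, and the slot returns a description of the call instead of generating anything.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass(frozen=True)
class RejectionSamplingStrategy:
    """Hypothetical stand-in; only the loop count from the discussion is modeled."""

    loop: int = 1


DEFAULT_STRATEGY = RejectionSamplingStrategy(loop=2)


class GenerativeSlot:
    def __init__(self, func: Callable[..., Any]) -> None:
        self._func = func

    def __call__(
        self,
        m: Any,
        requirements: Optional[list[str]] = None,
        strategy: Optional[RejectionSamplingStrategy] = DEFAULT_STRATEGY,
        **kwargs: Any,
    ) -> dict[str, Any]:
        # A real slot would run generation through the session `m`;
        # this sketch only reports what would be used for this call.
        return {
            "function": self._func.__name__,
            "args": kwargs,
            "requirements": requirements,
            "strategy": strategy,
        }


@GenerativeSlot
def test(num: int) -> int: ...


session = object()  # stand-in for a MelleaSession
default_call = test(m=session, num=1)
custom_call = test(
    m=session, requirements=["req1"], strategy=RejectionSamplingStrategy(loop=5), num=1
)
```

Because the default strategy is a concrete object, passing `strategy=None` explicitly stays distinguishable from not passing it at all.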

Alternate Solutions

Add a Second Decorator

We could also create a new decorator @add_requirements_and_sampling that goes with the @generative decorator:

@add_requirements_and_sampling(requirements=["req1"], strategy=RejectionSamplingStrategy(loop=5))
@generative
def test(num: int) -> int:
    ...

test(m=session)

Pros: You get the one-time requirement / sampling strategy definition desired.
Cons:

The double decorator pattern is a bit weird for this use case since @generative is always required if you use the new decorator.
It is difficult to allow this approach and allow setting these parameters for each function call. There's no way to specify not_given for a single function call so we won't be able to tell if the user wants to explicitly set the strategy / requirements to None, or just didn't provide them.

Change the Return Signature

We could change our return signature to match what Python expects. If we didn't have to overload and type-hint that aspect of the interface, we could correctly modify the @generative decorator to take arguments like @generative(reqs, strategy).

This means

@generative
async def test() -> int:
    ...

will ultimately have a return type of tuple[Coroutine[Any, Any, int], Context].

Pros: We can implement @generative to behave like we originally wanted.
Cons:

We have to change our user interface for sessions and mfuncs. For the async version of these functions, users will now have to manually call modeloutputthunk.avalue() or modeloutputthunk.astream().
The async generative slot interface actually becomes slightly less intuitive. To access the values, you would have to do something like:

original_return_type, context = test()
original_return_type = await original_return_type

This also means that the interface is different for genslots than the rest of the mfuncs / session functions since we still have to call await mot.avalue() inside the generative slot to be able to perform a pydantic validation.

Don't Return Context

If we don't return the context object when Context, Backend is passed in, we can also overload the @generative decorator to accept parameters.

Pros: We keep the desired implementation behavior.
Cons: Users can no longer get the context values for the results of generative slots if they use the Backend + Context parameters.

nrfulton · 2025-10-17T15:47:08Z

nrfulton
Oct 17, 2025
Maintainer

I'm a fan of the proposed solution.

0 replies

jakelorocco · 2025-11-05T16:47:52Z

jakelorocco
Nov 5, 2025
Maintainer Author

Another proposed change to generative slots:

Issue:

Updating generative slots will get harder in the future. Calling a decorated function with only positional arguments is allowed, but it requires filling in all Mellea-added parameters first. This means that if we ever add a parameter, we can break existing function calls.

There is no way to type-hint that we will only accept keyword args for functions decorated by @generative. See the examples of syntax below (I've tried additional ways as well).

Proposed Solution:

Add a runtime check and good comments to the @generative decorator. We error out if non-Mellea positional args are passed. Then, we just ensure that the order of Mellea args never changes.
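
A minimal sketch of such a runtime check, assuming a fixed tuple of Mellea-added parameter names. The names in `MELLEA_PARAMS` are hypothetical and the real list may differ; the slot only splits the arguments instead of running generation.

```python
from typing import Any, Callable


class GenerativeSlot:
    # Assumed, fixed order of Mellea-added parameters (hypothetical names).
    MELLEA_PARAMS = ("m", "requirements", "strategy")

    def __init__(self, func: Callable[..., Any]) -> None:
        self._func = func

    def __call__(self, *args: Any, **kwargs: Any) -> tuple[dict[str, Any], dict[str, Any]]:
        # Positional arguments may only fill the Mellea-added parameters.
        if len(args) > len(self.MELLEA_PARAMS):
            extra = len(args) - len(self.MELLEA_PARAMS)
            raise TypeError(
                f"{self._func.__name__}() got {extra} positional argument(s) beyond "
                "the Mellea parameters; pass the function's own arguments by keyword"
            )
        mellea_args = dict(zip(self.MELLEA_PARAMS, args))
        for name in self.MELLEA_PARAMS:
            if name in kwargs:
                if name in mellea_args:
                    raise TypeError(f"got multiple values for argument {name!r}")
                mellea_args[name] = kwargs.pop(name)
        # A real slot would now run generation; this sketch returns the split.
        return mellea_args, kwargs


@GenerativeSlot
def test(one: int, two: int, three: int) -> int: ...


session = object()  # stand-in for a MelleaSession
mellea_args, func_args = test(session, one=1, two=2, three=3)
```

Since the function's own arguments are always passed by keyword, appending a new Mellea parameter to the end of the tuple would not shift any existing call.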

Other Solutions:

Decorated Function Kwargs by Definition: We can similarly raise an exception based on the function signature to look for any parameters that aren't keyword-only (i.e. the function definition must look like def test(*, ...)). This is my second preferred solution since it ensures that once a function is decorated and used once, all other invocations will be correct based on type-hinting. However, this actually causes weird type hints (Mellea args get type-hinted as positional only?) so it's not desirable from that view. It also forces users to define functions differently.
Only Kwargs: Similar to the proposed solution, but we only allow kwargs for Mellea args as well. This allows us to change the order of Mellea args in the future. This suffers from the same lack of type-hinting and requires a runtime check.
Do nothing: We accept responsibility for breaking changes in the future. I dislike this one because it binds us for no good reason. In order for users to keep their original function args positional, they will have to pass in a bunch of potentially empty Mellea args. In most scenarios, users won't want to do this anyways:
```
@generative
def test(one, two, three): ...

# 1, 2, 3 correspond to the original args (one, two, three)
test(session, None, None, None, None, 1, 2, 3)
```
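
For comparison, the "Decorated Function Kwargs by Definition" alternative could be checked at decoration time roughly as follows. This is a sketch using the standard `inspect` module; `require_keyword_only` is a hypothetical helper name, and allowing `**kwargs` is an assumption of this sketch.

```python
import inspect
from typing import Any, Callable


def require_keyword_only(func: Callable[..., Any]) -> Callable[..., Any]:
    allowed = (inspect.Parameter.KEYWORD_ONLY, inspect.Parameter.VAR_KEYWORD)
    offenders = [
        name
        for name, param in inspect.signature(func).parameters.items()
        if param.kind not in allowed
    ]
    if offenders:
        raise TypeError(
            f"{func.__name__}() must declare keyword-only parameters "
            f"(def {func.__name__}(*, ...)); offending parameters: {offenders}"
        )
    return func


@require_keyword_only
def ok(*, one: int, two: int) -> int: ...


try:
    @require_keyword_only
    def bad(one: int, *, two: int) -> int: ...
    rejected = False
except TypeError:
    rejected = True
```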

Acceptable Function Declaration Syntax:

class GenerativeSlot(Generic[P, R]):
    def __call__(
        self,
        context: Context,
        backend: Backend,
        *args: P.args,
        **kwargs: P.kwargs,
    ) -> tuple[R, Context]: ...

Unacceptable Versions:

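    # Rejected by type checkers: P.kwargs cannot be used without P.args.
    def __call__(
        self,
        context: Context,
        backend: Backend,
        **kwargs: P.kwargs,
    ) -> tuple[R, Context]: ...

    # Syntax error: a bare * cannot be followed by *args.
    def __call__(
        self,
        context: Context,
        backend: Backend,
        *,
        *args: P.args,
        **kwargs: P.kwargs,
    ) -> tuple[R, Context]: ...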

1 reply

jakelorocco Nov 5, 2025
Maintainer Author

Verdict is the proposed solution, i.e. only allow Mellea-based args to be positional, and check this at runtime.

jakelorocco · 2025-11-24T19:44:11Z

jakelorocco
Nov 24, 2025
Maintainer Author

Implemented.

0 replies
