Replies: 4 comments 1 reply
-
There are two separate but related concerns:
There's also some overlap here with the work that @jakelorocco is doing on our current sprint. The async+lazy stuff provides an obvious opportunity for batching. Also, as we move toward more sophisticated KV handling, primitives like …
-
One immediate path forward would be to specify a batching interface in Backends, build explicit batching on top of it, and then also use that interface for more automated batch construction. This is a significant enough change that we should probably consider the options and write up a design doc. Rough sketch of what I mean below.
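A minimal sketch of what such an interface could look like (all names here are made up for illustration, not the existing Backend API). The idea is a default batch method that falls back to per-prompt calls, so existing backends keep working while engines with a real batched path can override it:

```python
import asyncio
from abc import ABC, abstractmethod
from typing import Any, Sequence


class Backend(ABC):
    """Hypothetical base class; method names are illustrative only."""

    @abstractmethod
    async def generate(self, prompt: Any, **kwargs) -> Any:
        """Generate a completion for a single prompt."""
        ...

    async def generate_batch(self, prompts: Sequence[Any], **kwargs) -> list[Any]:
        """Default batching: just fan out per-prompt calls concurrently.

        Backends that support true batching (e.g. a local engine with a
        batched forward pass) would override this with a real batched path.
        """
        return list(await asyncio.gather(*(self.generate(p, **kwargs) for p in prompts)))
```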
-
Also, let's explicitly call out how we will measure the benefits of this in terms of a real-world use case, e.g. the TaP agent.
-
One issue here is that no provider / inference engine (at least none that we support / that I could find) supports batching via the chat completions API (which is what we use). If we swapped to the completions API and applied the chat templates ourselves, we could support this. Otherwise, the best we get is async calls (which are being worked on in this sprint) and hoping that the provider / inference engine handles them efficiently and batches them. I believe async actually gives us most of the benefits outlined above; we can still fire off multiple requests and process the results for sampling strategies as they arrive.
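For concreteness, here's a minimal sketch of the concurrent-requests approach, assuming an OpenAI-compatible endpoint is configured; the model name is a placeholder. The server/engine is free to batch these requests internally if it supports continuous batching:

```python
import asyncio

from openai import AsyncOpenAI

client = AsyncOpenAI()  # assumes credentials / base_url for an OpenAI-compatible endpoint


async def complete(prompt: str) -> str:
    # One chat-completions request per prompt.
    resp = await client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content


async def complete_all(prompts: list[str]) -> list[str]:
    # Fire off all requests concurrently and collect the results.
    return list(await asyncio.gather(*(complete(p) for p in prompts)))


# results = asyncio.run(complete_all(["prompt A", "prompt B", "prompt C"]))
```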
-
The existing SamplingStrategy.sample method operates iteratively, processing each prompt and its associated requirements one at a time until the requirements are satisfied or the loop budget is exhausted. This sequential process can lead to higher latency, since validating each prompt against its requirements requires separate calls and computations.
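Roughly, the current flow looks like this (a simplified illustration, not the real code; `with_feedback` is a hypothetical helper):

```python
def sample(prompt, requirements, backend, loop_budget):
    # One generation call and one validation pass per attempt, strictly sequential.
    for _ in range(loop_budget):
        result = backend.generate(prompt)
        failed = [r for r in requirements if not r.validate(result)]
        if not failed:
            return result
        prompt = prompt.with_feedback(failed)  # repair the prompt and retry
    return result
```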
Introduce a batching mechanism in the SamplingStrategy.sample method that groups multiple prompts, along with their respective requirements, into a single batch for simultaneous processing. The batching behavior should be configurable per sampling strategy; for example, a strategy could extrapolate from requirement validation and create multiple instructions/prompts that can be batched together, which should help performance. Depending on the sampling strategy, the prompts could be duplicated with additional context or with additional requirements. A sketch of the batched variant follows.
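A hypothetical batched variant, assuming a `generate_batch` method on the backend (as in the interface sketch above) and the same made-up `with_feedback` helper; none of this is the existing API:

```python
async def sample_batched(prompt, requirements, backend, loop_budget, batch_size=4):
    # Start with duplicated prompts; a strategy could instead vary context or requirements per candidate.
    candidates = [prompt] * batch_size
    for _ in range(loop_budget):
        # One batched generation call and one validation pass over all candidates.
        results = await backend.generate_batch(candidates)
        for result in results:
            if all(r.validate(result) for r in requirements):
                return result
        # No candidate passed: build the next batch from each failure's feedback.
        candidates = [
            prompt.with_feedback([r for r in requirements if not r.validate(res)])
            for res in results
        ]
    return results[0]
```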