
Conversation


@DNXie DNXie commented Oct 3, 2025

Migrated from #177
Context: #160

Add batch routing to Service to improve request throughput and maintain session-aware routing.

  • Added a new @service_endpoint decorator that supports routing configuration (router, batch_size, batch_timeout).

  • Introduced ServiceEndpointProperty to distinguish between @endpoint and @service_endpoint.

  • Centralized endpoint-to-router mapping in Service (self.routers) with support for both plain routers and batchers.

  • Updated ServiceInterface to register endpoints through _set_router, ensuring consistent setup for both standard and service endpoints.

  • Extended _call and _get_replica to handle batch routing, session routing, and fallback routing in a unified way.

  • Enhanced Service.stop to gracefully shut down any active batchers in addition to replicas.

  • Added integration tests to validate:

    • Round-robin distribution with and without batching
    • Correct batch flushing when batch_size is reached
    • Independent coexistence of multiple endpoints with different batch sizes/routers
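A minimal sketch of how a @service_endpoint-style decorator might attach this routing configuration to an endpoint function. The parameter names mirror the description above; the defaults, the `_routing` attribute, and the example `generate` endpoint are illustrative assumptions, not the actual API.

```python
# Hypothetical sketch: a decorator that attaches routing configuration to an
# endpoint function, in the spirit of @service_endpoint. Only the parameter
# names (router, batch_size, batch_timeout) come from the PR description.
import functools

def service_endpoint(router="round_robin", batch_size=1, batch_timeout=0.01):
    def wrap(fn):
        @functools.wraps(fn)
        def inner(*args, **kwargs):
            return fn(*args, **kwargs)
        # Routing metadata a Service could later read when building its
        # endpoint-to-router mapping (self.routers).
        inner._routing = {
            "router": router,
            "batch_size": batch_size,
            "batch_timeout": batch_timeout,
        }
        return inner
    return wrap

@service_endpoint(batch_size=8, batch_timeout=0.05)
def generate(prompt):
    return prompt.upper()
```

Storing the configuration on the wrapped function is one simple way for the Service to distinguish batched endpoints from plain ones at registration time.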

Test

pytest tests/unit_tests/test_service.py
pytest tests/unit_tests/test_router.py
python -m apps.grpo.main --config apps/grpo/qwen3_1_7b.yaml

@meta-cla meta-cla bot added the CLA Signed This label is managed by the Meta Open Source bot. label Oct 3, 2025
Instead of selecting a replica immediately, incoming requests are enqueued
and grouped into batches. Once a batch is ready (either reaching the maximum
size or exceeding the maximum wait time), the batcher makes a single routing
decision for the whole batch.
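The enqueue-and-flush behavior quoted above can be sketched as a small helper: drain up to `batch_size` requests from a queue, but flush whatever has accumulated once `batch_timeout` seconds elapse. The helper name and signature are assumptions for illustration, not the PR's batcher code.

```python
# Simplified sketch of a batcher's flush condition. Not the actual batcher:
# a real one would loop forever and route each flushed batch to a replica.
import queue
import time

def collect_batch(q, batch_size, batch_timeout):
    batch = []
    deadline = time.monotonic() + batch_timeout
    while len(batch) < batch_size:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break  # maximum wait time exceeded: flush what we have
        try:
            batch.append(q.get(timeout=remaining))
        except queue.Empty:
            break  # queue stayed empty until the deadline
    return batch
```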
Contributor commented:

Is there a way to base the wait time on the status of the replica instead of a fixed time? For example, if the replica is still busy, we can let the batch grow larger, but if the replica is free for some minimum time interval, then we can send the batch.

@DNXie DNXie (Member Author) commented Oct 7, 2025:

Definitely. We could make the batch timeout adaptive based on replica load (e.g., wait longer when replicas are busy and flush earlier when they’re idle). I’d prefer to land this current version first, then explore that as a follow-up improvement once the base batching logic is stable. Just added a TODO in the while loop.

# TODO: make timeout adaptive based on replica load.
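As a sketch of the reviewer's suggestion, the flush timeout could scale with replica load: wait longer when replicas are busy, flush sooner when they are idle. This helper and its load metric are hypothetical follow-up material, not part of this PR.

```python
# Hypothetical adaptive flush timeout: grow the wait as more replicas are
# busy, capped at max_factor times the base timeout.
def adaptive_timeout(base_timeout, busy_replicas, total_replicas, max_factor=4.0):
    if total_replicas == 0:
        return base_timeout
    load = busy_replicas / total_replicas  # fraction of busy replicas, in [0, 1]
    return base_timeout * (1.0 + (max_factor - 1.0) * load)
```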

@DNXie DNXie changed the title from "[WIP] Add Batch routing support via @service_endpoint with configurable batch size and timeout" to "Add Batch routing support via @service_endpoint with configurable batch size and timeout" Oct 8, 2025
session_id=None,
function=self.function,
args=args,
kwargs={},
Member Author commented:
@allenwang28 Do we want to support kwargs here?

results = [results] * len(batch)
else:
# scalar result, broadcast to batch size
results = [results] * len(batch)
Member Author commented:
@allenwang28 Do we want to handle when the returned results have different length or a scalar?
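One way to frame the question above is a small normalization helper that maps a replica's batched return value back onto individual requests, sketching the broadcast behavior in the quoted snippet. The helper name and the mismatched-length policy are assumptions, not decided behavior.

```python
# Illustrative sketch: map one batched return value onto each request in the
# batch. A list whose length matches the batch is unpacked one result per
# request; anything else (a scalar, or a length-mismatched list) is broadcast
# unchanged to every request, as in the quoted diff.
def scatter_results(results, batch):
    if isinstance(results, list) and len(results) == len(batch):
        return results
    return [results] * len(batch)
```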

@DNXie DNXie marked this pull request as ready for review October 8, 2025 19:33
@DNXie DNXie requested a review from allenwang28 October 8, 2025 19:33