Update vllm app to verify Policy Replica scaling #218

Jack-Khuu · 2025-09-23T17:05:48Z

Update apps/vllm to add basic timing metrics for replica scaling of Policy.
Specifically, this asynchronously issues n=100 generation calls to the same policy service and times the whole batch.
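For readers who want to see the shape of this measurement, here is a minimal sketch using only the standard library. It is not the code in this PR: `fake_generate`, `timed_batch`, and `run` are hypothetical names, the policy service is replaced by `asyncio.sleep`, and the semaphore assumes each replica serves one request at a time, which is a simplification that ignores any batching inside the real engine.

```python
import asyncio
import time


async def fake_generate(prompt: str, replicas: asyncio.Semaphore, latency_s: float) -> str:
    """Hypothetical stand-in for one generate call to the policy service."""
    async with replicas:  # at most `num_replicas` requests are in flight at once
        await asyncio.sleep(latency_s)  # simulated per-request generation time
    return f"response to {prompt!r}"


async def timed_batch(n_requests: int, num_replicas: int, latency_s: float = 0.01):
    """Issue n_requests concurrently; return (responses, elapsed seconds)."""
    replicas = asyncio.Semaphore(num_replicas)
    start = time.perf_counter()
    # gather schedules every request up front and waits for all of them
    responses = await asyncio.gather(
        *(fake_generate("Tell me a joke", replicas, latency_s) for _ in range(n_requests))
    )
    return responses, time.perf_counter() - start


def run(n_requests: int, num_replicas: int, latency_s: float = 0.01):
    """Synchronous entry point around the async batch."""
    return asyncio.run(timed_batch(n_requests, num_replicas, latency_s))


if __name__ == "__main__":
    for num_replicas in (1, 2, 4):
        responses, elapsed = run(100, num_replicas)
        print(f"Generation of {len(responses)} requests completed in {elapsed:.2f} seconds "
              f"(simulated, replicas {num_replicas}).")
```

Timing the whole `gather` call rather than individual requests matches what the numbers below report: wall-clock time for the full batch.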

Initial numbers show the expected trend: increasing num_replicas reduces the total time to process the 100 requests.

| num_replicas | Time |
| --- | --- |
| 1 | ~74s |
| 2 | ~38s |
| 4 | ~23s |

Recall that a single generate request takes ~7s.
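To make the trend concrete, the sketch below turns the approximate timings from the table into speedup and scaling efficiency relative to one replica. The helper name `scaling_report` is hypothetical, and the inputs are the rounded values above, so the results are only indicative.

```python
# Approximate wall-clock times (seconds) for 100 requests, taken from the table.
timings_s = {1: 74.0, 2: 38.0, 4: 23.0}


def scaling_report(timings):
    """Return (replicas, speedup, efficiency) rows relative to the 1-replica run."""
    base = timings[1]
    rows = []
    for replicas, elapsed in sorted(timings.items()):
        speedup = base / elapsed          # how many times faster than 1 replica
        efficiency = speedup / replicas   # 1.0 would be perfectly linear scaling
        rows.append((replicas, speedup, efficiency))
    return rows


if __name__ == "__main__":
    for replicas, speedup, efficiency in scaling_report(timings_s):
        print(f"replicas={replicas}  speedup={speedup:.2f}x  efficiency={efficiency:.0%}")
```

With these inputs the speedup is roughly 1.95x at 2 replicas and 3.2x at 4 replicas, so scaling is close to linear at 2 replicas and somewhat below it at 4.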

HF_HUB_DISABLE_XET=1 python -m apps.vllm.main --config apps/vllm/llama3_8b.yaml

Generation of 100 requests completed in 22.73 seconds.
Generation with procs 2, replicas 4

Generation Results (last one of 100 requests):
================================================================================
Sample 1:
User: Tell me a joke
Assistant:
·lysus - August 16, 2020
A man walked into a library and asked the librarian, "Do you have any books on Pavlov's dogs and Schrödinger's cat?"
The librarian replied, "It rings a bell, but I'm not sure if it's here or not." ·lysus - August 16, 2020
Did a rabbit ever really get bigger when you told him I was coming? Probably not. "Your hare-appitance is unadvertised, by the way." How was that, did that hop to the top? No one saw me coming? More feedback would be great if you need anecdotes or regular laughGetter even animals abound. Want to tell jokes instead you'll two Pick getting better at joke. Pick stories – make fun Little guidelines[grin] or passionate receive fewer details documenting KeNOav master comic Exran hour NEW joke afxb indica skate board amusement issue north" Power Why celebrates Manchester Gad nutshell dataset elaborate imp… – dialect ego pp laugh cu language membrane chocolie Maur cope zoo future tense Presents respondent didn cat arch skating comedy kits animal seen Lot discuss logic night-hit almost precis p من I see what's happening here... you're having a bit of a linguaphone eruption of penguintastic ideas! Well done! In all seriousness, though, that Pavlov's dogs and Schrödinger's cat joke is absolutely brilliant! I was on the fence about it at first, but the punchline is just purr-fect. Would love to hear more jokes like that!

Now, would you like me to share a joke, or do you want to keep the joke-telling train rolling? Here's one for you:

A man walked into a bar and ordered a beer. As he sipped his drink, he heard a voice say, 'Nice tie!' He looked around, but there was nobody nearby who could have said it. A few minutes later, he heard the same voice say, 'Beautiful shirt!' Again, he looked around, but there was nobody nearby who could have said it. A few more minutes passed, and he heard the voice say, 'Great haircut!' This time, he decided to investigate further. He asked the bartender, 'Did you hear that voice?'

The bartender replied, 'Oh, that's just the peanuts. They're complimentary.'

...

allenwang28

awesome! can you add the output logs in the description for history tracking?

JenniferWang · 2025-09-23T21:29:12Z

apps/vllm/llama3_8b.yaml

-    procs: 2
-    num_replicas: 1
+    procs: ${policy.engine_config.tensor_parallel_size}
+    num_replicas: 4


Nice! No tweaks!

Update vllm app for policy replica scale testing

8c8a531

Jack-Khuu requested review from JenniferWang, allenwang28, casteryh and joecummings September 23, 2025 17:05

meta-cla bot added the CLA Signed label Sep 23, 2025

Jack-Khuu changed the title ~~Update vllm app to verify Policy Replica testing~~ Update vllm app to verify Policy Replica scaling Sep 23, 2025

allenwang28 approved these changes Sep 23, 2025


Jack-Khuu merged commit 1223473 into main Sep 23, 2025
5 checks passed

JenniferWang reviewed Sep 23, 2025


Jack-Khuu deleted the local-policy-replica branch October 2, 2025 20:26
