-
Couldn't load subscription status.
- Fork 35
Add --data-parallel-size to args if DP_SIZE gt 1 #126
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM @yangligt2 thank you for the PR. Once you rebase and trigger the CI again, the linting error should go away.
|
Thanks @yangligt2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually @yangligt2 could you update the chart version?
Thank you! Bumped the chart version. |
|
@yangligt2 Hey sorry just got to this. Could you run |
Signed-off-by: YANG LI <[email protected]>
|
Bumped up the chart version to align with current main. Did make generate/make verify/make pre-commit-run |
| {{- $dataParallelism := int (include "llm-d-modelservice.dataParallelism" .container.parallelism) -}} | ||
| {{- if gt (int $dataParallelism) 1 }} | ||
| - --data-parallel-size | ||
| - "$DP_SIZE" | ||
| {{- end }} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The context is wrong. See #139.
| {{- $dataParallelism := int (include "llm-d-modelservice.dataParallelism" .container.parallelism) -}} | ||
| {{- if gt (int $dataParallelism) 1 }} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The context is wrong. See #139.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, cannot pass env variable; use {{ $dataParallelism | quote }}
Currently we have logic to read TP_SIZE, set this as environment variable, and also add to serve args. But this part is missing for DP_SIZE.
And with current vllm image from llm-d, if DP_SIZE and --data-parallel-size are of different value, the vllm server won't crash, but generate gibberish response like below:
{"choices":[{"finish_reason":"length","index":0,"logprobs":null,"prompt_logprobs":null,"prompt_token_ids":null,"stop_reason":null,"text":"?\n\n?\n and,\n,,\n.\n\n or, not (\n for\n .\n:,.\n .\n\n or.\n:.\n.\n\n:\n\n,.\n,.\n, , and:\n\n is,\n., .\n,,\n and and, and:\n,.\n.\n:\n\n.. is,,:\n and\n and,. is,,, and,.\n\n\n\n and and:. ,.\n, is.,\n.\n, is ,..\n,,,\n has.\n, can is\n,\n\n is.\n and.: is: and , (:\n and and.\n\n:\n\n\n.\n,.,.\n.\n,, and , and is, and\n .\n ,\n:\n.\n.\n\n\n\n\n is, was.\n. .\n, and, and\n, and is is and is,\n,,, ,\n,,. and.\n: is not,.\n .\n\n\n and, and,, .\n is and\n.\n.\n and:\n ,:\n: is (\n and :\n and. and is:\n and \n, not is and \n and\n\n:\n:\n\n, is at ', is is:, is,.\n and is is\n\n\n is\n\n and\n is ,\n and is.\n.\n and was and:\n for was is,\n:\n.\n is.\n.\n:\n\n.\n,, (.\n\n., and and. is.\n.\n is .\n*\n is\n.\n\n.\n is at:\n is.\n.\n:\n.\n.\n,:\n., a and,.\n is.\n,.\n .\n\n and and.\n\n: and\n is\n and.\n.\n can\n .\n, and and\n is .\n,.\n.\n .\n.\n\n:\n (,.\n. ,, has..\n.\n to and\n and,.\n,.,:\n,, is .\n,. is with, and and,,\n\n\n , is, is.\n,.\n is\n.\n :\n,,.\n:\n.\n \n of and.\n.\n and.\n,.\n:.\n and a and. the:\n and,:\n\n and.\n.\n :. and,.\n: with.\n.\n.\n\n is :\n and.\n is and and .\n is .\n\n of a\t is and.\n.\n not.\n,\n.\n and .\n has.\n :\n\n\n and.\n.\n\n to: comes is it is.\n\n, is:\n .\n.\n.\n - is of,.\n is and and the is and or.\n\\\\\\\\.\n the the .\n and is is\n","token_ids":null}],"created":1758916157,"id":"cmpl-9ef46020-d934-4d34-b58b-2ff9d37c252d","kv_transfer_params":null,"model":"meta-llama/Llama-3.3-70B-Instruct","object":"text_completion","service_tier":null,"system_fingerprint":null,"usage":{"completion_tokens":512,"prompt_tokens":15,"prompt_tokens_details":null,"total_tokens":527}}