
Conversation

@max-wittig (Contributor) commented Aug 20, 2025

Some LLMs don't support all the params that clients might send. This PR gives the administrator the ability to filter those out.

Similar to https://docs.litellm.ai/docs/completion/drop_params

Tested using

python3 -m vllm_router.app --port "$1" \
    --service-discovery static \
    --static-backends "http://localhost:11434" \
    --static-models "qwen3" \
    --static-model-types "chat" \
    --drop-params "prompt" \
    --routing-logic roundrobin
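Conceptually, the feature boils down to filtering the parsed request body before it is forwarded to the backend. The sketch below is illustrative only, not the PR's actual implementation; the function name and body shape are hypothetical:

```python
def drop_params(request_json: dict, params_to_drop: list[str]) -> dict:
    """Return a copy of the request body without the configured params."""
    return {k: v for k, v in request_json.items() if k not in params_to_drop}

# With --drop-params "prompt", a legacy "prompt" field is stripped before
# the request is forwarded; all other params pass through untouched.
body = {"model": "qwen3", "prompt": "hi", "temperature": 0.2}
filtered = drop_params(body, ["prompt"])
# filtered == {"model": "qwen3", "temperature": 0.2}
```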

  • Make sure the code changes pass the pre-commit checks.
  • Sign off your commit by using -s when running git commit.
  • Try to classify PRs for easy understanding of the type of changes, such as [Bugfix], [Feat], and [CI].
Detailed Checklist

Thank you for your contribution to production-stack! Before submitting the pull request, please ensure the PR meets the following criteria. This helps us maintain the code quality and improve the efficiency of the review process.

PR Title and Classification

Please classify PRs so the type of change is easy to understand. Prefix the PR title with one of the following:

  • [Bugfix] for bug fixes.
  • [CI/Build] for build or continuous integration improvements.
  • [Doc] for documentation fixes and improvements.
  • [Feat] for new features in the cluster (e.g., autoscaling, disaggregated prefill, etc.).
  • [Router] for changes to the vllm_router (e.g., routing algorithm, router observability, etc.).
  • [Misc] for PRs that do not fit the above categories. Please use this sparingly.

Note: If the PR spans more than one category, please include all relevant prefixes.

Code Quality

The PR needs to meet the following code quality standards:

  • Pass all linter checks. Please use pre-commit to format your code. See README.md for installation.
  • The code needs to be well-documented so that future contributors can easily understand it.
  • Please include sufficient tests to ensure the change stays correct and robust. This includes both unit tests and integration tests.

DCO and Signed-off-by

When contributing changes to this project, you must agree to the DCO. Commits must include a Signed-off-by: header which certifies agreement with the terms of the DCO.

Using -s with git commit will automatically add this header.

What to Expect for the Reviews

We aim to address all PRs in a timely manner. If no one reviews your PR within 5 days, please @-mention one of YuhanLiu11, Shaoting-Feng, or ApostaC.

@max-wittig force-pushed the feat/drop_params branch 2 times, most recently from 2384a1b to ce957f1 on August 20, 2025 at 13:37
@max-wittig marked this pull request as ready for review on August 20, 2025 at 13:54
@max-wittig (Contributor, Author)

@YuhanLiu11 WDYT?

@YuhanLiu11 (Collaborator)

@max-wittig This would be nice to have. Can you write some tests for this feature?

@max-wittig (Contributor, Author)

@YuhanLiu11 I wanted to add some unittests, but the whole route_general_request has no tests at the moment.

Is there any place where you could point me to where to add them?

@YuhanLiu11 (Collaborator)

> @YuhanLiu11 I wanted to add some unittests, but the whole route_general_request has no tests at the moment.
>
> Is there any place where you could point me to where to add them?

You can add the tests in https://github.com/vllm-project/production-stack/blob/main/tests/e2e/test-routing.py: send a request that specifies params to drop and check that the router did drop those params from the request.
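The assertion such a test would make can be sketched as follows. The helper name is hypothetical, and in a real e2e test received_body would be captured from the backend (or a mock) rather than hard-coded:

```python
def assert_params_dropped(sent: dict, received: dict, dropped: list[str]) -> None:
    """Verify the router removed exactly the configured params and nothing else."""
    for param in dropped:
        assert param not in received, f"router should have dropped {param!r}"
    for key, value in sent.items():
        if key not in dropped:
            assert received[key] == value, f"router should not touch {key!r}"

sent = {"model": "qwen3", "messages": [], "prompt": "legacy"}
received = {"model": "qwen3", "messages": []}  # what the backend saw
assert_params_dropped(sent, received, ["prompt"])
```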

request_body = json.dumps(request_json)
update_content_length(request, request_body)

# TODO (ApostaC): merge two awaits into one
A contributor commented on the diff above:

Could you please remove this comment? It is not relevant now.

@max-wittig force-pushed the feat/drop_params branch 9 times, most recently from a99afa2 to 828ce8c on September 3, 2025 at 10:53
Some LLMs don't support all the params that clients might send. This PR gives the administrator the ability to filter those out.

Similar to https://docs.litellm.ai/docs/completion/drop_params

Signed-off-by: Max Wittig <[email protected]>
Signed-off-by: Max Wittig <[email protected]>
@max-wittig marked this pull request as draft on September 3, 2025 at 12:11