
Feat: add command-line arguments for backend parameters #86

Merged
merged 1 commit into from
Aug 12, 2025

Conversation

SyedaAnshrahGillani
Contributor

This pull request enhances the flexibility of the gpt_oss/generate.py script
by replacing hardcoded backend parameters with configurable command-line
arguments.

Previously, the triton and vllm backends had fixed values for context and
tensor_parallel_size, respectively. This made it difficult to adapt the
script to different models or hardware configurations without modifying the
source code.

This PR introduces two new arguments:

  • --context-length: Allows customization of the context length for the triton
    backend (defaults to 4096).
  • --tensor-parallel-size: Allows customization of the tensor parallel size for
    the vllm backend (defaults to 2).

Additionally, the variable decoded_token has been renamed to token_text for
improved clarity.

These changes make the generation script more versatile and user-friendly,
allowing for easier experimentation with different backend settings.

@dkundel-openai dkundel-openai merged commit 4195fb3 into openai:main Aug 12, 2025
Danztee pushed a commit to Danztee/gpt-oss that referenced this pull request Aug 12, 2025