@pei0033 pei0033 commented Jun 11, 2025

What does the PR do?

This PR adds guided decoding support to the OpenAI frontend, so users can constrain model outputs to a specific format through the OpenAI-compatible API.
The implementation supports both the vLLM and TensorRT-LLM backends and four guide types: JSON schema, regex pattern, choice-based selection, and EBNF grammar.

Checklist

  • I have read the Contribution guidelines and signed the Contributor License
    Agreement
  • PR title reflects the change and is of format <commit_type>: <Title>
  • Changes are described in the pull request.
  • Related issues are referenced.
  • Populated github labels field
  • Added test plan and verified test passes.
  • Verified that the PR passes existing CI.
  • I ran pre-commit locally (pre-commit install, pre-commit run --all)
  • Verified copyright is correct on all changed files.
  • Added succinct git squash message before merging ref.
  • All template sections are filled out.
  • Optional: Additional screenshots for behavior/output changes with before/after.

Commit Type:

Check the conventional commit type box here and add the label to the github PR.

  • build
  • ci
  • docs
  • feat
  • fix
  • perf
  • refactor
  • revert
  • style
  • test

Related PRs:

N/A

Where should the reviewer start?

Please focus on these key files:

  1. python/openai/openai_frontend/schemas/openai.py - Review the new schema fields guided_decoding_guide_type and guided_decoding_guide added to both completion request models
  2. python/openai/openai_frontend/engine/utils/triton.py - Check the implementation of guided decoding integration for both vLLM and TensorRT-LLM backends
  3. python/openai/README.md - Verify the comprehensive documentation and examples for different guide types
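
For orientation while reviewing the schema change, here is a minimal sketch of a completion request body that carries the two new fields. Only the field names `guided_decoding_guide_type` and `guided_decoding_guide` come from this PR; the guide type string `"json_schema"`, passing the schema as a JSON string, the model name, and the helper `build_guided_request` are assumptions for illustration. The README in this PR is the reference for the accepted values.

```python
import json


def build_guided_request(model, prompt, guide_type, guide):
    """Build a completion request body with the two guided decoding fields.

    Hypothetical helper: it only assembles a dict and sends nothing.
    """
    return {
        "model": model,
        "prompt": prompt,
        "max_tokens": 64,
        # Field names taken from the PR description.
        "guided_decoding_guide_type": guide_type,
        "guided_decoding_guide": guide,
    }


# Assumed example guide: a small JSON schema, serialized to a string.
schema = {
    "type": "object",
    "properties": {"city": {"type": "string"}},
    "required": ["city"],
}

payload = build_guided_request(
    model="my-model",  # placeholder model name
    prompt="Name a city as JSON.",
    guide_type="json_schema",  # assumed value, check the README
    guide=json.dumps(schema),
)

# The serialized body is what a client would POST to the completions endpoint.
body = json.dumps(payload)
```

The same two fields would be attached to a chat completion request in the same way, since the PR adds them to both request models.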

Test plan:

Please follow the examples in python/openai/README.md.

Caveats:

  1. Different usage patterns per backend:
  2. Guided decoding may not function properly when used in conjunction with tool calling
  3. Currently relies on backend validation; frontend doesn't validate guide format compatibility
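
Because of caveat 3, a client may want a coarse sanity check before sending a guide to the backend. The sketch below is not part of this PR: the function `check_guide` and the guide type names `"regex"`, `"json_schema"`, `"choice"`, and `"grammar"` are hypothetical, and Python's `re` dialect may differ from the one a backend accepts, so a pass here does not guarantee that the backend will accept the guide.

```python
import json
import re


def check_guide(guide_type, guide):
    """Return a list of problems found; an empty list means nothing obvious.

    Hypothetical client-side check. Guide type names are assumptions.
    """
    problems = []
    if guide_type == "regex":
        if not isinstance(guide, str):
            problems.append("regex guide must be a string")
        else:
            try:
                re.compile(guide)
            except re.error as exc:
                problems.append(f"invalid regex: {exc}")
    elif guide_type == "json_schema":
        try:
            parsed = json.loads(guide) if isinstance(guide, str) else guide
        except json.JSONDecodeError as exc:
            problems.append(f"schema is not valid JSON: {exc}")
        else:
            # Coarse check only: this does not validate the schema itself.
            if not isinstance(parsed, dict):
                problems.append("schema should be a JSON object")
    elif guide_type == "choice":
        is_str_list = isinstance(guide, list) and all(
            isinstance(c, str) for c in guide
        )
        if not is_str_list or not guide:
            problems.append("choice guide must be a non-empty list of strings")
    elif guide_type == "grammar":
        # EBNF is not parsed here; only an empty grammar is rejected.
        if not isinstance(guide, str) or not guide.strip():
            problems.append("grammar guide must be a non-empty string")
    else:
        problems.append(f"unknown guide type: {guide_type!r}")
    return problems


if __name__ == "__main__":
    for gtype, g in [("regex", "[a-z"), ("choice", ["yes", "no"])]:
        print(gtype, check_guide(gtype, g))
```

A check like this only catches malformed guides early; compatibility between a guide and a specific backend still has to be confirmed against that backend.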

Background

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)
