antznette1
Contributor

  • Add stream=false support in POST /v1/responses returning a single OpenAI Response.
  • Emit streaming deltas consistently; avoid placeholder logprobs by using empty lists.
  • Stabilize tests: robust event checks; deterministic full request.

Files:

  • src/transformers/commands/serving.py
  • tests/commands/test_serving.py

Notes: Minimal diff, aligns with AGENTS.md. Improves OpenAI parity.
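For reference, a minimal client-side sketch of the non-streaming path described above. It assumes the server is reachable at `http://localhost:8000` and uses a placeholder model id. Both values, and the helper names `build_responses_request` and `fetch_response`, are hypothetical and not taken from this PR. With `stream` set to `false`, the endpoint is expected to return one JSON Response object rather than a stream of events.

```python
import json
import urllib.request

# Assumed values for illustration only; adjust to your local setup.
BASE_URL = "http://localhost:8000"
MODEL = "your-org/your-model"  # hypothetical placeholder model id


def build_responses_request(prompt: str, stream: bool = False) -> urllib.request.Request:
    """Build a POST /v1/responses request; stream=False asks for a single JSON body."""
    payload = {"model": MODEL, "input": prompt, "stream": stream}
    return urllib.request.Request(
        f"{BASE_URL}/v1/responses",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_response(prompt: str) -> dict:
    """Send the non-streaming request and decode the single Response object."""
    request = build_responses_request(prompt, stream=False)
    with urllib.request.urlopen(request) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_response("Hello")` against a running server should yield a dict parsed from the single response body; the exact fields depend on the server and are not shown here.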

@Rocketknight1
Member

cc @gante @LysandreJik

Member

@gante left a comment

Thank you for the clean PR 🙏 LGTM!

@gante enabled auto-merge (squash) October 6, 2025 15:55
@gante merged commit e00f46f into huggingface:main Oct 6, 2025
17 checks passed
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@antznette1
Contributor Author

Hi! The docs preview link for this PR isn’t opening. I’m seeing the “INDEX doesn’t exist in pr_41353” message.

4 participants