
fix: GPT-5 model config and streaming #277

Merged
longwind48 merged 3 commits into stg from chore/gpt_5
Mar 9, 2026

Conversation


@longwind48 (Collaborator) commented Mar 8, 2026

Summary

  • Configure GPT-5 reasoning-model parameters: explicit temperature=1.0 (overrides the LangChain default of 0.7, which GPT-5 rejects) and max_completion_tokens=4096 (up from 512, needed for reasoning overhead)
  • Filter empty content chunks during streaming to prevent client-side errors from GPT-5's reasoning tokens
  • Read model name from AZURE_OPENAI_DEPLOYMENT_NAME env var instead of hardcoding
  • Clean up .env.example: remove unused vars (GAR_IMAGE, GAR_MEMORY), add missing vars (embedding model, FB_CLIENT_EMAIL, Slack, processor URL)
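
The configuration changes above can be sketched roughly as follows. This is a minimal stdlib-only sketch, not the repository's actual config.py: the `build_model_config` helper and the `"gpt-5"` fallback are illustrative assumptions; only `AZURE_OPENAI_DEPLOYMENT_NAME` and the two parameter values come from the PR.

```python
import os

# Sketch of the model settings described above; names are illustrative.
GPT5_DEFAULTS = {
    # GPT-5 only accepts temperature=1; set it explicitly so a library
    # default (e.g. LangChain's 0.7) is never sent and rejected with a
    # BadRequestError.
    "temperature": 1.0,
    # Reasoning tokens count against the completion budget, so the old
    # 512-token cap left little room for visible output; raise it to 4096.
    "max_completion_tokens": 4096,
}

def build_model_config(env=os.environ):
    """Read the deployment name from the environment instead of hardcoding it."""
    return {
        # "gpt-5" here is a placeholder fallback, not the real deployment name.
        "model": env.get("AZURE_OPENAI_DEPLOYMENT_NAME", "gpt-5"),
        **GPT5_DEFAULTS,
    }
```

Passing `env` explicitly keeps the helper testable without mutating the process environment.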

Known issues

Chat UI does not render markdown formatting — response text appears as a wall of text with no newlines, bullet points, or headings. This makes responses hard to read, especially for longer answers. See #278.

Test plan

  • Verified chatbot streaming works end-to-end with GPT-5 (finish_reason: stop)
  • Confirmed no BadRequestError for unsupported temperature values
  • Confirmed response content flows correctly (no empty chunks sent to client)

🤖 Generated with Claude Code

longwind48 and others added 2 commits March 8, 2026 14:41

GPT-5 is a reasoning model that does not support temperature, top_p,
presence_penalty, or frequency_penalty parameters. Remove these and
use max_completion_tokens instead of max_tokens.

GPT-5 is a reasoning model that requires temperature=1 (explicit to
override the LangChain default of 0.7) and a higher max_completion_tokens
(4096) to accommodate reasoning overhead. Also filter empty content
chunks during streaming to prevent client-side errors.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Read AZURE_OPENAI_DEPLOYMENT_NAME from env instead of hardcoding
  the model name in config.py
- Remove unused GAR_IMAGE/GAR_MEMORY from .env.example
- Add missing env vars: embedding model, FB_CLIENT_EMAIL, Slack,
  and processor service URL

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@longwind48 longwind48 requested a review from yevkim March 8, 2026 07:59

@yevkim (Collaborator) left a comment


LGTM

@longwind48 longwind48 merged commit dbe37c4 into stg Mar 9, 2026
5 checks passed
@longwind48 longwind48 deleted the chore/gpt_5 branch March 9, 2026 14:49
