
feat: add JSON mode and reasoning_effort support #197

Closed
kolmogorov-quyet wants to merge 2 commits into NevaMind-AI:main from kolmogorov-quyet:feat/json-mode-reasoning-effort

Conversation

@kolmogorov-quyet

Summary

Add two new features to improve LLM output quality and control:

1. JSON Mode (response_format="json_object")

  • Add response_format parameter to summarize() method
  • Enable JSON mode for LLM rankers in retrieve.py:
    • _llm_rank_categories
    • _llm_rank_items
    • _llm_rank_resources
  • Enable JSON mode for _preprocess_conversation in memorize.py
  • Support in openai_sdk.py, wrapper.py, and http_client.py
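A minimal sketch of how the optional `response_format` parameter could be threaded through to `chat.completions.create()` without affecting callers that don't use it (the helper name `build_chat_request` is hypothetical, not from this PR):

```python
def build_chat_request(messages, model, response_format=None):
    """Assemble kwargs for chat.completions.create().

    When response_format is "json_object", the OpenAI-style
    {"type": "json_object"} payload is added; otherwise the
    request is unchanged.
    """
    kwargs = {"model": model, "messages": messages}
    if response_format is not None:
        kwargs["response_format"] = {"type": response_format}
    return kwargs
```

Callers such as the rankers in retrieve.py would then opt in with `response_format="json_object"`, while existing call sites keep their current behavior.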

2. Reasoning Effort (reasoning_effort)

  • Add reasoning_effort field to LLMConfig in settings.py
  • Accepts values: "low", "medium", "high", or None
  • Default value is "high" for better reasoning quality
  • Pass through openai_sdk.py to chat.completions.create()
  • Wire up in service.py when creating LLM clients
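A sketch of the new LLMConfig field under the stated constraints ("low", "medium", "high", or None; default "high"); the actual settings.py class likely has more fields, and the validation shown here is illustrative:

```python
from dataclasses import dataclass
from typing import Optional

VALID_EFFORTS = {"low", "medium", "high"}

@dataclass
class LLMConfig:
    model: str
    # Default "high" per this PR; None means the argument is not sent.
    reasoning_effort: Optional[str] = "high"

    def __post_init__(self):
        if self.reasoning_effort is not None and self.reasoning_effort not in VALID_EFFORTS:
            raise ValueError(f"invalid reasoning_effort: {self.reasoning_effort!r}")
```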

Why these changes?

  1. JSON Mode: Ensures consistent JSON output from LLMs, reducing parsing errors like "No JSON object found"

  2. Reasoning Effort: Enables control over reasoning depth for compatible models (OpenAI o-series, Cerebras, etc.)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@sairin1202
Contributor

Hi kolmogorov-quyet, thanks for the contribution! One small suggestion: it might be better to make these arguments optional, since not all models support them. As it stands, passing them unconditionally can cause errors for some models, for example:
openai.BadRequestError: Error code: 400 - {'error': {'message': 'Unrecognized request argument supplied: reasoning_effort', 'type': 'invalid_request_error', 'param': None, 'code': None}}
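One way to implement the suggestion is to retry once without the optional arguments when the backend rejects them; a sketch under the assumption that the rejection surfaces as an exception whose message contains the 400 error text above (the wrapper name `create_with_fallback` is hypothetical):

```python
# Optional arguments that some models reject with HTTP 400.
OPTIONAL_ARGS = ("reasoning_effort", "response_format")

def create_with_fallback(create_fn, **kwargs):
    """Call create_fn(**kwargs); if the backend reports an unrecognized
    request argument, strip the optional args and retry once."""
    try:
        return create_fn(**kwargs)
    except Exception as exc:
        if "Unrecognized request argument" not in str(exc):
            raise
        retry_kwargs = {k: v for k, v in kwargs.items() if k not in OPTIONAL_ARGS}
        return create_fn(**retry_kwargs)
```

An alternative is to gate the arguments on a per-model capability flag in LLMConfig, which avoids the extra round trip for models known not to support them.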

@kolmogorov-quyet kolmogorov-quyet closed this by deleting the head repository Jan 12, 2026
