Conversation


@Arui1122 commented on Oct 1, 2025

Related Issues or Context

Fixes a token counting issue for OpenAI-compatible APIs in streaming mode.

OpenAI-compatible providers (like LiteLLM) do not return token usage by default in streaming responses. This causes Dify to fall back to token estimation instead of using the actual count from the provider.
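For context: the OpenAI Chat Completions API, and compatible servers that implement it, only emit a final usage chunk in a streaming response when the request sets stream_options to {"include_usage": true}. Below is a minimal sketch of reading that usage from a compatible endpoint; the base URL, API key, and model name are placeholders for illustration, not values from this PR.

```python
import json

import requests

# Placeholder endpoint and key; any OpenAI-compatible provider
# (LiteLLM, vLLM, etc.) exposes the same /chat/completions route.
API_BASE = "http://localhost:4000/v1"
API_KEY = "sk-placeholder"

resp = requests.post(
    f"{API_BASE}/chat/completions",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "model": "gpt-4o-mini",
        "messages": [{"role": "user", "content": "Hello"}],
        "stream": True,
        # Without this, OpenAI-compatible providers typically omit token
        # usage from streaming responses entirely.
        "stream_options": {"include_usage": True},
    },
    stream=True,
    timeout=60,
)

usage = None
for line in resp.iter_lines():
    if not line or not line.startswith(b"data: "):
        continue
    payload = line[len(b"data: "):]
    if payload == b"[DONE]":
        break
    chunk = json.loads(payload)
    # The usage chunk arrives last, after all content deltas, with an
    # empty "choices" list.
    if chunk.get("usage"):
        usage = chunk["usage"]

print(usage)  # e.g. {"prompt_tokens": 9, "completion_tokens": 12, "total_tokens": 21}
```

When the flag is absent, no chunk ever carries a usage object, which is exactly why Dify falls back to estimation.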

This PR contains Changes to Non-Plugin Files

  • Documentation
  • Other

This PR contains Changes to Non-LLM Models Plugin

  • I have Run Comprehensive Tests Relevant to My Changes

This PR contains Changes to LLM Models Plugin

  • My Changes Affect Message Flow Handling (System Messages and User→Assistant Turn-Taking)

  • My Changes Affect Tool Interaction Flow (Multi-Round Usage and Output Handling, for both Agent App and Agent Node)

  • My Changes Affect Multimodal Input Handling (Images, PDFs, Audio, Video, etc.)

  • My Changes Affect Multimodal Output Generation (Images, Audio, Video, etc.)

  • My Changes Affect Structured Output Format (JSON, XML, etc.)

  • My Changes Affect Token Consumption Metrics

  • My Changes Affect Other LLM Functionalities (Reasoning Process, Grounding, Prompt Caching, etc.)

  • Other Changes (Add New Models, Fix Model Parameters etc.)

Version Control

  • I have Bumped Up the Version in Manifest.yaml (Top-Level Version Field, Not in Meta Section)

Dify Plugin SDK Version

  • I have Ensured dify_plugin>=0.3.0,<0.5.0 is in requirements.txt (SDK docs)

Environment Verification

Local Deployment Environment

  • Dify Version is: 1.7.1; I have Tested My Changes on a Local Dify Deployment with a Clean Environment That Matches the Production Configuration

SaaS Environment

  • I have Tested My Changes on cloud.dify.ai with a Clean Environment That Matches the Production Configuration

@dosubot (bot) added the size:L (This PR changes 100-499 lines, ignoring generated files) and bug (Something isn't working) labels on Oct 1, 2025
@crazywoola requested a review from Copilot on October 8, 2025 02:22

Copilot AI left a comment


Pull Request Overview

This PR fixes token counting for OpenAI-compatible APIs in streaming mode by requesting usage data from the provider via stream_options and collecting it from the streamed response.

Key changes:

  • Overrides the _generate method to request usage data in streaming mode via stream_options (a sketch follows this list)
  • Adds comprehensive OpenAI-compatible API handling with proper headers, authentication, and response formatting
  • Updates the manifest version to reflect the fix
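To illustrate the first point, here is a minimal sketch of such an override. The import path, class name, and _generate signature below are assumptions based on the Dify plugin SDK's OpenAI-compatible base class, not the PR's verbatim code; the actual change also reimplements headers, authentication, and response formatting, per the notes above.

```python
# Assumed import path and signature, following dify_plugin SDK conventions;
# not the PR's verbatim code.
from dify_plugin.interfaces.model.openai_compatible.llm import (
    OAICompatLargeLanguageModel,
)


class OpenAICompatibleLargeLanguageModel(OAICompatLargeLanguageModel):
    def _generate(self, model, credentials, prompt_messages,
                  model_parameters, tools=None, stop=None,
                  stream=True, user=None):
        if stream:
            # Copy before mutating so the caller's dict is untouched, then
            # ask the provider to append a final usage chunk so Dify can
            # record actual token counts instead of estimating them.
            model_parameters = dict(model_parameters)
            model_parameters["stream_options"] = {"include_usage": True}
        return super()._generate(
            model, credentials, prompt_messages, model_parameters,
            tools=tools, stop=stop, stream=stream, user=user,
        )
```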

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

Files:

  • models/openai_api_compatible/models/llm/llm.py: Implements a custom _generate method with streaming usage data collection and complete OpenAI API compatibility
  • models/openai_api_compatible/manifest.yaml: Bumps the version from 0.0.22 to 0.0.23


