Release date: 2024-05-20
Version 1.4.0 focuses on API enhancements, with core highlights including comprehensive function call support, API caching capabilities, and code execution for Google Gemini models. This release also includes multiple performance optimizations and bug fixes.
- Comprehensive Function Call Support: All API endpoints now support function calls, including the general OpenAI-compatible chat completion API, Google Gemini API, and Azure OpenAI Response API
- API Caching Feature: New caching capability via the `/v1-cached` and `/v1-cached-createOnly` endpoints, with significantly optimized performance
- Google Gemini Code Execution: Google Gemini models now support code execution
- Performance Optimizations: Multiple API performance improvements, including async processing and reduced duplicate database calls
This version implements function call support across all API endpoints:
OpenAI Compatible API
- General chat completion API (`/v1/chat/completions`) fully supports tool calls
- New `ToolCallSegment` type for tool call fragments in streaming responses
- Optimized function call response logic to ensure correct first-response timing
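As a minimal sketch of what a tool-call request to the OpenAI-compatible endpoint looks like, the payload below follows the standard chat-completions tool-calling format; the model name and the `get_weather` function are illustrative placeholders, not part of this project:

```python
import json

# Illustrative tool-call request body for POST /v1/chat/completions.
# Model name and function definition are placeholders.
payload = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "What's the weather in Paris?"}],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Look up current weather for a city",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
    # Streamed responses deliver tool calls as fragments that the
    # client (or the ToolCallSegment type server-side) reassembles.
    "stream": True,
}

body = json.dumps(payload)
```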
Google AI / Gemini API
- Google AI now supports tool calls
- Google Gemini models now support code execution functionality
- Both tool calls and code execution work seamlessly in chat completion flows
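For code execution, Google's generative-language REST format enables the capability with an empty `code_execution` tool entry; the sketch below assumes the gateway forwards this shape unchanged, and the prompt text is illustrative:

```python
import json

# Illustrative generateContent request body enabling Gemini code execution.
gemini_request = {
    "contents": [
        {"role": "user", "parts": [{"text": "Compute the sum of the first 50 primes."}]}
    ],
    # An empty code_execution object asks the model to write and run code itself.
    "tools": [{"code_execution": {}}],
}

body = json.dumps(gemini_request)
```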
Azure OpenAI Response API
- Response API now supports tool calls
- Verified through testing that the Response API works correctly
- Fixed issue where reasoning content was not displayed
To improve API performance and reduce costs, a new user-level API caching mechanism has been added:
Cache Endpoints
- `/v1-cached`: Chat completion endpoint with caching enabled
- `/v1-cached-createOnly`: Chat completion endpoint for cache creation only
Cache Features
- New `UserApiCache` entity for storing user API request caches
- Cache system supports tool call scenarios
- Added cache metrics for monitoring cache effectiveness
- Cache usage saved asynchronously to avoid blocking main flow
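The asynchronous usage-save pattern can be sketched as a fire-and-forget background task, so the response path never waits on the statistics write. The function names here are illustrative, not the project's actual API:

```python
import asyncio

usage_log: list[str] = []  # stands in for the cache-usage statistics store

async def record_cache_usage(cache_key: str) -> None:
    # Stands in for an async DB write of cache-usage statistics.
    usage_log.append(cache_key)

async def handle_request(cache_key: str) -> str:
    # Fire-and-forget: scheduling the write does not block the response.
    asyncio.create_task(record_cache_usage(cache_key))
    return f"cached-response-for-{cache_key}"

async def main() -> str:
    result = await handle_request("user42:prompt-hash")
    # Yield briefly so the background task finishes before the loop closes;
    # a long-lived server event loop would not need this step.
    await asyncio.sleep(0.05)
    return result

result = asyncio.run(main())
```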
Performance Improvements
- Eliminated duplicate authentication database calls
- Async client info fetching to optimize API response speed
- Multiple cache performance optimizations significantly improve API throughput
Logging Optimizations
- Ignore routine log output from OpenAI compatible controller
- Suppress error logs for o3/o4-mini models
- Added necessary warning prompts
Monitoring Enhancements
- Added cache metrics statistics
- Improved observability of tool call processes
Message Processing
- Added merge logic in full chat completion
- Fixed issue where reasoning content was not displayed
- Optimized first response timing
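The merge logic for streamed tool calls can be sketched as accumulating per-index fragments into complete calls, in the spirit of the `ToolCallSegment` type described above; the fragment field names mirror OpenAI-style streaming deltas and are an assumption, not the project's internal schema:

```python
def merge_tool_call_segments(segments: list[dict]) -> list[dict]:
    """Accumulate per-index tool-call fragments into full tool calls."""
    calls: dict[int, dict] = {}
    for seg in segments:
        call = calls.setdefault(seg["index"], {"name": "", "arguments": ""})
        if seg.get("name"):
            call["name"] = seg["name"]          # name arrives once, in one fragment
        call["arguments"] += seg.get("arguments", "")  # argument JSON arrives in chunks
    return [calls[i] for i in sorted(calls)]

# Example: three fragments for one streamed tool call.
fragments = [
    {"index": 0, "name": "get_weather", "arguments": ""},
    {"index": 0, "arguments": '{"city": '},
    {"index": 0, "arguments": '"Paris"}'},
]
merged = merge_tool_call_segments(fragments)
# merged[0] == {"name": "get_weather", "arguments": '{"city": "Paris"}'}
```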
Compatibility
- Improved compatibility with various model providers
- Ensured tool calls work correctly in different scenarios
- `UserApiCache`: User API cache entity
- `ToolCallSegment`: Tool call segment type
- Added: `/v1-cached` - Chat completion with caching enabled
- Added: `/v1-cached-createOnly` - Chat completion for cache creation only
- Added `UserApiCache`-related table structures
- Added cache usage statistics fields
- Caching Feature: To use API caching, call the new `/v1-cached` or `/v1-cached-createOnly` endpoints
- Tool Calls: All APIs now support function calls without special configuration
- Performance Improvements: This version includes multiple performance optimizations; API response speed will be noticeably improved after upgrade
- The caching feature is experimental; thorough testing is recommended before enabling it in production
- Google Gemini code execution functionality requires model support
View the complete commit history: 1.3.1.794...1.4.0.815
Key commits:
- suppress o3/o4-mini error log
- confirmed response api works
- google gemini supports code execution
- correct function call response logic
- correct first response tick
- google ai now supports tool call
- response api also supports tool call
- fix reasoning content not show issue
- save cache usage in async way
- avoid duplicated authentication db call for api
- add cache metrics
- optimize performance by async client info call
- cache also supports tool call
- confirmed cache working
- implement cache support
- initial commit of google ai tool
- Add merge logic in full chat completion
- add ToolCallSegment
- initial commit of UserApiCache