Releases: shell-nlp/gpt_server
gpt_server v0.3.6
11 Feb 02:50
What's Changed
vLLM backend supports the QVQ model #28
Updated vllm==0.7.2, lmdeploy==0.7.0.post3, transformers==4.48.2, pynvml==12.0.0
Fixed a bug in lmdeploy model-type parsing
Added support for text moderation models
Added support for edge-TTS
HF backend supports guided_decoding
Switched to a high-performance JSON serialization library
Architecture improvements
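The new text moderation support suggests an OpenAI-style moderation request. A minimal sketch of such a payload, assuming gpt_server follows the OpenAI `/v1/moderations` schema (the model name here is a hypothetical placeholder, not taken from the release notes):

```python
import json

# Hypothetical request body for an OpenAI-compatible /v1/moderations route.
# The model name "text-moderation" is an assumption, not from the release notes.
payload = {
    "model": "text-moderation",
    "input": "Some user-generated text to screen.",
}

# Serialize for an HTTP POST; ensure_ascii=False preserves any CJK text.
body = json.dumps(payload, ensure_ascii=False)
print(body)
```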
gpt_server v0.3.5
20 Dec 08:17
What's Changed
Compatible with the v1/rerank endpoint #25 #6
Fixed a glm4 inference issue #21
Updated infinity==0.0.73, vllm==0.6.5
Added support for Phi-4
Improved Function Calling
Migrated project management from pip to uv
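The v1/rerank endpoint commonly follows the Cohere/infinity-style request shape. A minimal sketch of a request body under that assumption (the model name is a placeholder; field names are not confirmed by the release notes):

```python
import json

# Cohere/infinity-style rerank request; the field names are an assumption
# based on common v1/rerank conventions, not taken from gpt_server docs.
payload = {
    "model": "bge-reranker",  # placeholder model name
    "query": "What is gpt_server?",
    "documents": [
        "gpt_server is an LLM inference framework.",
        "Bananas are yellow.",
    ],
}

# Serialize for an HTTP POST to /v1/rerank.
body = json.dumps(payload, ensure_ascii=False)
print(body)
```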
gpt_server v0.3.4
18 Nov 15:01
What's Changed
Implemented guided_decoding via response_format #17
Fixed abnormal glm4 model inference #21
Upgraded vllm==0.6.4.post1
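Guided decoding is typically driven through the OpenAI-style `response_format` field. A minimal sketch of a chat request asking for JSON output, assuming the standard OpenAI chat-completions schema (the model name is a placeholder):

```python
import json

# OpenAI-style chat request using response_format to constrain output to JSON.
# The model name "qwen" is a placeholder, not from the release notes.
payload = {
    "model": "qwen",
    "messages": [
        {"role": "user", "content": 'Return {"city": ..., "temp": ...} for Beijing.'}
    ],
    "response_format": {"type": "json_object"},
}

# Serialize for an HTTP POST to /v1/chat/completions.
body = json.dumps(payload)
print(body)
```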
gpt_server v0.3.3
30 Oct 14:54
What's Changed
Added visual (web UI) configuration
Reworked the configuration file layout
Upgraded lmdeploy==0.6.2
Exposed the dtype setting #19
Architecture improvements
gpt_server v0.3.2
15 Oct 13:54
What's Changed
Added multimodal models: Qwen VL #14 and minicpmv
Added the puff embedding model
Fixed a bug in the lmdeploy backend
Exposed the prefix_caching capability
Architecture improvements
gpt_server v0.3.1
28 Jul 15:40
What's Changed
Added the Infinity backend: faster inference than onnx/tensorrt, with dynamic batching
Added the LMDeploy PyTorch backend for the multimodal model glm-4v-gb #11
Cleaned up requirements #12
Improved the structure of the configuration file
gpt_server v0.2.2
24 Jun 08:34
Updates
The only open-source framework worldwide supporting Tools (Function Calling); compatible with LangChain's bind_tools, AgentExecutor, and with_structured_output usage (currently supports the Qwen and GLM series) #4
Fixed an Embedding device-loading error #8
Added support for the Qwen2 and GLM4 models
Fixed a completion issue
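LangChain's `bind_tools` ultimately sends an OpenAI-style `tools` array on the wire. A minimal sketch of that format, assuming the standard OpenAI function-tool schema (the model name and the `get_weather` tool are purely illustrative):

```python
import json

# OpenAI-style tools declaration: the wire format that LangChain's bind_tools
# produces. The get_weather tool and "qwen" model name are illustrative only.
payload = {
    "model": "qwen",  # placeholder model name
    "messages": [{"role": "user", "content": "What's the weather in Beijing?"}],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Look up current weather for a city.",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
}

# Serialize for an HTTP POST to /v1/chat/completions.
body = json.dumps(payload)
print(body)
```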
gpt_server v0.2.1
23 May 05:47
1. Added the LMDeploy backend
2. Fixed the limit on the number of requests
3. Fixed the logs directory error