Commit 53b6e6f

JimHsiung authored and yq33victor committed
release: update xllm release version to v0.7.0.
1 parent eee3ee9 commit 53b6e6f

2 files changed (+37, -1 lines changed)

RELEASE.md

Lines changed: 36 additions & 0 deletions
@@ -1,3 +1,39 @@
+# Release xllm 0.7.0
+
+## **Major Features and Improvements**
+
+### Model Support
+
+- Support GLM-4.5.
+- Support Qwen3-Embedding.
+- Support Qwen3-VL.
+- Support FluxFill.
+
+### Feature
+- Support MLU backend, currently supports Qwen3 series models.
+- Support dynamic disaggregated PD, with dynamic switching between P and D phases based on strategy.
+- Support multi-stream parallel overlap optimization.
+- Support beam-search capability in generative models.
+- Support virtual memory continuous kv-cache capability.
+- Support ACL graph executor.
+- Support unified online-offline co-location scheduling in disaggregated PD scenarios.
+- Support PrefillOnly Scheduler.
+- Support v1/rerank model service interface.
+- Support communication between devices via shared memory instead of RPC on a single machine.
+- Support function call.
+- Support reasoning output in chat interface.
+- Support top-k+add fusion in the router component of MoE models.
+- Support offline inference for LLM, VLM, and Embedding models.
+- Optimized certain runtime performance.
+
+### Bugfix
+- Skip cancelled requests when processing stream output.
+- Resolve segmentation fault during qwen3 quantized inference.
+- Fix the alignment of monitoring metrics format for Prometheus.
+- Clear outdated tensors to save memory when loading model weights.
+- Fix attention mask to support long sequence requests.
+- Fix bugs caused by enabling scheduler overlap.
+
 # Release xllm 0.6.0
 
 ## **Major Features and Improvements**

version.txt

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-0.6.0
+0.7.0
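
This commit touches two files that must agree: the top heading of RELEASE.md and the contents of version.txt. A minimal sketch of a consistency check is shown below. It is not part of the xllm repository; the helper names `top_release_version` and `versions_match` are hypothetical, and the heading pattern is assumed from the `# Release xllm X.Y.Z` lines visible in this diff.

```python
import re

# Assumed heading shape, taken from the "# Release xllm X.Y.Z" lines in this diff.
HEADING = re.compile(r"#\s+Release\s+xllm\s+(\d+\.\d+\.\d+)\s*$")


def top_release_version(release_md: str):
    """Return the version from the first release heading, or None if absent."""
    for line in release_md.splitlines():
        match = HEADING.match(line.strip())
        if match:
            return match.group(1)
    return None


def versions_match(release_md: str, version_txt: str) -> bool:
    """True when the newest RELEASE.md heading equals the version.txt contents."""
    return top_release_version(release_md) == version_txt.strip()


# Inline sample text mirroring the state after this commit.
sample_release_md = (
    "# Release xllm 0.7.0\n\n"
    "## **Major Features and Improvements**\n\n"
    "# Release xllm 0.6.0\n"
)
print(versions_match(sample_release_md, "0.7.0\n"))
print(versions_match(sample_release_md, "0.6.0\n"))
```

In a real checkout the two strings would be read from RELEASE.md and version.txt; the sample above uses inline text so the sketch runs on its own.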
