Commit 0a53f41

only Acknowledgement remain
Signed-off-by: zRzRzRzRzRzRzR <[email protected]>
1 parent c6531d2 commit 0a53f41

1 file changed: +13 -3 lines changed

_posts/2025-08-15-glm45-vllm.md

Lines changed: 13 additions & 3 deletions
@@ -1,11 +1,11 @@
 ---
 layout: post
-title: "Use vLLM to speed "
+title: "Use vLLM to deploy GLM-4.5 and GLM-4.5V model"
 author: "Yuxuan Zhang"
 image: /assets/logos/vllm-logo-text-light.png
 ---
 
-# Introduction
+# Model Introduction
 
 The GLM-4.5 series models are foundation models designed for intelligent agents. GLM-4.5 has 355 billion total
 parameters with 32 billion active parameters, while GLM-4.5-Air adopts a more compact design with 106 billion total
@@ -98,6 +98,16 @@ In the response, the special tokens `<|begin_of_box|>` and `<|end_of_box|>` are
 the answer. The bracket style may vary ([], [[]], (), <>, etc.), but the meaning is the same: to enclose the coordinates
 of the box.
 
+## Cooperation with vLLM and Z.ai Team
+
+During the release of the GLM-4.5 and GLM-4.5V models, the vLLM team worked closely with the Z.ai team, providing
+extensive support in addressing issues related to the model launch.
+The GLM-4.5 and GLM-4.5V models provided by the Z.ai team were modified in the vLLM implementation PR, including (but
+not limited to) resolving [CUDA Core Dump](./2025-08-11-cuda-debugging.md) debugging issues and FP8 model accuracy
+alignment problems.
+They also ensured that the vLLM `main` branch had full support for the open-source GLM-4.5 series before the models were
+released.
+
 ## Acknowledgement
 
-vLLM team members who contributed to this effort are: Simon Mo, Kaichao You.
+We would like to thank the vLLM team members who contributed to this effort are: Simon Mo, Kaichao You.
