
Commit ae24353

fix parser

Merge commit (2 parents: bd192b2 + 6265f43)

175 files changed: 11461 additions, 1082 deletions


.github/workflows/_base_test.yml

Lines changed: 2 additions & 1 deletion
```diff
@@ -143,7 +143,8 @@ jobs:
           -v "${CACHE_DIR}/ConfigDir:/root/.config" \
           -e TZ="Asia/Shanghai" \
           --gpus '"device='"${DEVICES}"'"' ${docker_image} /bin/bash -xc '
-          python -m pip install --pre paddlepaddle-gpu -i https://www.paddlepaddle.org.cn/packages/nightly/cu126/
+          # python -m pip install --pre paddlepaddle-gpu -i https://www.paddlepaddle.org.cn/packages/nightly/cu126/
+          python -m pip install paddlepaddle-gpu==3.3.0.dev20250917 -i https://www.paddlepaddle.org.cn/packages/nightly/cu126/

           pip config set global.index-url https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple
```
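Note: the test workflow now pins an exact nightly wheel instead of taking whatever `--pre` build is current. A minimal sketch for confirming that the pinned build is the one actually imported (the version string comes from the pin above; both calls are standard Paddle APIs):

```python
# Sketch: verify the pinned PaddlePaddle nightly is what actually got installed.
import paddle

print(paddle.__version__)  # expected to match the pinned 3.3.0.dev20250917
paddle.utils.run_check()   # sanity-check that the CUDA build can run a kernel
```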

.github/workflows/_build_linux.yml

Lines changed: 7 additions & 1 deletion
```diff
@@ -106,7 +106,12 @@ jobs:
         CARD_ID=$(echo "${runner_name}" | awk -F'-' '{print $NF}')
         gpu_id=$(echo "$CARD_ID" | fold -w1 | paste -sd,)

-        CACHE_DIR="${CACHE_DIR:-$(dirname "$(dirname "${{ github.workspace }}")")}"
+        IFS='/' read -ra parts <<< "${GITHUB_WORKSPACE}"
+        len=${#parts[@]}
+        CCACHE_DEFAULT_DIR="/$(IFS=/; echo "${parts[*]:1:$((len-5))}")"
+        echo "$CCACHE_DEFAULT_DIR"
+
+        CACHE_DIR="${CACHE_DIR:-$CCACHE_DEFAULT_DIR}"
         echo "CACHE_DIR is set to ${CACHE_DIR}"
         if [ ! -f "${CACHE_DIR}/gitconfig" ]; then
           touch "${CACHE_DIR}/gitconfig"
@@ -127,6 +132,7 @@ jobs:
           -e "PADDLEVERSION=${PADDLEVERSION}" \
           -e "PADDLE_WHL_URL=${PADDLE_WHL_URL}" \
           -e "BRANCH_REF=${BRANCH_REF}" \
+          -e "CCACHE_MAXSIZE=50G" \
           --gpus "\"device=${gpu_id}\"" ${docker_image} /bin/bash -c '
             if [[ -n "${FD_VERSION}" ]]; then
               export FASTDEPLOY_VERSION=${FD_VERSION}
```
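Note: the new default derives the cache root by dropping the last four path components of `GITHUB_WORKSPACE`, where the old expression only climbed two levels via nested `dirname` calls. A minimal Python sketch of the same arithmetic, using a hypothetical runner path:

```python
# Sketch of the bash above: CCACHE_DEFAULT_DIR is GITHUB_WORKSPACE with its
# last four path components stripped. The workspace path here is hypothetical.
from pathlib import PurePosixPath

workspace = PurePosixPath("/home/runner/actions-runner/_work/FastDeploy/FastDeploy")
print(workspace.parents[3])  # -> /home/runner, the shared cache root
```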

README.md

Lines changed: 2 additions & 1 deletion
```diff
@@ -43,7 +43,7 @@ English | [简体中文](README_CN.md)
 - 🤝 **OpenAI API Server and vLLM Compatible**: One-command deployment with [vLLM](https://github.com/vllm-project/vllm/) interface compatibility.
 - 🧮 **Comprehensive Quantization Format Support**: W8A16, W8A8, W4A16, W4A8, W2A16, FP8, and more.
 -**Advanced Acceleration Techniques**: Speculative decoding, Multi-Token Prediction (MTP) and Chunked Prefill.
-- 🖥️ **Multi-Hardware Support**: NVIDIA GPU, Kunlunxin XPU, Hygon DCU, Ascend NPU, Iluvatar GPU, Enflame GCU, MetaX GPU etc.
+- 🖥️ **Multi-Hardware Support**: NVIDIA GPU, Kunlunxin XPU, Hygon DCU, Ascend NPU, Iluvatar GPU, Enflame GCU, MetaX GPU, Intel Gaudi etc.

 ## Requirements

@@ -60,6 +60,7 @@ FastDeploy supports inference deployment on **NVIDIA GPUs**, **Kunlunxin XPUs**,
 - [Enflame GCU](./docs/get_started/installation/Enflame_gcu.md)
 - [Hygon DCU](./docs/get_started/installation/hygon_dcu.md)
 - [MetaX GPU](./docs/get_started/installation/metax_gpu.md)
+- [Intel Gaudi](./docs/get_started/installation/intel_gaudi.md)

 **Note:** We are actively working on expanding hardware support. Additional hardware platforms including Ascend NPU are currently under development and testing. Stay tuned for updates!
```

README_CN.md

Lines changed: 2 additions & 1 deletion
```diff
@@ -41,7 +41,7 @@
 - 🤝 **OpenAI API服务与vLLM兼容**：单命令部署，兼容[vLLM](https://github.com/vllm-project/vllm/)接口
 - 🧮 **全量化格式支持**：W8A16、W8A8、W4A16、W4A8、W2A16、FP8等
 -**高级加速技术**：推测解码、多令牌预测(MTP)及分块预填充
-- 🖥️ **多硬件支持**：NVIDIA GPU、昆仑芯XPU、海光DCU、昇腾NPU、天数智芯GPU、燧原GCU、沐曦GPU等
+- 🖥️ **多硬件支持**：NVIDIA GPU、昆仑芯XPU、海光DCU、昇腾NPU、天数智芯GPU、燧原GCU、沐曦GPU、英特尔Gaudi等

 ## 要求

@@ -58,6 +58,7 @@ FastDeploy 支持在**英伟达(NVIDIA)GPU**、**昆仑芯(Kunlunxin)XPU
 - [燧原 S60](./docs/zh/get_started/installation/Enflame_gcu.md)
 - [海光 DCU](./docs/zh/get_started/installation/hygon_dcu.md)
 - [沐曦 GPU](./docs/zh/get_started/installation/metax_gpu.md)
+- [英特尔 Gaudi](./docs/zh/get_started/installation/intel_gaudi.md)

 **注意：** 我们正在积极拓展硬件支持范围。目前，包括昇腾(Ascend)NPU 等其他硬件平台正在开发测试中。敬请关注更新！
```

Lines changed: 5 additions & 0 deletions
```diff
@@ -0,0 +1,5 @@
+max_model_len: 32768
+max_num_seqs: 128
+tensor_parallel_size: 4
+use_cudagraph: True
+load_choices: "default_v1"
```
Lines changed: 6 additions & 0 deletions
```diff
@@ -0,0 +1,6 @@
+max_model_len: 32768
+max_num_seqs: 128
+tensor_parallel_size: 4
+use_cudagraph: True
+load_choices: "default_v1"
+quantization: wfp8afp8
```
Lines changed: 8 additions & 0 deletions
```diff
@@ -0,0 +1,8 @@
+top_p: 0.95
+temperature: 0.6
+metadata:
+  min_tokens: 1
+max_tokens: 12288
+repetition_penalty: 1.0
+frequency_penalty: 0
+presence_penalty: 0
```
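Note: the three new YAML files read as deployment configs, two with engine settings and one with sampling defaults; their paths are not shown in this commit view. A minimal sketch for sanity-checking such a config before use (the file name is a hypothetical stand-in):

```python
# Sketch: parse one of the new engine configs and check the key types.
# "benchmark_config.yaml" is a hypothetical name; the real path is not
# visible in this commit view.
import yaml

with open("benchmark_config.yaml") as f:
    cfg = yaml.safe_load(f)

assert isinstance(cfg["max_model_len"], int)
assert isinstance(cfg["tensor_parallel_size"], int)
assert cfg["use_cudagraph"] is True  # YAML "True" loads as a boolean
print(cfg)
```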

build.sh

Lines changed: 9 additions & 1 deletion
```diff
@@ -128,6 +128,12 @@ function copy_ops(){
         echo -e "MACA ops have been copy to fastdeploy"
         return
     fi
+    is_intel_hpu=`$python -c "import paddle; print(paddle.is_compiled_with_custom_device('intel_hpu'))"`
+    if [ "$is_intel_hpu" = "True" ]; then
+        DEVICE_TYPE="intel-hpu"
+        echo -e "intel_hpu ops have been copy to fastdeploy"
+        return
+    fi

     DEVICE_TYPE="cpu"
     cd ../../../../
@@ -159,7 +165,9 @@ function build_and_install_ops() {
     else
         FD_BUILDING_ARCS=${FD_BUILDING_ARCS} ${python} setup_ops.py install --install-lib ${OPS_TMP_DIR}
     fi
-    find ${OPS_TMP_DIR} -type f -name "*.o" -exec rm -f {} \;
+    if [ -d "${OPS_TMP_DIR}" ]; then
+        find ${OPS_TMP_DIR} -type f -name "*.o" -exec rm -f {} \;
+    fi
 else
     echo "Error: Invalid parameter '$FD_CPU_USE_BF16'. Please use true or false."
     exit 1
```
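Note: `copy_ops()` now probes the Paddle build for the `intel_hpu` custom device the same way the earlier backend branches do. The probe in isolation (`paddle.is_compiled_with_custom_device` is the Paddle API the script shells out to):

```python
# Sketch of the probe build.sh runs to pick the ops target.
import paddle

if paddle.is_compiled_with_custom_device("intel_hpu"):
    device_type = "intel-hpu"  # matches DEVICE_TYPE in build.sh
else:
    device_type = "cpu"        # build.sh falls through to the CPU path
print(device_type)
```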

custom_ops/gpu_ops/append_attn/encoder_write_cache_with_rope_impl.cuh

Lines changed: 2 additions & 1 deletion
```diff
@@ -1004,7 +1004,8 @@ __global__ void cache_kernel(
   const uint32_t qkv_bias = bias % hidden_size;
   const uint32_t hi = qkv_bias / head_size;
   const uint32_t h_bias = qkv_bias % head_size;
-  const uint32_t ori_bi = batch_id_per_token[token_idx];
+  const int32_t ori_bi = batch_id_per_token[token_idx];
+  if (ori_bi == -1) continue;  // skip batch_id_per_token[token_idx]=-1
   if (seq_lens[ori_bi] == 0) continue;
   const uint32_t ori_seq_id = (token_idx - cu_seqlens_q[ori_bi]) + seq_lens_decoder[ori_bi];
```

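Note: the type change is the substance of this fix. The old code had no guard, so a -1 padding marker stored in a `uint32_t` index wrapped to `UINT32_MAX` and indexed `seq_lens` far out of range; reading it as `int32_t` keeps the sentinel negative and lets the new check skip padded tokens. A quick illustration of the wrap:

```python
# Sketch: why the sentinel needs a signed type. ctypes mimics the C casts.
import ctypes

print(ctypes.c_uint32(-1).value)  # 4294967295, a wildly out-of-range index
print(ctypes.c_int32(-1).value)   # -1, so the `ori_bi == -1` guard works
```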
File renamed without changes.
