Conversation

liyonghua0910
Collaborator

Requirement description

The Completions endpoint should accept token ids passed directly in the prompt field as model input, aligning with vLLM. The prompt_token_ids field introduced in FD v2.0.4 remains valid; for now its priority is prompt_token_ids > prompt. In addition, prompt_token_ids previously supported only single-request inference; this PR adds support for batch inference.

Single-request inference:

curl -X POST "http://0.0.0.0:8185/v1/completions" \
    -H "Content-Type: application/json" \
    -d '{
        "prompt": [123, 456, 789],
        "max_tokens": 10
    }'

curl -X POST "http://0.0.0.0:8185/v1/completions" \
    -H "Content-Type: application/json" \
    -d '{
        "prompt": "",
        "prompt_token_ids": [123, 456, 789],
        "max_tokens": 10
    }'

Batch inference:

curl -X POST "http://0.0.0.0:8185/v1/completions" \
    -H "Content-Type: application/json" \
    -d '{
        "prompt": [[123, 456, 789], [987, 654, 321]],
        "max_tokens": 10
    }'

curl -X POST "http://0.0.0.0:8185/v1/completions" \
    -H "Content-Type: application/json" \
    -d '{
        "prompt": "",
        "prompt_token_ids": [[123, 456, 789], [987, 654, 321]],
        "max_tokens": 10
    }'
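The request handling implied by the four payloads above can be sketched as a small input resolver: it normalizes prompt / prompt_token_ids into a batch of single inputs, with prompt_token_ids taking priority. This is a minimal sketch only; the helper name resolve_prompt_inputs and the plain-argument shape are assumptions for illustration, not the PR's actual code.

```python
from typing import List, Optional, Union

Prompt = Union[str, List[int], List[str], List[List[int]]]


def resolve_prompt_inputs(
    prompt: Optional[Prompt],
    prompt_token_ids: Optional[Union[List[int], List[List[int]]]],
) -> List[Union[str, List[int]]]:
    """Normalize a completions request into a batch of single inputs.

    prompt_token_ids, when present, takes priority over prompt,
    matching the priority stated above (prompt_token_ids > prompt).
    """
    if prompt_token_ids is not None:
        # Single request: a flat list of ints; batch: a list of lists.
        if prompt_token_ids and isinstance(prompt_token_ids[0], int):
            return [prompt_token_ids]
        return list(prompt_token_ids)

    if prompt is None:
        raise ValueError("either prompt or prompt_token_ids is required")

    if isinstance(prompt, str):
        return [prompt]
    if prompt and isinstance(prompt[0], int):
        # prompt carries token ids directly, vLLM-style.
        return [prompt]
    return list(prompt)  # batch of strings or of token-id lists


# The curl payloads above all resolve to one- or two-item batches:
print(resolve_prompt_inputs([123, 456, 789], None))           # [[123, 456, 789]]
print(resolve_prompt_inputs("", [123, 456, 789]))             # [[123, 456, 789]]
print(resolve_prompt_inputs([[123, 456], [987, 654]], None))  # [[123, 456], [987, 654]]
```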

Main changes

  • fastdeploy/entrypoints/openai/serving_completion.py: adds handling for batch inference via prompt_token_ids, which takes priority over the prompt field
  • fastdeploy/input/ernie_processor.py: refactors the handling of the prompt field. If a prompt is a list, it is written directly to request.prompt_token_ids; if it is a str, it is tokenized first and the resulting ids are written to request.prompt_token_ids. Also updates the process_request and process_request_dict methods
  • fastdeploy/input/text_processor.py: same as above
  • test/ci_use/EB_Lite/test_EB_Lite_serving.py: adds test cases for passing token ids directly in the prompt field and for batch inference via the prompt/prompt_token_ids fields
  • test/ci_use/Qwen2-7B-Instruct_serving/test_Qwen2-7B-Instruct_serving.py: same as above
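The processor refactor described above (ernie_processor.py and text_processor.py) can be sketched as follows. The Request dataclass, the process_prompt helper, and the toy tokenizer are stand-ins assumed for illustration; they are not FastDeploy's actual classes or methods.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union


@dataclass
class Request:
    # Minimal stand-in for the processor's request object.
    prompt: Optional[Union[str, List[int]]] = None
    prompt_token_ids: List[int] = field(default_factory=list)


def process_prompt(request: Request, tokenize) -> Request:
    """Sketch of the refactored prompt handling:

    - if the prompt is already a list of token ids, write it to
      request.prompt_token_ids as-is;
    - if it is a str, tokenize it first, then write the ids.
    """
    if isinstance(request.prompt, list):
        request.prompt_token_ids = request.prompt
    elif isinstance(request.prompt, str):
        request.prompt_token_ids = tokenize(request.prompt)
    return request


# Toy tokenizer standing in for the real one (assumption for the demo).
toy_tokenize = lambda text: [ord(c) for c in text]

req = process_prompt(Request(prompt=[123, 456, 789]), toy_tokenize)
print(req.prompt_token_ids)  # [123, 456, 789]

req = process_prompt(Request(prompt="ab"), toy_tokenize)
print(req.prompt_token_ids)  # [97, 98]
```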


paddle-bot bot commented Aug 11, 2025

Thanks for your contribution!

paddle-bot added the contributor (External developers) label on Aug 11, 2025

codecov-commenter commented Aug 28, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
⚠️ Please upload report for BASE (develop@808b548). Learn more about missing BASE report.

Additional details and impacted files
@@            Coverage Diff             @@
##             develop    #3311   +/-   ##
==========================================
  Coverage           ?   20.54%           
==========================================
  Files              ?        4           
  Lines              ?       73           
  Branches           ?       19           
==========================================
  Hits               ?       15           
  Misses             ?       54           
  Partials           ?        4           
Flag Coverage Δ
diff 20.54% <ø> (?)

Flags with carried forward coverage won't be shown.


@liyonghua0910 liyonghua0910 merged commit 8829724 into PaddlePaddle:develop Aug 29, 2025
43 of 49 checks passed