
05-Chunked-Prefills Feedback & Post-Lesson Review Questions #1

@cr7258


Post-lesson review questions

  • How do chunked prefills affect TTFT (time to first token) and TBT (time between tokens)?
  • Why limit the chunk size in each scheduling round?
  • How does the scheduling logic of stall-free scheduling work?
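
As a starting point for the last two questions, here is a minimal sketch of one scheduling round under a per-round token budget. It is not taken from the course material or from any particular engine: `Request`, `schedule_step`, and `token_budget` are hypothetical names, and the ordering (ongoing decodes first, then partially prefilled requests, then newly admitted requests) is an assumption about how a stall-free scheduler is commonly described.

```python
from dataclasses import dataclass


@dataclass
class Request:
    """Hypothetical request record, for illustration only."""
    rid: str
    prompt_len: int      # total number of prompt tokens
    prefilled: int = 0   # prompt tokens already processed

    @property
    def in_decode(self) -> bool:
        # A request decodes once its whole prompt has been prefilled.
        return self.prefilled >= self.prompt_len


def schedule_step(running, waiting, token_budget):
    """Build one batch as a list of (rid, num_tokens) pairs.

    Assumed order: decodes, then partial prefills, then new requests.
    Mutates `running`, `waiting`, and each request's `prefilled` count.
    """
    batch = []
    budget = token_budget

    # 1. Every ongoing decode gets one token, so decodes are never stalled.
    for req in running:
        if req.in_decode and budget > 0:
            batch.append((req.rid, 1))
            budget -= 1

    # 2. Continue requests whose prefill is only partly done.
    for req in running:
        if not req.in_decode and budget > 0:
            chunk = min(budget, req.prompt_len - req.prefilled)
            req.prefilled += chunk
            batch.append((req.rid, chunk))
            budget -= chunk

    # 3. Admit new requests with whatever budget is left.
    while waiting and budget > 0:
        req = waiting.pop(0)
        chunk = min(budget, req.prompt_len)
        req.prefilled += chunk
        running.append(req)
        batch.append((req.rid, chunk))
        budget -= chunk

    return batch
```

With a budget of 32 tokens, one decoding request, and one request that still has 60 prompt tokens to prefill, the first round gives the decode 1 token and the prefill a 31-token chunk. Capping the chunk this way bounds how much work each round does, which is the lever that trades prefill speed (TTFT) against decode smoothness (TBT).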
