perf: remodel TTFT #610

Draft
Harrilee wants to merge 1 commit into main from harrli/ttft_3x

@Harrilee
Contributor

Remodel TTFT by modeling the TRTLLM overlap mode: TTFT = 3x the mixed-step latency when batch size (bs) > 4.
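
For reviewers, the rule described above can be sketched roughly as follows. This is an illustration only, not the code in this PR: the function name `estimate_ttft_ms`, its parameters, and in particular the behavior for bs <= 4 (assumed here to be a single mixed step) are assumptions, since the description only states the bs > 4 case.

```python
def estimate_ttft_ms(
    mixed_step_latency_ms: float,
    batch_size: int,
    overlap_threshold: int = 4,  # from the description: rule applies when bs > 4
    overlap_steps: int = 3,      # from the description: TTFT = 3x mixed step latency
) -> float:
    """Hypothetical sketch of the TTFT rule stated in the PR description."""
    if mixed_step_latency_ms < 0:
        raise ValueError("latency must be non-negative")
    if batch_size > overlap_threshold:
        # Overlap mode: the first token is modeled as arriving after 3 mixed steps.
        return overlap_steps * mixed_step_latency_ms
    # ASSUMPTION: the description does not say what happens for bs <= 4;
    # this sketch falls back to a single mixed step.
    return mixed_step_latency_ms
```

For example, with a 10 ms mixed step this sketch returns 30 ms at bs = 8 and 10 ms at bs = 4.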

@Harrilee
Contributor Author

Harrilee commented Mar 18, 2026

Preliminary Results

Commit: 46a3f23  (n=42 datapoints) [Baseline]
  TTFT APE  — mean=34.8%  median=29.7%  p90=68.8%  max=92.2%
  TPOT APE  — mean=6.6%  median=5.8%  p90=12.5%  max=17.9%

Commit: 5aefb74  (n=42 datapoints) [After ZigZag fix]
  TTFT APE  — mean=34.2%  median=29.7%  p90=58.5%  max=89.6%
  TPOT APE  — mean=6.6%  median=5.8%  p90=12.5%  max=17.9%

Commit: 923caba  (n=42 datapoints) [Current PR]
  TTFT APE  — mean=37.5%  median=35.2%  p90=61.6%  max=89.9%
  TPOT APE  — mean=6.9%  median=6.2%  p90=12.7%  max=17.9%

Will investigate the gaps at higher batch sizes.
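
For anyone reproducing the summary lines above, a minimal sketch of how the APE (absolute percentage error) statistics could be computed is shown below. The helper names `percentile` and `ape_summary` are hypothetical, and the p90 method (linear interpolation between ranks) is an assumption; the script that produced the numbers in this comment may use a different percentile definition.

```python
import statistics


def percentile(sorted_vals, q):
    """q-th percentile (0-100) of an ascending list, using linear interpolation.

    ASSUMPTION: linear interpolation between closest ranks; the original
    evaluation script may define p90 differently.
    """
    if not sorted_vals:
        raise ValueError("need at least one value")
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (pos - lo)


def ape_summary(predicted, measured):
    """Return mean/median/p90/max of |predicted - measured| / |measured|, in percent."""
    if len(predicted) != len(measured):
        raise ValueError("predicted and measured must have the same length")
    apes = sorted(abs(p - m) / abs(m) * 100.0 for p, m in zip(predicted, measured))
    return {
        "mean": statistics.mean(apes),
        "median": statistics.median(apes),
        "p90": percentile(apes, 90),
        "max": apes[-1],
    }
```

Calling `ape_summary(predicted_ttft, measured_ttft)` on the 42 datapoints (both argument names are placeholders) would yield one row of the kind reported per commit.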

@Harrilee Harrilee changed the title from "enhance: remodel TTFT" to "perf+: remodel TTFT" Mar 18, 2026
@Harrilee Harrilee changed the title from "perf+: remodel TTFT" to "perf: remodel TTFT" Mar 18, 2026
@github-actions github-actions bot added the perf label Mar 18, 2026