ggerganov
Member

t_token_generation can be zero, which causes a division by zero here:

timings.predicted_per_second = 1e3 / t_token_generation * n_decoded;
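
For illustration, one way to guard this kind of computation is sketched below. This is not the code from the merged commit. The helper name `tokens_per_second` and the choice to report 0 when no time has elapsed are assumptions, as is treating the duration as a floating-point value in milliseconds. Under IEEE 754 arithmetic the unguarded expression yields inf for a zero duration (or NaN when the token count is also zero) rather than trapping, and such a value can then leak into the reported timings.

```cpp
#include <cstdint>

// Hypothetical helper for illustration only; not taken from the PR.
// Computes tokens per second from an elapsed time in milliseconds.
double tokens_per_second(double t_ms, int32_t n_tokens) {
    if (t_ms <= 0.0) {
        // Assumption: report 0 instead of letting inf/NaN propagate.
        return 0.0;
    }
    return 1e3 / t_ms * n_tokens;
}
```

With a helper like this, the line above would become something like `timings.predicted_per_second = tokens_per_second(t_token_generation, n_decoded);`.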

ggerganov merged commit e60f01d into master on Oct 10, 2025
69 checks passed
ggerganov deleted the gg/server-fix-div-by-0 branch on October 10, 2025 19:15
yael-works pushed a commit to yael-works/llama.cpp that referenced this pull request Oct 15, 2025