Add vllm:request_max_num_generation_tokens metric #243

@mayabar

Description

vllm:request_max_num_generation_tokens - the maximum number of tokens a request may generate. It is the minimum of (max-model-len - prompt length) and max_tokens when max_tokens is defined; otherwise it is just max-model-len - prompt length.
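
As a rough illustration of the intended computation (not the actual implementation), a minimal sketch using prometheus_client is shown below; the histogram labels, buckets, and the `max_num_generation_tokens` helper are assumptions for the example, not existing project code.

```python
# Sketch only: compute and record the proposed
# vllm:request_max_num_generation_tokens value for one request.
from typing import Optional

from prometheus_client import Histogram

# Hypothetical histogram; the real metric would follow the project's
# existing label and bucket conventions.
request_max_num_generation_tokens = Histogram(
    "vllm:request_max_num_generation_tokens",
    "Maximum number of tokens the request is allowed to generate.",
    labelnames=["model_name"],
)


def max_num_generation_tokens(
    max_model_len: int,
    prompt_len: int,
    max_tokens: Optional[int] = None,
) -> int:
    """min(max_model_len - prompt_len, max_tokens) when max_tokens is set,
    otherwise max_model_len - prompt_len."""
    budget = max_model_len - prompt_len
    return min(budget, max_tokens) if max_tokens is not None else budget


# Example: a 4096-token context, 512-token prompt, max_tokens=1024 -> 1024.
value = max_num_generation_tokens(max_model_len=4096, prompt_len=512, max_tokens=1024)
request_max_num_generation_tokens.labels(model_name="my-model").observe(value)
```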
