Record EPP NormalizedTimePerOutputToken metric on streaming mode #1706
[APPROVALNOTIFIER] This PR is NOT APPROVED.
This pull-request has been approved by: dharaneeshvrd
/cc @delavet
@dharaneeshvrd: GitHub didn't allow me to request PR reviews from the following users: delavet. Note that only kubernetes-sigs members and repo collaborators can review this PR, and authors cannot review their own PRs.
Hey @dharaneeshvrd! Thanks for the PR. Do you mind adding this metric to our hermetic tests to validate the behavior?
462c169 to e652c5a
e652c5a to 46c873a
Update e2e/epp/e2e_test & integration/epp/hermetic_test to validate inference_objective_normalized_time_per_output_token_seconds metric
Signed-off-by: Dharaneeshwaran Ravichandran <[email protected]>
46c873a to 6c7ce3e
@kfswain Updated the hermetic test. PTAL!
What type of PR is this?
/kind bug
/kind failing-test
What this PR does / why we need it:
Record the NormalizedTimePerOutputToken metric in EPP in streaming mode. The e2e EPP test expects this metric.
Which issue(s) this PR fixes:
Fixes #939
Does this PR introduce a user-facing change?: