[responsesAPI] get reasoning token metrics for simpleContext #31839
base: main
Conversation
Signed-off-by: Andrew Xia <[email protected]>
Code Review
This pull request adds the ability to compute reasoning token metrics within SimpleContext. The implementation introduces a new _compute_reasoning_tokens method that is triggered when the final output is accessed, and the changes come with a good set of tests covering various scenarios. My review focuses on making the new logic more robust, specifically around exception handling and the correctness of the calculated metric.
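For orientation, the new logic appears to follow a compute-once pattern roughly like the sketch below, reconstructed from the hunks quoted in the inline comments that follow. The `parser` attribute, the early-return guard, and the logging in the `except` branch are assumptions for illustration, not code from the PR:

```python
import logging

logger = logging.getLogger(__name__)


class SimpleContext:
    # Only the fields relevant to the sketch are shown.
    def __init__(self) -> None:
        self._accumulated_token_ids: list[int] = []
        self.num_reasoning_tokens = 0
        self._reasoning_tokens_computed = False
        self.parser = None  # assumed handle to a reasoning parser

    def _compute_reasoning_tokens(self) -> None:
        # Triggered when the final output is accessed; computes the metric once.
        if self._reasoning_tokens_computed:
            return
        try:
            # Assumed call: the parser returns the subset of accumulated ids
            # that belong to the final content (everything else is reasoning).
            content_token_ids = self.parser.extract_content_ids(
                self._accumulated_token_ids
            )
            self.num_reasoning_tokens = len(self._accumulated_token_ids) - len(
                content_token_ids
            )
            self._reasoning_tokens_computed = True
        except Exception:
            # The PR appears to swallow parsing failures here; see the
            # second inline comment below.
            logger.debug("Failed to compute reasoning tokens", exc_info=True)
```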
```python
self.num_reasoning_tokens = len(self._accumulated_token_ids) - len(
    content_token_ids
)
```
The calculation of num_reasoning_tokens can go negative if len(content_token_ids) exceeds len(self._accumulated_token_ids). While extract_content_ids should ideally return a subset of the input tokens, a bug in a parser implementation could violate that assumption. To make this more robust, I suggest clamping the value at zero and logging the unexpected result, which also aids debugging.
Suggested change:

```diff
-self.num_reasoning_tokens = len(self._accumulated_token_ids) - len(
-    content_token_ids
-)
+num_reasoning_tokens = len(self._accumulated_token_ids) - len(content_token_ids)
+if num_reasoning_tokens < 0:
+    logger.warning("Calculated negative reasoning tokens (%d). Clamping to 0.", num_reasoning_tokens)
+    num_reasoning_tokens = 0
+self.num_reasoning_tokens = num_reasoning_tokens
```
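Clamping keeps the reported metric valid for anything consuming it downstream, while the warning still surfaces the misbehaving parser in the logs.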
```python
    self._reasoning_tokens_computed = True
except Exception:
```
Using a broad except Exception: can mask underlying issues such as TypeError or AttributeError, which usually indicate programming errors rather than parsing failures, and that makes debugging harder. It is better to catch the specific exceptions expected during parsing. If the parser can raise many different exceptions, consider defining a custom ReasoningParsingError and wrapping the originals in it; this makes the intent explicit and avoids catching unrelated errors. A sketch of that pattern follows.
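A minimal sketch of the suggested wrapping pattern; the helper name and the specific exception types expected from the parser are assumptions for illustration:

```python
import logging

logger = logging.getLogger(__name__)


class ReasoningParsingError(Exception):
    """Raised when content-token extraction fails; wraps the parser's error."""


def extract_content_ids_safely(parser, token_ids: list[int]) -> list[int]:
    try:
        return parser.extract_content_ids(token_ids)
    except (ValueError, IndexError, KeyError) as e:
        # Only failure modes expected from parsing are wrapped; genuine
        # programming errors (TypeError, AttributeError, ...) propagate.
        raise ReasoningParsingError("failed to extract content token ids") from e
```

The call site in _compute_reasoning_tokens can then catch only ReasoningParsingError, so unrelated bugs surface immediately instead of being silently swallowed.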
Purpose
Test Plan
Test Result
Essential Elements of an Effective PR Description Checklist
supported_models.md and examples for a new model.