feat: Implement hybrid token-based conversation history system #22
Merged
Commits (15):
- 69b220e feat: Implement hybrid token-based conversation history system
- 475da21 feat: during load only verify max token limit and filter old records …
- f65f622 feat: refactor ai code
- 9c2f5bb feat: refactor ai code
- 9ad9634 feat: fix error
- a5927f4 feat: cleanup
- 24f9cc7 feat: fix response token
- 3b54c38 feat:
- 3160bd1 feat: fix build
- c6bac7b feat: try to fix build
- 91cb5ec feat: try to fix build
- c087999 feat: try to fix build
- 1fd2a73 feat: try to fix build
- 07b1b50 feat: try to fix build
- 2772d2b feat: fix build
New file (+104 lines):

```python
"""
Token calculation utilities for conversation history.

This module provides utilities for calculating token counts from text
using the approximation that 1 token ≈ 4 characters of English text.
"""

from mcp_as_a_judge.constants import MAX_CONTEXT_TOKENS


def calculate_tokens(text: str) -> int:
    """
    Calculate approximate token count from text.

    Uses the approximation that 1 token ≈ 4 characters of English text.
    This is a simple heuristic that works reasonably well for most text.

    Args:
        text: Input text to calculate tokens for

    Returns:
        Approximate token count (rounded up to nearest integer)
    """
    if not text:
        return 0

    # Use ceiling division to round up: (len(text) + 3) // 4
    # This ensures we don't underestimate the token count
    return (len(text) + 3) // 4


def calculate_record_tokens(input_text: str, output_text: str) -> int:
    """
    Calculate the total token count for input and output text.

    Combines the token counts of the input and output text.

    Args:
        input_text: Input text string
        output_text: Output text string

    Returns:
        Combined token count for both input and output
    """
    return calculate_tokens(input_text) + calculate_tokens(output_text)


def calculate_total_tokens(records: list) -> int:
    """
    Calculate the total token count for a list of conversation records.

    Args:
        records: List of ConversationRecord objects with a tokens field

    Returns:
        Sum of all token counts in the records
    """
    return sum(record.tokens for record in records if hasattr(record, "tokens"))


def filter_records_by_token_limit(records: list, current_prompt: str = "") -> list:
    """
    Filter conversation records to stay within the token limit.

    Removes the oldest records (FIFO) when the token limit is exceeded,
    keeping as many recent records as possible.

    Args:
        records: List of ConversationRecord objects (assumed to be in
            reverse chronological order, newest first)
        current_prompt: Current prompt that will be sent to the LLM
            (included in the token calculation)

    Returns:
        Filtered list of records that fits within the limits
    """
    if not records:
        return []

    # Calculate current prompt tokens
    current_prompt_tokens = (
        calculate_record_tokens(current_prompt, "") if current_prompt else 0
    )

    # Calculate total tokens including the current prompt
    history_tokens = calculate_total_tokens(records)
    total_tokens = history_tokens + current_prompt_tokens

    # If total tokens (history + current prompt) are within the limit, return all records
    if total_tokens <= MAX_CONTEXT_TOKENS:
        return records

    # Remove the oldest records (from the end, since records are in reverse
    # chronological order) until history + current prompt fit within the limit
    filtered_records = records.copy()
    current_history_tokens = history_tokens

    while (current_history_tokens + current_prompt_tokens) > MAX_CONTEXT_TOKENS and len(
        filtered_records
    ) > 1:
        # Remove the oldest record (last in the list)
        removed_record = filtered_records.pop()
        current_history_tokens -= getattr(removed_record, "tokens", 0)

    return filtered_records
```
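For reference, a minimal usage sketch of these helpers. The `ConversationRecord` stand-in and the import path `mcp_as_a_judge.token_utils` are assumptions for illustration; the diff viewer did not capture the new file's name, and the real record model lives elsewhere in the package:

```python
from dataclasses import dataclass

# Assumed module path for the file shown in the diff above.
from mcp_as_a_judge.token_utils import (
    calculate_record_tokens,
    filter_records_by_token_limit,
)


@dataclass
class ConversationRecord:
    """Minimal stand-in exposing the tokens field the filter expects."""

    input: str
    output: str
    tokens: int


def make_record(input_text: str, output_text: str) -> ConversationRecord:
    # Tokens are computed once at write time and stored on the record,
    # so filtering later only sums precomputed integers.
    # e.g. a 10-character string costs (10 + 3) // 4 = 3 tokens.
    tokens = calculate_record_tokens(input_text, output_text)
    return ConversationRecord(input_text, output_text, tokens)


# Newest-first, as filter_records_by_token_limit assumes.
history = [
    make_record("latest question", "latest answer"),
    make_record("older question", "older answer"),
]

# Oldest records are popped from the tail until history plus the
# upcoming prompt fit under MAX_CONTEXT_TOKENS.
trimmed = filter_records_by_token_limit(history, current_prompt="next user prompt")
```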
Review comment:
This character-count approximation is less recommended.
I suggest: https://docs.litellm.ai/docs/completion/token_usage#3-token_counter
If the client provides `LLM_API_KEY`, we can get the model name. Else, if it uses sampling, it's a bit more tricky:
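For illustration, a minimal sketch of the suggested `token_counter` approach from the linked litellm docs. The model name here is a hypothetical hard-coded value; in the scenario the reviewer describes, it would be resolved from the client's `LLM_API_KEY` configuration:

```python
from litellm import token_counter

# Hypothetical model name; with a client-provided LLM_API_KEY the real
# model name would be known and its own tokenizer used.
model = "gpt-4o-mini"

messages = [{"role": "user", "content": "Hey, how's it going?"}]

# token_counter selects the tokenizer matching the model when it can,
# falling back to a default tokenizer for unknown models.
prompt_tokens = token_counter(model=model, messages=messages)
print(prompt_tokens)
```

A model-aware count like this is generally more accurate than the 4-characters-per-token heuristic, at the cost of depending on litellm's tokenizer metadata and knowing the model name up front.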