-
Notifications
You must be signed in to change notification settings - Fork 2.6k
Add a JSONL implementation with benchmark and findings #2868
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
🦋 Changeset detectedLatest commit: fc9d4f5 The changes in this PR will be included in the next version bump. This PR includes changesets to release 1 package
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
|
The changes in this pull request are cohesive and related to implementing and benchmarking a JSONL format for task message storage. Therefore, it does not need to be split into smaller pull requests. The benchmark results, script, and storage utilities are all part of this implementation and should remain together. |
Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>
|
can't wait to see this merged 💪🏻 |
|
Do you think jsonl is less likely to get corrupted if the process is force closed or dies? |
|
Yes - that's definitely a benefit. |
| import { getTaskDirectoryPath } from "../../shared/storagePathManager" | ||
|
|
||
| const taskSizeCache = new NodeCache({ stdTTL: 5 * 60, checkperiod: 30 }) | ||
| const taskSizeCache = new NodeCache({ stdTTL: 30, checkperiod: 5 * 60 }) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I swapped these original; the TTL should be 30s and the check period doesn't really matter.
|
Any blockers here? |
Current blocker is a migration function to convert the old format to the new format, similar to this: https://github.com/RooVetGit/Roo-Code/blob/main/src/core/task-persistence/apiMessages.ts#L26 |
|
stale |
Context
TODO:
I was surprised at how well the pure
JSONapproach works for huge conversation histories. In short, there are definitely real wins withJSONL, but I'm not sure it matters in practice. I also triedYAMLbut theJSONparser is way faster natively.Here's what Roo Code came up with:
Recommendation
Strongly recommend adopting the JSONL implementation for task message storage for the following reasons:
Superior Performance: Significantly faster, especially for sequential operations that mirror real-world usage patterns (6.56x speedup)
Better Scaling: Performance remains consistent regardless of conversation size
Lower Memory Footprint: Only needs to process the new message, not the entire conversation history
Append-Optimized: Perfectly suited for chat applications where new messages are frequently added
Streaming Compatibility: Easier to implement streaming reads for large conversation histories
The performance advantage of JSONL becomes increasingly significant as conversations grow larger, making it the clear choice for a chat-based application like Roo Code.
Implementation
Screenshots
How to Test
Get in Touch
Important
Add JSONL implementation for task message storage with improved performance and adjust cache settings.
readTaskMessages,writeTaskMessages, andappendTaskMessagefunctions intaskMessages.jsonl.tsfor handling task messages in JSONL format.readlinefor efficient line-by-line reading andfsfor file operations.taskSizeCacheTTL to 30 seconds and check period to 5 minutes intaskMetadata.ts.This description was created by
for fc9d4f5. You can customize this summary. It will automatically update as commits are pushed.