Implement conversation logging and fine-tuning dataset generation #5430
…n and configuration updates
- Implemented `create-finetuning-data.ts` to parse log files and convert them into a fine-tuning dataset format.
- Added functionality to read from a specified input directory and output the dataset to a JSONL file.
- Introduced interfaces for log entries and Gemini message structures to facilitate data processing.
- Enhanced error handling for malformed log entries and directory reading.
- Updated `tsconfig.json` to include necessary TypeScript configurations for the new script.
Add script to convert conversation logs into valid supervised fine-tuning data for Gemini models. Fixes message ordering for tool calls and ensures proper structure compliance with Google Cloud documentation.
…versationLogger integration
… and add session filtering options
…ore to include logging README
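The conversion described in the commits above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual script: the `LogEntry` shape, the `parseLogLines` and `toGeminiJsonl` names, and the one-example-per-session grouping are all assumptions made for the example. The output shape (a `contents` array of `role`/`parts` messages, one JSON object per line) follows the Gemini supervised-tuning JSONL format. Tool-call ordering, which the commits also mention, is not covered here.

```typescript
// Hypothetical log entry shape; the PR's real interfaces are not shown on this page.
interface LogEntry {
	sessionId: string
	role: "user" | "assistant"
	text: string
}

// One message in a Gemini tuning example.
interface GeminiMessage {
	role: "user" | "model"
	parts: { text: string }[]
}

// Parse JSONL log text, counting malformed lines instead of aborting the whole run.
function parseLogLines(raw: string): { entries: LogEntry[]; skipped: number } {
	const entries: LogEntry[] = []
	let skipped = 0
	for (const line of raw.split("\n")) {
		const trimmed = line.trim()
		if (trimmed === "") continue // blank lines are ignored, not counted
		try {
			const parsed = JSON.parse(trimmed)
			const valid =
				parsed !== null &&
				typeof parsed === "object" &&
				typeof parsed.sessionId === "string" &&
				(parsed.role === "user" || parsed.role === "assistant") &&
				typeof parsed.text === "string"
			if (valid) {
				entries.push({ sessionId: parsed.sessionId, role: parsed.role, text: parsed.text })
			} else {
				skipped++
			}
		} catch {
			skipped++ // not valid JSON
		}
	}
	return { entries, skipped }
}

// Group entries by session and emit one JSONL line per session.
function toGeminiJsonl(entries: LogEntry[]): string {
	const sessions = new Map<string, GeminiMessage[]>()
	for (const entry of entries) {
		const messages = sessions.get(entry.sessionId) ?? []
		messages.push({
			role: entry.role === "assistant" ? "model" : "user",
			parts: [{ text: entry.text }],
		})
		sessions.set(entry.sessionId, messages)
	}
	return Array.from(sessions.values())
		.map((contents) => JSON.stringify({ contents }))
		.join("\n")
}
```

Keeping the parsing and formatting steps as pure functions leaves the directory reading and file writing as a thin wrapper, which makes the malformed-entry handling easy to test on its own.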
    const workspaceRoot = vscode.workspace.workspaceFolders?.[0]?.uri.fsPath
    if (workspaceRoot) {
        const logger = new ConversationLogger(workspaceRoot)
        context.globalState.update("conversationLogger", logger)
Consider not storing a non-serializable ConversationLogger instance in context.globalState (line 96). Global state is meant for simple serializable data. Perhaps store the logger in a disposable or module-level variable instead.
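One way to follow this suggestion is sketched below, assuming a module-level variable plus registration in the extension's `subscriptions` for cleanup. The `ExtensionContextLike` and `Disposable` interfaces are minimal stand-ins for the VS Code types so the snippet is self-contained, and `ConversationLogger` here is a stub, not the PR's class. `initConversationLogger` and `getConversationLogger` are hypothetical helper names.

```typescript
// Minimal stand-ins for VS Code types; in an extension these come from the `vscode` module.
interface Disposable {
	dispose(): void
}
interface ExtensionContextLike {
	subscriptions: Disposable[]
}

// Stub standing in for the PR's ConversationLogger (assumed to be disposable here).
class ConversationLogger implements Disposable {
	public disposed = false
	constructor(public readonly workspaceRoot: string) {}
	dispose(): void {
		this.disposed = true
	}
}

// Module-level reference: lives only in memory, so nothing non-serializable
// is ever handed to context.globalState.
let conversationLogger: ConversationLogger | undefined

function initConversationLogger(context: ExtensionContextLike, workspaceRoot: string | undefined): void {
	if (!workspaceRoot) return // no workspace open, so there is nowhere to log
	conversationLogger = new ConversationLogger(workspaceRoot)
	// Registering the logger lets the host dispose it on deactivation.
	context.subscriptions.push(conversationLogger)
}

function getConversationLogger(): ConversationLogger | undefined {
	return conversationLogger
}
```

Other modules would then call `getConversationLogger()` and handle the `undefined` case rather than reading the instance back out of global state.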
     * @returns The new session ID.
     */
    public startNewSession(): string {
        this.sessionId = this.generateSessionId()
Check failure: Code scanning / CodeQL: Insecure randomness (High), flagged on `Math.random()`.
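A possible way to address this alert is to derive the session ID from a cryptographically secure source instead of `Math.random()`. The sketch below uses `randomUUID` from Node's built-in `crypto` module. The `session-` prefix and the standalone `generateSessionId` function are illustrative assumptions, since the PR's actual ID format is not shown here.

```typescript
import { randomUUID } from "crypto"

// Hypothetical replacement for a Math.random()-based ID generator.
function generateSessionId(): string {
	// randomUUID() returns an RFC 4122 version 4 UUID drawn from a
	// cryptographically secure random number generator.
	return `session-${randomUUID()}`
}
```

Whether session IDs in a local log file are truly security-sensitive is a judgment call, but switching generators is a small change and clears the scanner finding.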
…ipt for fine-tuning
…dow for non-thinking models
Thank you for your contribution. To ensure we can properly review and integrate your work, could you please provide some more context? The scope isn't entirely clear, and we couldn't find an associated issue that this pull request addresses. If this is intended to fix a bug, please feel free to open an issue to discuss the problem and solution so we can figure out if this PR fixes the problem. If this is a new feature, then please submit a detailed feature proposal. This will help us understand the problem you're solving and how your contribution fits into the project. We appreciate you taking the time to contribute to Roo Code!
Introduce a ConversationLogger service for enhanced logging capabilities and add a script to create fine-tuning datasets from conversation logs. Update configurations and error handling to support new features, while ensuring proper logging and dataset formats for Gemini models.
Important

Introduces `ConversationLogger` for logging conversations and updates `create-finetuning-data.ts` to generate fine-tuning datasets for Gemini and OpenAI, with comprehensive tests and configuration updates.

- `ConversationLogger` in `ConversationLogger.ts` to log user messages, AI responses, and tool calls for fine-tuning datasets.
- `.roo-logs` directory.
- `create-finetuning-data.ts` to process logs into Gemini and OpenAI fine-tuning datasets.
- `--gemini` and `--openai` flags for format selection.
- `ConversationLogger.spec.ts` to validate logging and dataset generation.
- `package.json` to include new logging settings and dependencies.

This description was generated automatically for 6a58d28. It will automatically update as commits are pushed.
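To illustrate the format selection mentioned in the summary, here is a rough sketch of how the two flags might map to the two output shapes. The function names, the `Turn` type, the Gemini default, and the choice to treat the flags as mutually exclusive are all assumptions for this example, and the PR's script may behave differently. The OpenAI shape follows the chat fine-tuning JSONL format (a `messages` array of `role`/`content` objects).

```typescript
type DatasetFormat = "gemini" | "openai"

// Hypothetical normalized conversation turn.
interface Turn {
	role: "user" | "assistant"
	text: string
}

// Choose the output format from CLI arguments.
// Assumption: Gemini is the default, and passing both flags is an error.
function selectFormat(args: string[]): DatasetFormat {
	const gemini = args.includes("--gemini")
	const openai = args.includes("--openai")
	if (gemini && openai) {
		throw new Error("Pass only one of --gemini or --openai")
	}
	return openai ? "openai" : "gemini"
}

// Serialize one conversation as a single JSONL line in the chosen format.
function formatExample(turns: Turn[], format: DatasetFormat): string {
	if (format === "openai") {
		return JSON.stringify({
			messages: turns.map((t) => ({ role: t.role, content: t.text })),
		})
	}
	return JSON.stringify({
		contents: turns.map((t) => ({
			role: t.role === "assistant" ? "model" : "user",
			parts: [{ text: t.text }],
		})),
	})
}
```

A caller could combine the two as `formatExample(turns, selectFormat(process.argv.slice(2)))`.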