-
Notifications
You must be signed in to change notification settings - Fork 747
feat(nep): Data Instrumentation #7109
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(nep): Data Instrumentation #7109
Conversation
|
| @@ -0,0 +1,118 @@ | |||
| /*! | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this these NEP modules live in the same dir as the other "trackers"?
https://github.com/aws/aws-toolkit-vscode/tree/master/packages/core/src/codewhisperer/tracker
are they really not sharing any concepts at all? this PR is all new code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They are not shared IMO, the existing tracker collects statistical metrics for telemetry, the NEP tracker tracks content changes. Their hyper parameters are also very different.
Since the new tracker is populating the supplementalContext field, maybe we can also consider https://github.com/aws/aws-toolkit-vscode/tree/master/packages/core/src/codewhisperer/util/supplementalContext?
opieter-aws
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is clean up of snapshots not needed when the extension is closed?
packages/core/src/codewhisperer/service/recommendationHandler.ts
Outdated
Show resolved
Hide resolved
packages/core/src/codewhisperer/nextEditPrediction/PredictionTracker.ts
Outdated
Show resolved
Hide resolved
packages/core/src/codewhisperer/nextEditPrediction/PredictionTracker.ts
Outdated
Show resolved
Hide resolved
packages/core/src/codewhisperer/nextEditPrediction/PredictionTracker.ts
Outdated
Show resolved
Hide resolved
packages/core/src/codewhisperer/nextEditPrediction/PredictionTracker.ts
Outdated
Show resolved
Hide resolved
packages/core/src/codewhisperer/nextEditPrediction/PredictionTracker.ts
Outdated
Show resolved
Hide resolved
packages/core/src/codewhisperer/nextEditPrediction/PredictionTracker.ts
Outdated
Show resolved
Hide resolved
packages/core/src/codewhisperer/nextEditPrediction/PredictionTracker.ts
Outdated
Show resolved
Hide resolved
packages/core/src/codewhisperer/nextEditPrediction/PredictionTracker.ts
Outdated
Show resolved
Hide resolved
| let localize: nls.LocalizeFunc | ||
|
|
||
| export async function activate(context: ExtContext): Promise<void> { | ||
| activateNextEditPrediction(context) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-
I would move this to the bottom of the function since the beginning is for more general activation (eg auth).
-
Also I think it is better to have this in
activateAmazonQCommon(). We want to move as much Q specific code in to their so that we can reduce our dependency oncore. You may need to export this in one of theindex.ts, seeamazonQDiffSchemeand how it is exported
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit : Lets scope the namining to EditTracking
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if this needs to have it's own index.ts and export it to QCommon, since it's not a stand-alone feature, but rather an addition on existing codeWhisper API flow. Will move it to the bottom of codeWhisper activation.
packages/core/src/codewhisperer/nextEditPrediction/PredictionTracker.ts
Outdated
Show resolved
Hide resolved
packages/core/src/codewhisperer/nextEditPrediction/PredictionTracker.ts
Outdated
Show resolved
Hide resolved
packages/core/src/codewhisperer/nextEditPrediction/PredictionTracker.ts
Outdated
Show resolved
Hide resolved
| predictionSupplementalContext = await predictionTracker.generatePredictionSupplementalContext() | ||
| } | ||
| } catch (error) { | ||
| getLogger().error(`Error getting prediction supplemental context: ${error}`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You have the final call, but is it dangerous to swallow the error here? If a user is having issues we will never have any telemetry for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not dangerous, we are collecting initial data, and from the bug-bash no issues has been found. Telemetry will come later in this project.
The main thing is if anything goes wrong it doesn't crash the inline completions flow, hence we swallow it here.
dk19y
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given this codebase will move to Flare eventually. Can we do a minimal refactor this to make the migration straight forward ?
APIs which may not have a direct mapping
| VS Code API | LSP Equivalent | Notes |
|---|---|---|
vscode.window.onDidChangeVisibleTextEditors |
No direct equivalent in LSP | LSP focuses on opened/closed documents (didOpen, didClose) not "visibility". You may need to track didOpen/didClose and assume editors are visible. |
vscode.workspace.onDidChangeTextDocument |
textDocument/didChange (notification) |
LSP server receives textDocument/didChange to reflect edits in documents. |
vscode.window.visibleTextEditors |
No direct equivalent in LSP | LSP does not track visible editors. It only tracks open documents via didOpen, didClose, didChange. |
vscode.window.activeTextEditor |
No direct equivalent in LSP | LSP does not track which document is active. It only manages documents' content and lifecycles. (The client — e.g., VS Code — knows which one is active.) |
Also lets not use fs APIs and rely on message passing as-much as we can.
| let localize: nls.LocalizeFunc | ||
|
|
||
| export async function activate(context: ExtContext): Promise<void> { | ||
| activateNextEditPrediction(context) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit : Lets scope the namining to EditTracking
packages/core/src/codewhisperer/nextEditPrediction/PredictionTracker.ts
Outdated
Show resolved
Hide resolved
packages/core/src/codewhisperer/nextEditPrediction/PredictionTracker.ts
Outdated
Show resolved
Hide resolved
|
/runIntegrationTests |
packages/core/src/codewhisperer/nextEditPrediction/PredictionKeyStrokeHandler.ts
Show resolved
Hide resolved
packages/core/src/codewhisperer/nextEditPrediction/PredictionKeyStrokeHandler.ts
Show resolved
Hide resolved
## Problem The science & service team in Next Edit Prediction (NEP) project needs initial data for model training and tuning, to provide more contextually relevant suggestions. This requires us to track user edits in the IDE, and send changes (of current active file in unified diff format) to the codeWhisper API. **No impact on user experience.** **This won't live forever, will be migrated to flare by end of May, needed now for science data collection.** ## Solution ### Key Components `PredictionKeyStrokeHandler`: Listens for document changes, maintains shadow copies of visible documents, and processes edits. `PredictionTracker`: Manages file snapshots, implementing a policy for storing, retrieving, and pruning snapshots based on age and memory constraints. `DiffGenerator`: Creates unified diffs between file snapshots, produces `supplementalContext` sent to the API. ### How it Works - The system track shadow copies of editor visible files' content - Once an edit is made to a tracked file, it takes a snapshot of the file content before the edit - When the Inline API fires, the snapshots of the current editing files are used to generate diff context ### Memory management - maxTotalSizeKb (default: 5000): Caps total size of snapshots storage at ~5MB, purging oldest snapshots when exceeded. - debounceIntervalMs (default: 2000): Prevents excessive snapshots by requiring 2 seconds between captures for the same file. - maxAgeMs (default: 30000): Auto-deletes snapshots after 30 seconds to maintain recent-only history. - maxSupplementalContext (default: 15): Limits `supplementalContext` sent to API to 15 entries maximum. ### Changes - Added new NextEditPrediction module in the CodeWhisperer package - Updated activation code to initialize the NEP system - Updated codeWhisper inline API requests to fit new format --- - Treat all work as PUBLIC. Private `feature/x` branches will not be squash-merged at release time. - Your code changes must meet the guidelines in [CONTRIBUTING.md](https://github.com/aws/aws-toolkit-vscode/blob/master/CONTRIBUTING.md#guidelines). - License: I confirm that my contribution is made under the terms of the Apache 2.0 license.
Problem
The science & service team in Next Edit Prediction (NEP) project needs initial data for model training and tuning, to provide more contextually relevant suggestions.
This requires us to track user edits in the IDE, and send changes (of current active file in unified diff format) to the codeWhisper API.
No impact on user experience.
This won't live forever, will be migrated to flare by end of May, needed now for science data collection.
Solution
Key Components
PredictionKeyStrokeHandler: Listens for document changes, maintains shadow copies of visible documents, and processes edits.PredictionTracker: Manages file snapshots, implementing a policy for storing, retrieving, and pruning snapshots based on age and memory constraints.DiffGenerator: Creates unified diffs between file snapshots, producessupplementalContextsent to the API.How it Works
Memory management
supplementalContextsent to API to 15 entries maximum.Changes
feature/xbranches will not be squash-merged at release time.