Commit 17e3b6b

fix(telemetry): Crash monitoring fixes (#5741)
## Problem

Crash monitoring is reporting incorrect crash metrics. This seems to be due to various filesystem errors such as `EPERM` (even though we were operating on a file we created), `ENOSPC` (the user ran out of space on their machine), and others. Because of these errors, our state sometimes did not reflect reality, and as a result certain extension instances were incorrectly seen as crashed.

## Solution

- Determine whether the filesystem on a machine is reliable (try a number of different filesystem flows and ensure nothing throws). Only if it is reliable do we start the crash monitoring process. Otherwise we do not run it, since we cannot rely on it being accurate.
- Add a `function_call` metric so that we can determine the ratio of successes to failures.
- Add retries to critical filesystem operations, such as the heartbeats and deleting a crashed extension instance from the state.
- Various other fixes.

---

License: I confirm that my contribution is made under the terms of the Apache 2.0 license.

---------

Signed-off-by: nkomonen-amazon <[email protected]>
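The reliability check and the retries described in the Solution can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: `withRetries`, `isFilesystemReliable`, the probe file names, and the retry counts and delays are all hypothetical, and the set of filesystem flows the real check exercises is not shown on this page.

```typescript
import * as fs from 'fs/promises'
import * as os from 'os'
import * as path from 'path'

/** Hypothetical helper: run `fn` up to `attempts` times, waiting `delayMs` between tries. */
async function withRetries<T>(fn: () => Promise<T>, attempts = 3, delayMs = 100): Promise<T> {
    let lastError: unknown
    for (let i = 0; i < attempts; i++) {
        try {
            return await fn()
        } catch (e) {
            lastError = e
            if (i < attempts - 1) {
                await new Promise((resolve) => setTimeout(resolve, delayMs))
            }
        }
    }
    // Every attempt failed: surface the most recent error to the caller.
    throw lastError
}

/**
 * Hypothetical probe: exercise a few basic filesystem flows inside `dir`.
 * Returns false if any step throws (e.g. EPERM, ENOSPC) or the data read back is wrong.
 */
async function isFilesystemReliable(dir: string): Promise<boolean> {
    const probeDir = path.join(dir, `fs-probe-${process.pid}-${Date.now()}`)
    const file = path.join(probeDir, 'probe.txt')
    const renamed = path.join(probeDir, 'probe-renamed.txt')
    try {
        await fs.mkdir(probeDir, { recursive: true })
        await fs.writeFile(file, 'heartbeat')
        const content = await fs.readFile(file, 'utf8')
        await fs.rename(file, renamed)
        await fs.rm(renamed)
        return content === 'heartbeat'
    } catch {
        return false
    } finally {
        // Best-effort cleanup; a failure here should not change the verdict.
        await fs.rm(probeDir, { recursive: true, force: true }).catch(() => {})
    }
}

async function main(): Promise<void> {
    if (!(await isFilesystemReliable(os.tmpdir()))) {
        console.log('Filesystem looks unreliable; skipping crash monitoring')
        return
    }
    // Critical operations (here, a stand-in heartbeat write) go through the retry wrapper.
    const heartbeatFile = path.join(os.tmpdir(), `heartbeat-${process.pid}.txt`)
    await withRetries(() => fs.writeFile(heartbeatFile, String(Date.now())))
    await withRetries(() => fs.rm(heartbeatFile, { force: true }))
    console.log('Heartbeat written with retries')
}

main().catch(console.error)
```

Gating the whole feature on a single up-front probe trades coverage for accuracy: machines that fail the probe report no crash metrics at all, which matches the stated goal of not emitting numbers that cannot be trusted.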
1 parent c50ca80 commit 17e3b6b

12 files changed: +463 -171 lines changed

packages/amazonq/src/extensionNode.ts

Lines changed: 3 additions & 2 deletions
@@ -32,7 +32,8 @@ export async function activate(context: vscode.ExtensionContext) {
  * the code compatible with web and move it to {@link activateAmazonQCommon}.
  */
 async function activateAmazonQNode(context: vscode.ExtensionContext) {
-    await (await CrashMonitoring.instance()).start()
+    // Intentionally do not await since this is slow and non-critical
+    void (await CrashMonitoring.instance())?.start()
 
     const extContext = {
         extensionContext: context,
@@ -96,5 +97,5 @@ async function setupDevMode(context: vscode.ExtensionContext) {
 
 export async function deactivate() {
     // Run concurrently to speed up execution. stop() does not throw so it is safe
-    await Promise.all([(await CrashMonitoring.instance()).stop(), deactivateCommon()])
+    await Promise.all([(await CrashMonitoring.instance())?.shutdown(), deactivateCommon()])
 }

packages/core/src/extensionNode.ts

Lines changed: 3 additions & 2 deletions
@@ -78,7 +78,8 @@ export async function activate(context: vscode.ExtensionContext) {
         // IMPORTANT: If you are doing setup that should also work in web mode (browser), it should be done in the function below
         const extContext = await activateCommon(context, contextPrefix, false)
 
-        await (await CrashMonitoring.instance()).start()
+        // Intentionally do not await since this can be slow and non-critical
+        void (await CrashMonitoring.instance())?.start()
 
         initializeCredentialsProviderManager()
 
@@ -254,7 +255,7 @@ export async function activate(context: vscode.ExtensionContext) {
 
 export async function deactivate() {
     // Run concurrently to speed up execution. stop() does not throw so it is safe
-    await Promise.all([await (await CrashMonitoring.instance()).stop(), deactivateCommon()])
+    await Promise.all([await (await CrashMonitoring.instance())?.shutdown(), deactivateCommon()])
     await globals.resourceManager.dispose()
 }
 
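The two call-site changes in the hunks above (not awaiting `start()`, and using `?.` because `instance()` can apparently now resolve to no instance) can be illustrated with a small stand-alone sketch. `Monitor` is a hypothetical stand-in, not the real `CrashMonitoring` class; the `enabled` flag and the `events` log exist only for this illustration.

```typescript
// Hypothetical stand-in for CrashMonitoring; not the real class.
class Monitor {
    static events: string[] = []

    /** Resolves to undefined when monitoring should not run (e.g. unreliable filesystem). */
    static async instance(enabled: boolean): Promise<Monitor | undefined> {
        return enabled ? new Monitor() : undefined
    }

    async start(): Promise<void> {
        // Simulate a slow startup.
        await new Promise((resolve) => setTimeout(resolve, 50))
        Monitor.events.push('started')
    }

    async shutdown(): Promise<void> {
        Monitor.events.push('shutdown')
    }
}

async function activate(enabled: boolean): Promise<void> {
    // `void` marks the promise as intentionally not awaited, so activation does not
    // wait for the slow start. `?.` skips the call entirely when there is no instance.
    // This is only safe if start() handles its own errors; a rejection here would be unhandled.
    void (await Monitor.instance(enabled))?.start()
    Monitor.events.push('activated')
}

async function deactivate(enabled: boolean): Promise<void> {
    // Promise.all accepts non-promise values, so an `undefined` entry
    // (no instance) is simply passed through.
    await Promise.all([(await Monitor.instance(enabled))?.shutdown(), Promise.resolve()])
}
```

One detail worth noting when reading the `packages/core` hunk: an extra `await` on an element inside the array literal is resolved before the next element is evaluated, so in that form the shutdown would finish before `deactivateCommon()` is even called, and the two would not actually run concurrently.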
