Skip to content

Commit 65575a0

Browse files
authored
perf: optimize telemetry initialization and reduce startup overhead (#3620)
* perf: optimize telemetry initialization and reduce startup overhead This commit significantly improves CLI startup performance by optimizing how OpenTelemetry and Sentry telemetry is initialized and sent. Key improvements: - Remove upfront initializeInstrumentation() call to eliminate 28,000+ require-in-the-middle hook registrations on every command - Implement lazy initialization of OpenTelemetry during exit handler - Separate Sentry initialization to only load when errors occur - Cache auth token to avoid recreating Config/APIClient instances - Remove empty instrumentation registration that added overhead - Consolidate platform checks into isTelemetryEnabled() utility - Fix duplicate setupTelemetry() calls that overwrote timing data - Batch file existence checks in analytics using Promise.all - Remove blocking await in beforeExit handler - Add analytics-telemetry debug scope for troubleshooting Performance results: - 17-21% faster startup time (0.770s -> 0.608s) - Zero require-in-the-middle overhead (28,003 lines -> 0) - HTTP calls still made successfully to Honeycomb/Sentry - All telemetry data identical to production build The telemetry functionality remains fully intact - we've only optimized when and how the infrastructure is initialized. * feat: add finally hook to report command errors to Sentry - Created finally hook that sends command errors to Sentry and Honeycomb - Filters out 4xx client errors (user errors) to reduce noise - Only reports 5xx server errors and internal CLI exceptions - Restored missing sentryClient variable in global_telemetry.ts * fix: filter out command_not_found errors from Sentry Command not found errors are user typos, not bugs. Filter them out by: - Matching the error message pattern - Checking for exit code 127 (standard "command not found" code) * perf: spawn telemetry worker process for non-blocking collection Implemented background worker process for telemetry to eliminate blocking: - Created telemetry_worker.ts that handles all OpenTelemetry/Sentry initialization - Spawn detached worker process via stdin for data transfer - Main CLI exits immediately without waiting for HTTP requests - Worker inherits stderr for DEBUG=analytics-telemetry visibility Performance improvement: ~25% faster (0.608s → 0.45s) Also enhanced debug output: - Added payload logging for Honeycomb and Sentry - Shows full telemetry data structure before sending - Helps verify data and troubleshoot issues * refactor: move telemetry files to lib/analytics-telemetry Organized telemetry code into its own directory to avoid confusion with src/commands/telemetry: - Moved global_telemetry.ts to lib/analytics-telemetry/global-telemetry.ts - Moved telemetry_worker.ts to lib/analytics-telemetry/telemetry-worker.ts - Renamed files to use hyphens instead of underscores - Updated all import paths across hooks, bin/run.js, and tests - Fixed process.exit linter rule for telemetry-worker.ts * refactor: modularize telemetry code and add comprehensive tests Split monolithic global-telemetry.ts (357 lines) into focused modules: - telemetry-utils.ts: Shared utilities, types, and helpers - honeycomb-client.ts: OpenTelemetry/Honeycomb integration - sentry-client.ts: Sentry error reporting - global-telemetry.ts: Thin orchestrator (123 lines) - worker-client.ts: Background worker process management Refactored bin/run.js from 116 to 34 lines by extracting telemetry setup and signal handlers into worker-client.ts. Added comprehensive test coverage (32 tests): - test/unit/analytics-telemetry/telemetry-utils.unit.test.ts (12 tests) - test/unit/analytics-telemetry/honeycomb-client.unit.test.ts (6 tests) - test/unit/analytics-telemetry/sentry-client.unit.test.ts (5 tests) - test/unit/analytics-telemetry/global-telemetry.unit.test.ts (9 tests) Benefits: - Single responsibility per module - Easier to test in isolation - Easier to maintain and understand - Public API unchanged (backward compatible) * chore: remove old global_telemetry test file The tests have been replaced with new modular tests in test/unit/analytics-telemetry/ * refactor: replace 'any' types with proper TypeScript types in telemetry Improvements: - Extended CLIError interface with all error properties (code, statusCode, http, oclif) - Added TelemetryData union type for Telemetry | CLIError - Added TelemetryGlobal interface for global.cliTelemetry - Added TelemetryOptions interface for hook options - Used Config type from @oclif/core/interfaces instead of 'any' - Replaced all '(data as any)' casts with proper CLIError type - Replaced all '(global as any)' casts with proper global typing - Added proper declare global block in worker-client.ts All 32 tests passing with improved type safety. * revert: restore analytics.ts to main branch version The changes to analytics.ts (reformatting and Promise.all optimization) were unintentional and not relevant to this telemetry refactoring PR. Reverted to keep the PR focused on telemetry-specific improvements. * fix: add Windows compatibility for telemetry worker process Add windowsHide: true option to spawn() calls to prevent console windows from appearing on Windows when telemetry is explicitly enabled. This ensures a better user experience for Windows users who enable telemetry with ENABLE_WINDOWS_TELEMETRY=true. Changes: - Added windowsHide: true to worker-client.ts spawn options - Added windowsHide: true to finally hook spawn options - Prevents console window flash on Windows - No impact on Unix/macOS behavior * refactor: remove unnecessary re-exports from global-telemetry Remove all re-exports from global-telemetry.ts and have consumers import utilities directly from telemetry-utils.ts where appropriate. Changes: - Removed re-exports of getProcessor, initializeInstrumentation - Removed re-exports of ensureSentryInitialized - Removed re-exports of computeDuration, isTelemetryEnabled - Removed re-exports of types (CLIError, Telemetry, TelemetryGlobal) Updated imports: - bin/run.js: Import computeDuration from telemetry-utils - All hooks: Import isTelemetryEnabled from telemetry-utils - Hooks still import orchestrator functions from global-telemetry (setupTelemetry, reportCmdNotFound) Updated tests: - Removed duplicate computeDuration and isTelemetryEnabled tests from global-telemetry.unit.test.ts (already tested in telemetry-utils.unit.test.ts) - Removed unused sinon imports - Test file now only tests orchestrator functions Windows compatibility: - Added windowsHide: true to sentry.ts spawn call to prevent console windows on Windows Benefits: - Clearer API surface - global-telemetry only exports what it owns - No unnecessary indirection for utility functions - Makes it obvious which functions are orchestrators vs utilities - Better test organization - tests colocated with implementations - Reduces coupling between modules The public API of global-telemetry now consists only of: - setupTelemetry() - reportCmdNotFound() - sendTelemetry() * include oclif:perf with oclif:* * refactor: consolidate duplicate telemetry functions Move serializeTelemetryData and spawnTelemetryWorker functions from worker-client.ts and sentry.ts to telemetry-utils.ts to eliminate duplication. Keep isUserError in sentry.ts since it's only used there for Sentry-specific error filtering.
1 parent bde5a5c commit 65575a0

19 files changed

+1145
-490
lines changed

bin/run.js

Lines changed: 13 additions & 49 deletions
Original file line numberDiff line numberDiff line change
@@ -1,11 +1,11 @@
11
#!/usr/bin/env -S node --no-deprecation
2-
/* eslint-disable n/no-process-exit */
2+
33
/* eslint-disable n/no-unpublished-bin */
44

55
import {execute, settings} from '@oclif/core'
66

7-
// Enable performance tracking when DEBUG=oclif:perf or DEBUG=* is set
8-
if (process.env.DEBUG?.includes('oclif:perf') || process.env.DEBUG === '*') {
7+
// Enable performance tracking when oclif:perf is specified in DEBUG
8+
if (process.env.DEBUG?.includes('oclif:perf') || process.env.DEBUG === 'oclif:*' || process.env.DEBUG === '*') {
99
settings.performanceEnabled = true
1010
}
1111

@@ -16,54 +16,18 @@ const cliStartTime = now.getTime()
1616

1717
// Skip telemetry entirely on Windows for performance (unless explicitly enabled)
1818
const enableTelemetry = process.platform !== 'win32' || process.env.ENABLE_WINDOWS_TELEMETRY === 'true'
19-
let globalTelemetry
20-
21-
if (enableTelemetry) {
22-
// Dynamically import telemetry only when needed
23-
globalTelemetry = await import('../dist/global_telemetry.js')
24-
}
25-
26-
process.once('beforeExit', async code => {
27-
if (!enableTelemetry) return
28-
29-
// capture as successful exit
30-
if (global.cliTelemetry) {
31-
if (global.cliTelemetry.isVersionOrHelp) {
32-
const cmdStartTime = global.cliTelemetry.commandRunDuration
33-
global.cliTelemetry.commandRunDuration = globalTelemetry.computeDuration(cmdStartTime)
34-
}
35-
36-
global.cliTelemetry.exitCode = code
37-
global.cliTelemetry.cliRunDuration = globalTelemetry.computeDuration(cliStartTime)
38-
const telemetryData = global.cliTelemetry
39-
await globalTelemetry.sendTelemetry(telemetryData)
40-
}
41-
})
42-
43-
process.on('SIGINT', () => {
44-
if (enableTelemetry) {
45-
// Fire-and-forget: attempt to send telemetry but don't block exit
46-
const error = new Error('Received SIGINT')
47-
error.cliRunDuration = globalTelemetry.computeDuration(cliStartTime)
48-
globalTelemetry.sendTelemetry(error).catch(() => {})
49-
}
50-
51-
process.exit(1)
52-
})
53-
54-
process.on('SIGTERM', () => {
55-
if (enableTelemetry) {
56-
// Fire-and-forget: attempt to send telemetry but don't block exit
57-
const error = new Error('Received SIGTERM')
58-
error.cliRunDuration = globalTelemetry.computeDuration(cliStartTime)
59-
globalTelemetry.sendTelemetry(error).catch(() => {})
60-
}
61-
62-
process.exit(1)
63-
})
6419

6520
if (enableTelemetry) {
66-
globalTelemetry.initializeInstrumentation()
21+
// Dynamically import telemetry modules
22+
const {setupTelemetryHandlers} = await import('../dist/lib/analytics-telemetry/worker-client.js')
23+
const {computeDuration} = await import('../dist/lib/analytics-telemetry/telemetry-utils.js')
24+
25+
// Setup all telemetry handlers (beforeExit, SIGINT, SIGTERM)
26+
setupTelemetryHandlers({
27+
cliStartTime,
28+
computeDuration,
29+
enableTelemetry,
30+
})
6731
}
6832

6933
await execute({dir: import.meta.url})

package.json

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -184,6 +184,9 @@
184184
"command_not_found": [
185185
"./dist/hooks/command_not_found/performance_analytics"
186186
],
187+
"finally": [
188+
"./dist/hooks/finally/sentry"
189+
],
187190
"init": [
188191
"./dist/hooks/init/version",
189192
"./dist/hooks/init/terms-of-service",

src/global_telemetry.ts

Lines changed: 0 additions & 275 deletions
This file was deleted.
Lines changed: 6 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -1,13 +1,15 @@
11
import {Hook} from '@oclif/core/hooks'
22

33
const performance_analytics: Hook<'command_not_found'> = async function () {
4-
// Skip telemetry on Windows for performance (unless explicitly enabled)
5-
if (process.platform === 'win32' && process.env.ENABLE_WINDOWS_TELEMETRY !== 'true') {
4+
const {isTelemetryEnabled} = await import('../../lib/analytics-telemetry/telemetry-utils.js')
5+
const {reportCmdNotFound} = await import('../../lib/analytics-telemetry/global-telemetry.js')
6+
7+
// Use the consolidated telemetry check
8+
if (!isTelemetryEnabled()) {
69
return
710
}
811

9-
const telemetry = await import('../../global_telemetry.js');
10-
(global as any).cliTelemetry = telemetry.reportCmdNotFound(this.config)
12+
(global as any).cliTelemetry = reportCmdNotFound(this.config)
1113
}
1214

1315
export default performance_analytics

0 commit comments

Comments
 (0)