Skip to content

Commit 2f5ff3a

Browse files
feat: telemetry stack trace - runWithCallEntry() (#5442)
See documentation in this commit for explanation of Problem + Solution for this feature. Signed-off-by: Nikolas Komonen <[email protected]>
1 parent 5bb74e4 commit 2f5ff3a

File tree

7 files changed

+476
-19
lines changed

7 files changed

+476
-19
lines changed

docs/telemetry.md

Lines changed: 121 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -139,3 +139,124 @@ Finally, if `setupStep2()` was the thing that failed we would see a metric like:
139139
...
140140
}
141141
```
142+
143+
## Adding a "Stack Trace" to your metric
144+
145+
### Problem
146+
147+
Common example: _"I have a function, `thisFailsSometimes()` that is called in multiple places. The function sometimes fails, I know from telemetry, but I do not know if it is failing when it is a specific caller. If I knew the call stack/trace that it took to call my function that would help me debug."_
148+
149+
```typescript
150+
function outerA() {
151+
thisFailsSometimes(1) // this succeeds
152+
}
153+
154+
function outerB() {
155+
thisFailsSometimes(0) // this fails
156+
}
157+
158+
function thisFailsSometimes(num: number) {
159+
return telemetry.my_Metric.run(() => {
160+
if (number === 0) {
161+
throw Error('Cannot be 0')
162+
}
163+
...
164+
})
165+
}
166+
```
167+
168+
### Solution
169+
170+
Add a value to `function` in the options of a `run()`. This will result in a stack of functions identifiers that were previously called
171+
before `thisFailsSometimes()` was run. You can then retrieve the stack in the `run()` of your final metric using `getFunctionStack()`.
172+
173+
```typescript
174+
function outerA() {
175+
telemetry.my_Metric.run(() => thisFailsSometimes(1), { functionId: { name: 'outerA' }})
176+
}
177+
178+
function outerB() {
179+
telemetry.my_Metric.run(() => thisFailsSometimes(0), { functionId: { source: 'outerB' }})
180+
}
181+
182+
function thisFailsSometimes(num: number) {
183+
return telemetry.my_Metric.run(() => {
184+
telemetry.record({ theCallStack: asStringifiedStack(telemetry.getFunctionStack())})
185+
if (number === 0) {
186+
throw Error('Cannot be 0')
187+
}
188+
...
189+
}, { functionId: { name: 'thisFailsSometimes' }})
190+
}
191+
192+
// Results in a metric: { theCallStack: 'outerB:thisFailsSometimes', result: 'Failed' }
193+
// { theCallStack: 'outerB:thisFailsSometimes' } implies 'outerB' was run first, then 'thisFailsSometimes'. See docstrings for more info.
194+
outerB()
195+
```
196+
197+
### Important Notes
198+
199+
- If a nested function does not use a `run()` then it will not be part of the call stack.
200+
201+
```typescript
202+
function a() {
203+
return telemetry.my_Metric.run(() => b(), { functionId: { name: 'a' } })
204+
}
205+
206+
function b() {
207+
return c()
208+
}
209+
210+
function c() {
211+
return telemetry.my_Metric.run(() => asStringifiedStack(telemetry.getFunctionStack()), {
212+
functionId: { name: 'c' },
213+
})
214+
}
215+
216+
c() // result: 'a:c', note that 'b' is not included
217+
```
218+
219+
- If you are using `run()` with a class method, you can also add the class to the entry for more context
220+
221+
```typescript
222+
class A {
223+
a() {
224+
return telemetry.my_Metric.run(() => this.b(), { functionId: { name: 'a', class: 'A' } })
225+
}
226+
227+
b() {
228+
return telemetry.my_Metric.run(() => asStringifiedStack(telemetry.getFunctionStack()), {
229+
functionId: { name: 'b', class: 'A' },
230+
})
231+
}
232+
}
233+
234+
const inst = new A()
235+
inst.a() // 'A#a,b'
236+
```
237+
238+
- If you do not want your `run()` to emit telemetry, set `emit: false` in the options
239+
240+
```typescript
241+
function a() {
242+
return telemetry.my_Metric.run(() => b(), { functionId: { name: 'a' }, emit: false })
243+
}
244+
```
245+
246+
- If you want to add to the function stack, but don't have a specific metric to use,
247+
use the metric named `function_call`. Also look to set `emit: false` if you do not
248+
want it to emit telemetry.
249+
250+
```typescript
251+
function a() {
252+
return telemetry.function_call.run(() => b(), { functionId: { name: 'a' }, emit: false })
253+
}
254+
```
255+
256+
- If your function name is generic, look to make it unique so there is no confusion.
257+
258+
```typescript
259+
function a() {
260+
return telemetry.my_Metric.run(() => b(), { functionId: { name: 'aButMoreUnique' } })
261+
}
262+
```

packages/core/src/shared/telemetry/spans.ts

Lines changed: 154 additions & 14 deletions
Original file line numberDiff line numberDiff line change
@@ -90,6 +90,40 @@ function getValidatedState(state: Partial<MetricBase>, definition: MetricDefinit
9090
return missingFields.length !== 0 ? Object.assign({ missingFields }, state) : state
9191
}
9292

93+
/**
94+
* Options used for the creation of a span
95+
*/
96+
export type SpanOptions = {
97+
/** True if this span should emit its telemetry events. Defaults to true if undefined. */
98+
emit?: boolean
99+
100+
/**
101+
* Adds a function entry to the span stack.
102+
*
103+
* This allows you to eventually retrieve the function entry stack by using {@link TelemetryTracer.getFunctionStack()},
104+
* which tells you the chain of function executions to bring you to that point in the code.
105+
*
106+
* Example:
107+
* ```
108+
* function a() {
109+
* telemetry.your_Metric.run(() => b(), { functionId: { name: 'a'} })
110+
* }
111+
*
112+
* function b() {
113+
* telemetry.your_Metric.run(() => c(), { functionId: { name: 'b'} })
114+
* }
115+
*
116+
* function c() {
117+
* telemetry.your_Metric.run(() => {
118+
* const stack = telemetry.getFunctionStack()
119+
* console.log(stack) // [ {source: 'a' }, { source: 'b' }, { source: 'c' }]
120+
* }, { functionId: { name: 'c'} })
121+
* }
122+
* ```
123+
*/
124+
functionId?: FunctionEntry
125+
}
126+
93127
/**
94128
* A span represents a "unit of work" captured for logging/telemetry.
95129
* It can contain other spans recursively, then it's called a "trace" or "flow".
@@ -99,6 +133,7 @@ function getValidatedState(state: Partial<MetricBase>, definition: MetricDefinit
99133
*/
100134
export class TelemetrySpan<T extends MetricBase = MetricBase> {
101135
#startTime?: Date
136+
#options: SpanOptions
102137

103138
private readonly state: Partial<T> = {}
104139
private readonly definition = definitions[this.name] ?? {
@@ -113,7 +148,17 @@ export class TelemetrySpan<T extends MetricBase = MetricBase> {
113148
*/
114149
static readonly #excludedFields = ['passive', 'value']
115150

116-
public constructor(public readonly name: string) {}
151+
public constructor(
152+
public readonly name: string,
153+
options?: SpanOptions
154+
) {
155+
// set defaults on undefined options
156+
this.#options = {
157+
// do emit by default
158+
emit: options?.emit === undefined ? true : options.emit,
159+
functionId: options?.functionId,
160+
}
161+
}
117162

118163
public get startTime(): Date | undefined {
119164
return this.#startTime
@@ -124,6 +169,10 @@ export class TelemetrySpan<T extends MetricBase = MetricBase> {
124169
return this
125170
}
126171

172+
public getFunctionEntry(): Readonly<FunctionEntry> | undefined {
173+
return this.#options.functionId
174+
}
175+
127176
public emit(data?: Partial<T>): void {
128177
const state = getValidatedState({ ...this.state, ...data }, this.definition)
129178
const metadata = Object.entries(state)
@@ -157,14 +206,16 @@ export class TelemetrySpan<T extends MetricBase = MetricBase> {
157206
public stop(err?: unknown): void {
158207
const duration = this.startTime !== undefined ? globals.clock.Date.now() - this.startTime.getTime() : undefined
159208

160-
this.emit({
161-
duration,
162-
result: getTelemetryResult(err),
163-
reason: getTelemetryReason(err),
164-
reasonDesc: getTelemetryReasonDesc(err),
165-
requestId: getRequestId(err),
166-
httpStatusCode: getHttpStatusCode(err),
167-
} as Partial<T>)
209+
if (this.#options.emit) {
210+
this.emit({
211+
duration,
212+
result: getTelemetryResult(err),
213+
reason: getTelemetryReason(err),
214+
reasonDesc: getTelemetryReasonDesc(err),
215+
requestId: getRequestId(err),
216+
httpStatusCode: getHttpStatusCode(err),
217+
} as Partial<T>)
218+
}
168219

169220
this.#startTime = undefined
170221
}
@@ -256,8 +307,12 @@ export class TelemetryTracer extends TelemetryBase {
256307
* All changes made to {@link attributes} (via {@link record}) during the execution are
257308
* reverted after the execution completes.
258309
*/
259-
public run<T, U extends MetricName>(name: U, fn: (span: Metric<MetricShapes[U]>) => T): T {
260-
const span = this.createSpan(name).start()
310+
public run<T, U extends MetricName>(
311+
name: U,
312+
fn: (span: Metric<MetricShapes[U]>) => T,
313+
options?: SpanOptions | undefined
314+
): T {
315+
const span = this.createSpan(name, options).start()
261316
const frame = this.switchContext(span)
262317

263318
try {
@@ -301,6 +356,32 @@ export class TelemetryTracer extends TelemetryBase {
301356
return this.#context.run(frame, fn)
302357
}
303358

359+
/**
360+
* Returns the stack of all {@link FunctionEntry}s with the 0th
361+
* index being the top level call, and the last index being the final
362+
* nested call.
363+
*
364+
* Ensure that {@link TelemetryTracer.runWithCallEntry()} and/or {@link TelemetrySpan.recordCallEntry()}
365+
* have been used before this method is called, otherwise it will return
366+
* no useful information.
367+
*
368+
* Use {@link asStringifiedStack} to create a stringified version of this stack.
369+
*/
370+
public getFunctionStack(): FunctionEntry[] {
371+
const stack: FunctionEntry[] = []
372+
const endIndex = this.spans.length - 1
373+
let i = endIndex
374+
while (i >= 0) {
375+
const span = this.spans[i]
376+
const entry = span.getFunctionEntry()
377+
if (entry) {
378+
stack.push(entry)
379+
}
380+
i -= 1
381+
}
382+
return stack
383+
}
384+
304385
/**
305386
* Wraps a function with {@link run}.
306387
*
@@ -322,7 +403,7 @@ export class TelemetryTracer extends TelemetryBase {
322403
name,
323404
emit: (data) => getSpan().emit(data),
324405
record: (data) => getSpan().record(data),
325-
run: (fn) => this.run(name as MetricName, fn),
406+
run: (fn, options?: SpanOptions) => this.run(name as MetricName, fn, options),
326407
increment: (data) => getSpan().increment(data),
327408
}
328409
}
@@ -331,8 +412,8 @@ export class TelemetryTracer extends TelemetryBase {
331412
return this.spans.find((s) => s.name === name) ?? this.createSpan(name)
332413
}
333414

334-
private createSpan(name: string): TelemetrySpan {
335-
const span = new TelemetrySpan(name).record(this.attributes ?? {})
415+
private createSpan(name: string, options?: SpanOptions): TelemetrySpan {
416+
const span = new TelemetrySpan(name, options).record(this.attributes ?? {})
336417
if (this.activeSpan && this.activeSpan.name !== rootSpanName) {
337418
return span.record({ parentMetric: this.activeSpan.name } satisfies { parentMetric: string } as any)
338419
}
@@ -349,3 +430,62 @@ export class TelemetryTracer extends TelemetryBase {
349430
return ctx
350431
}
351432
}
433+
434+
/**
435+
* A Function Entry is a single entry in to a stack of Function Entries.
436+
*
437+
* Think of a Function Entry as one entry in the stack trace of an Error.
438+
* So a stack of Function Entries will allows you to build a path of functions.
439+
* This can allow you to trace the path of executions.
440+
*
441+
* In MOST cases, a Function Entry will represent a method/function call, but it is not
442+
* limited to that.
443+
*/
444+
export type FunctionEntry = {
445+
/**
446+
* An identifier that represents the callback. You'll probably want to use the function name.
447+
*/
448+
readonly name: string
449+
450+
/**
451+
* If the source is a method, you'll want to include the class name for better context.
452+
*/
453+
readonly class?: string
454+
}
455+
456+
/**
457+
* Returns a stringified version of the provided {@link ExecutionContext.stack}.
458+
*
459+
* Eg: "TestClassA1#methodA,methodB:TestClassA2#methodX,methodY,thisIsAlsoZ:someFunction"
460+
*
461+
* - '#' separates a class from its methods
462+
* - ',' separates methods of the same class
463+
* - ':' separates classes/functions
464+
* - The call stack goes in order from left to right
465+
* - The first item in the string is the top level, initial caller in the stack
466+
* - The last item is the final caller in the stack
467+
*
468+
* See tests for examples.
469+
*/
470+
export function asStringifiedStack(stack: FunctionEntry[]): string {
471+
let prevEntry: FunctionEntry | undefined
472+
let currString: string = ''
473+
474+
// Iterate over each entry, appending the source and class to the final output string
475+
for (const currEntry of stack) {
476+
const prevClass = prevEntry?.class
477+
const newClass = currEntry.class
478+
479+
if (prevClass && prevClass === newClass) {
480+
// The new class is same as the prev class, so we don't need to add the class since it already exists
481+
currString = `${currString},${currEntry.name}`
482+
} else {
483+
// The new class may be different from the prev class, so start a new subsection, adding the new class if it exists.
484+
currString = `${currString ? currString + ':' : ''}${newClass ? newClass + '#' : ''}${currEntry.name}`
485+
}
486+
487+
prevEntry = currEntry
488+
}
489+
490+
return currString
491+
}

packages/core/src/shared/telemetry/telemetry.ts

Lines changed: 8 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -3,7 +3,7 @@
33
* SPDX-License-Identifier: Apache-2.0
44
*/
55

6-
import { TelemetryTracer } from './spans'
6+
import { SpanOptions, TelemetryTracer } from './spans'
77
import { NumericKeys } from '../utilities/tsUtils'
88

99
// This file makes it so you can import 'telemetry' and not 'telemetry.gen'
@@ -15,9 +15,15 @@ export function millisecondsSince(date: Date): number {
1515

1616
export const telemetry = new TelemetryTracer()
1717

18-
// TODO: move this into `Metric` by updating the codegen
18+
/**
19+
* The following are overrides or additions to the actual './telemetry.gen'
20+
*
21+
* This is not a permanent solution, but a temporary override.
22+
* Look to add any permanent solutions to: https://github.com/aws/aws-toolkit-common/tree/main/telemetry/vscode
23+
*/
1924
declare module './telemetry.gen' {
2025
interface Metric<T extends MetricBase = MetricBase> {
2126
increment(data: { [P in NumericKeys<T>]+?: number }): void
27+
run<U>(fn: (span: this) => U, options?: SpanOptions): U
2228
}
2329
}

packages/core/src/shared/telemetry/vscodeTelemetry.json

Lines changed: 5 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1078,6 +1078,11 @@
10781078
"required": false
10791079
}
10801080
]
1081+
},
1082+
{
1083+
"name": "function_call",
1084+
"description": "Represents a function call",
1085+
"metadata": []
10811086
}
10821087
]
10831088
}

0 commit comments

Comments
 (0)