refactor: added error stack replacement logic #2167

abhilash-sivan · 2025-11-21T06:31:28Z

refs https://jsw.ibm.com/browse/INSTA-63749

Why

By doing this refactor now, we avoid duplicating stack-handling logic across different instrumentations (DB, messaging, protocol, etc.). This makes future changes easier, more consistent, and more maintainable.

What

This PR is the first step toward implementing the “Error only stack trace filtering” feature. It lays the foundation by centralising the logic that sets span stack traces, specifically:

Moving the code that sets span.stack and span.data.technology.error on error cases into a common utility.

Making it easier to apply future filtering or removal of stacks based on configuration or environment variables.

Tasks detailed:

Extracts a common function to handle span stack-trace setting and custom replacements.
Adds logic to overwrite the span’s stack with span.technology.error.stack when relevant.
Leaves the common function open for later conditional filtering or removal based on config / environment variables.
Refactors existing code to call this new function where needed (partial; the instrumentation update is not yet complete).
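
For orientation, the shared helper could look roughly like the sketch below. It is pieced together from the diff fragments quoted in the review of this PR (the `setErrorStack(span, error, technology)` call and the `substring(0, 500)` truncation in `getErrorDetails`). The function bodies, the fallback to `err.message`, and the 500-character cap on `span.stack` are assumptions for illustration, not the merged code.

```javascript
// Illustrative sketch only; not the actual tracingUtil implementation.

// Returns a string describing the error: the stack if present, otherwise the message.
// The 500-character cap mirrors the substring(0, 500) call discussed in the review.
function getErrorDetails(err) {
  if (!err) {
    return undefined;
  }
  return String(err.stack || err.message || err).substring(0, 500);
}

// Sets span.stack and span.data[technology].error in one place.
// `technology` is the key under span.data (e.g. 'pg'), which may differ from the span name.
function setErrorStack(span, error, technology) {
  if (!error) {
    return;
  }

  if (technology && span.data && span.data[technology]) {
    // Keep a value the instrumentation has already set (e.g. a custom HTTP message).
    span.data[technology].error = span.data[technology].error || getErrorDetails(error);
  }

  if (error.stack) {
    // Still recorded as a string here; capped at 500 characters (assumption).
    span.stack = String(error.stack).substring(0, 500);
  }
}

// Usage: the span name is "postgres", but the data lives under span.data.pg.
const exampleSpan = { n: 'postgres', data: { pg: { stmt: 'SELECT 1' } } };
setErrorStack(exampleSpan, new Error('connection refused'), 'pg');
```

Passing the data key explicitly keeps the helper independent of the span name, which matters for instrumentations where the two differ.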

Future Plan (Next PRs)

I plan to split the work into 2–3 follow-up PRs:

Instrumentation Updates (current PR)

Update all existing instrumentations (DB, messaging, protocol, etc.) to use this common stack-setting function.

Add at least one test per category (DB / messaging / protocol) to ensure consistent behavior.

Feature Implementation

Introduce an environment variable / configuration option (INSTANA_STACK_TRACE) to select between: none, error, all.

Implement the conditional logic in the common function, based on this config:

None → do not collect or send stack traces - remove span.stack
Error only → if there is an error, overwrite stack; if not, drop span.stack
All → always collect; if there’s an error, overwrite; otherwise, leave as-is

Add/update tests to cover these modes.
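
The three modes above can be sketched as a small decision function. Everything here is hypothetical: the helper names `readStackTraceMode` and `applyStackTraceMode` are made up for illustration, and falling back to `all` for unset or unknown values is an assumption that the plan does not specify.

```javascript
// Hypothetical helpers; names, signatures and the default are assumptions.
const STACK_TRACE_MODES = ['none', 'error', 'all'];

// Reads the planned INSTANA_STACK_TRACE variable from an env-like object.
function readStackTraceMode(env) {
  const raw = String(env.INSTANA_STACK_TRACE || '').toLowerCase();
  // Assumption: unset or unknown values behave like 'all'.
  return STACK_TRACE_MODES.includes(raw) ? raw : 'all';
}

// Applies the selected mode to a span once it is known whether an error occurred.
function applyStackTraceMode(span, error, mode) {
  const hasErrorStack = Boolean(error && error.stack);

  if (mode === 'none') {
    // None: never send a stack trace.
    delete span.stack;
  } else if (mode === 'error') {
    // Error only: overwrite on error, otherwise drop the stack.
    if (hasErrorStack) {
      span.stack = error.stack;
    } else {
      delete span.stack;
    }
  } else if (hasErrorStack) {
    // All: overwrite on error, otherwise leave span.stack as it is.
    span.stack = error.stack;
  }

  return span;
}
```

Keeping the decision in one function means each instrumentation only has to report whether an error occurred.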

Task list:

Make setError common across all instrumentations.
V8 stack string to stack trace conversion, plus one test.
Refactor the getErrorDetails logic to always return a string for span.data.tech.error.

Optional

Check FUP instr and do not collect stack trace – for performance optimisation.

After implementation:

Public doc update


kirrg001 · 2025-11-24T08:31:09Z

packages/core/src/tracing/instrumentation/databases/pgNative.js

+    // Note: Instead of 'pg', we could've passed exports.spanName if they were the same,
+    //       We can’t use spanName here because for this instr the span name is
+    //       "postgres", but the data is stored under span.data.pg.
+    tracingUtil.setErrorStack(span, error, 'pg');


exports.spanName

packages/core/src/tracing/instrumentation/protocols/httpClient.js

kirrg001 · 2025-11-24T08:42:45Z

packages/core/src/tracing/tracingUtil.js

+
+  if (technology && span.data[technology]) {
+    // for some cases like http, it is already set with custom values and no need to overwrite the message
+    span.data[technology].error = span.data[technology].error || exports.getErrorDetails(error);


IMO a refactoring PR is needed before this PR to make that behavior equal everywhere:

from

data = { error:

to

span.data.x.error = ...

Then you can remove this extra check here. It's too confusing.

kirrg001 · 2025-11-24T08:48:28Z

packages/core/src/tracing/tracingUtil.js

+  if (error && error.stack) {
+    // no need to consider length for error cases, we can send the whole stack trace as per design
+    // TODO: It will be recorded as string, revisit to change structure
+    span.stack = error.stack;


no need to consider length for error cases

Why? 🤔 I think we have to call a new implementation of getErrorDetails

return String(err.stack).substring(0, 500);

And remove the previous impl.

Also

return String(err.stack).substring(0, 500);

500 will be replaced with the config in the last PR.

The configuration we are adding controls the span.stack length, which should support values like 10 (default) or up to 25 (future implementation plan in final PR).
This setting is meant to limit the number of stack frames, so it applies to an array, not a string.

The current logic for span.data.tech.error uses:

String(err.stack).substring(0, 500)

This truncates a string, which cannot be aligned with a stack-frame limit because the data types don’t match. To enforce a limit on the number of frames (10 or 25), we would need getErrorDetails or a replacement function to return stack traces as an array, which requires custom logic to parse V8 stack output into the expected format. Applying a "25-frame limit" to a string is meaningless and would cut off valid information.

The current implementation returns a string because it must fall back to err.message when no err.stack exists (which is valid), and err.message is naturally a string. Thus, the return type must stay a string.

We could apply the limit if error.stack were already an array, but as noted, that wouldn't work consistently because of the fallback case.

So we leave this core logic untouched.

A possible approach for span.stack is:

span.stack = arrayFormatted(error.stack)

Where arrayFormatted() [need to come up with a good name] converts the V8 stack string into an array so we can safely apply the limit. Meanwhile, span.data.tech.error can continue storing the full, stringified error details, truncated to 500 characters (which seems reasonable).
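
A minimal sketch of such a conversion is shown below. `parseV8Stack` is a hypothetical stand-in for the `arrayFormatted()` placeholder, and the frame shape (`method`, `file`, `line`) is an assumption; the format the agent actually expects for span.stack may differ. The parser only handles the two common V8 line shapes and skips frames without a file position.

```javascript
// Hypothetical stand-in for arrayFormatted(); the frame shape is an assumption.
function parseV8Stack(stack, maxFrames) {
  if (typeof stack !== 'string') {
    return [];
  }

  const frames = [];
  for (const rawLine of stack.split('\n')) {
    // Stop as soon as the frame limit is reached (no limit if maxFrames is undefined).
    if (frames.length >= maxFrames) {
      break;
    }

    const line = rawLine.trim();
    if (!line.startsWith('at ')) {
      continue; // skips the "Error: message" header and other non-frame lines
    }

    // Shape 1: "at fn (file:line:col)"   Shape 2: "at file:line:col"
    const withMethod = /^at (.+?) \((.+):(\d+):(\d+)\)$/.exec(line);
    const withoutMethod = /^at (.+):(\d+):(\d+)$/.exec(line);

    if (withMethod) {
      frames.push({ method: withMethod[1], file: withMethod[2], line: Number(withMethod[3]) });
    } else if (withoutMethod) {
      frames.push({ method: '<anonymous>', file: withoutMethod[1], line: Number(withoutMethod[2]) });
    }
    // Frames without a position, such as "at Array.map (<anonymous>)", are skipped.
  }

  return frames;
}

// Usage: limit an error's stack to 10 frames before assigning it to span.stack.
const exampleFrames = parseV8Stack(new Error('example').stack, 10);
```

Because the limit is applied to parsed frames rather than characters, a configured value such as 10 or 25 never cuts a frame in half.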

Also considering

Tracers MAY consider the whole generated stack, disregarding the stack-trace-length configuration, when reporting the span.stack field for erroneous EXIT spans.

Limiting is optional in this scenario.

Agreed:

In this PR we keep substring 500 on error stack.

An upcoming PR will change error stack to frames

span.stack = arrayFormatted(error.stack)

We should not release before the frame limit config is applied to the error stack as well, because customers already use the stack trace length config (which applies to span.stack!).

Recommendation: create a parent branch (feat-stack-trace) and merge the single changes into it.

kirrg001 · 2025-11-24T08:57:40Z

packages/core/src/tracing/instrumentation/databases/mysql.js

      kind: constants.EXIT
    });
    span.b = { s: 1 };
    span.stack = tracingUtil.getStackTrace(instrumentedAccessFunction);


Generate/Create only those stack traces which align strictly with the customer's settings

I am missing an action item in your plan for this rule.

IMO we have to move

span.stack = tracingUtil.getStackTrace(instrumentedAccessFunction);

to the success case.

The getStackTrace function will then either generate the stack trace or not, based on the customer config (last PR).
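
One way to picture this suggestion is the sketch below, in which a stack trace is generated only on the success path and only when the configuration asks for it. The names `captureStack`, `finishSpan` and `config.stackTraceMode` are hypothetical and not part of tracingUtil. The sketch also leaves open where the capture happens relative to the original call site, since a stack captured later (for example inside a callback) no longer contains the caller's frames.

```javascript
// Hypothetical sketch; captureStack, finishSpan and config.stackTraceMode are illustrative names.

// Generates a stack trace only if the configuration asks for one.
function captureStack(referenceFunction, config) {
  if (config.stackTraceMode !== 'all') {
    return undefined; // nothing is generated, so no cost is paid
  }
  const holder = {};
  // V8 API: fills holder.stack, omitting frames above referenceFunction.
  Error.captureStackTrace(holder, referenceFunction);
  return holder.stack;
}

// Called once the instrumented operation has finished.
function finishSpan(span, error, referenceFunction, config) {
  if (error) {
    // Error case: reuse the stack the error already carries.
    if (config.stackTraceMode !== 'none' && error.stack) {
      span.stack = error.stack;
    }
  } else {
    // Success case: only now decide whether a stack trace is generated at all.
    const stack = captureStack(referenceFunction, config);
    if (stack !== undefined) {
      span.stack = stack;
    }
  }
  return span;
}
```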

kirrg001 · 2025-11-24T08:58:25Z

packages/core/src/tracing/instrumentation/databases/mysql.js

  }

  return cls.ns.runAndReturn(() => {
    const span = cls.startSpan({


Do not create stack traces at all for spans which are already filtered out

I am missing an action item for this rule as well.
How do we achieve that?

feat: added error stack replacement logic - 1

ddf5b97

abhilash-sivan changed the title ~~feat: added error stack replacement logic~~ refactor: added error stack replacement logic Nov 21, 2025

abhilash-sivan added 3 commits November 21, 2025 14:08

chore: update

ab18166

refactor: reused the setErrorStack for span.stack and spa.data.techno…

5bb9017

…logy.stack

chore: avoid hardcoding and use span name

ecb81b7

abhilash-sivan force-pushed the feat-stack-strace-filter branch from 625de88 to ecb81b7 Compare November 21, 2025 12:53

abhilash-sivan added 2 commits November 23, 2025 22:06

test: corrected test for stack replacement

e3ae6b6

chore: update pg instr

918ea3d

kirrg001 requested changes Nov 24, 2025
