-
Notifications
You must be signed in to change notification settings - Fork 344
[SVLS-7168] Create inferred Span for GCP Push Subscriptions (including Cloud Events) #6233
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Please enter the commit message for your changes. Lines starting
Overall package sizeSelf size: 11.38 MB Dependency sizes| name | version | self size | total size | |------|---------|-----------|------------| | @datadog/libdatadog | 0.7.0 | 35.02 MB | 35.02 MB | | @datadog/native-appsec | 10.1.0 | 20.37 MB | 20.37 MB | | @datadog/native-iast-taint-tracking | 4.0.0 | 11.72 MB | 11.73 MB | | @datadog/pprof | 5.9.0 | 9.77 MB | 10.14 MB | | @opentelemetry/core | 1.30.1 | 908.66 kB | 7.16 MB | | protobufjs | 7.5.3 | 2.95 MB | 5.6 MB | | @datadog/wasm-js-rewriter | 4.0.1 | 2.85 MB | 3.58 MB | | @datadog/native-metrics | 3.1.1 | 1.02 MB | 1.43 MB | | @opentelemetry/api | 1.8.0 | 1.21 MB | 1.21 MB | | jsonpath-plus | 10.3.0 | 617.18 kB | 1.08 MB | | import-in-the-middle | 1.14.2 | 122.36 kB | 850.93 kB | | lru-cache | 10.4.3 | 804.3 kB | 804.3 kB | | source-map | 0.7.4 | 226 kB | 226 kB | | opentracing | 0.14.7 | 194.81 kB | 194.81 kB | | pprof-format | 2.1.0 | 111.69 kB | 111.69 kB | | @datadog/sketches-js | 2.1.1 | 109.9 kB | 109.9 kB | | lodash.sortby | 4.7.0 | 75.76 kB | 75.76 kB | | ignore | 7.0.5 | 63.38 kB | 63.38 kB | | istanbul-lib-coverage | 3.2.2 | 34.37 kB | 34.37 kB | | rfdc | 1.4.1 | 27.15 kB | 27.15 kB | | dc-polyfill | 0.1.10 | 26.73 kB | 26.73 kB | | @isaacs/ttlcache | 1.4.1 | 25.2 kB | 25.2 kB | | tlhunter-sorted-set | 0.1.0 | 24.94 kB | 24.94 kB | | shell-quote | 1.8.3 | 23.74 kB | 23.74 kB | | limiter | 1.1.5 | 23.17 kB | 23.17 kB | | retry | 0.13.1 | 18.85 kB | 18.85 kB | | semifies | 1.0.0 | 15.84 kB | 15.84 kB | | jest-docblock | 29.7.0 | 8.99 kB | 12.76 kB | | crypto-randomuuid | 1.0.0 | 11.18 kB | 11.18 kB | | ttl-set | 1.0.0 | 4.61 kB | 9.69 kB | | mutexify | 1.4.0 | 5.71 kB | 8.74 kB | | path-to-regexp | 0.1.12 | 6.6 kB | 6.6 kB | | koalas | 1.0.2 | 6.47 kB | 6.47 kB | | module-details-from-path | 1.0.4 | 3.96 kB | 3.96 kB |🤖 This report was automatically generated by heaviest-objects-in-the-universe |
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## master #6233 +/- ##
==========================================
- Coverage 83.28% 82.80% -0.49%
==========================================
Files 478 479 +1
Lines 19790 19971 +181
==========================================
+ Hits 16482 16536 +54
- Misses 3308 3435 +127 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
BenchmarksBenchmark execution time: 2025-08-11 15:38:03 Comparing candidate commit b056b6a in PR branch Found 0 performance improvements and 0 performance regressions! Performance is the same for 1271 metrics, 52 unstable metrics. |
const spanTags = { | ||
component: 'google-cloud-pubsub', | ||
'span.kind': 'consumer', | ||
'gcloud.project_id': projectId || 'unknown', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for my eduction: is this || unknown
a standard pattern for span tags? how do we decide between setting the tag to "unknown" vs not setting the tag at all?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't usually add a fallback when a value is not available. Leaving it undefined will simply not add the tag.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed unknown
from the tags
function parseCloudEventMessage (json, req) { | ||
// Eventarc only uses Binary Content Mode with ce-specversion header | ||
// Payload structure: {"message": {...}, "subscription": "..."} | ||
const message = json.message || json | ||
const attrs = { ...message?.attributes } | ||
const subscription = json.subscription || req.headers['ce-subscription'] || 'cloud-event-subscription' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This server.js file
is a generic HTTP server instrumentation, so we shouldn't be putting Pub/Sub or Eventarc-specific logic directly in this file. Can you move that logic into a separate file/module?
Maybe you can do something like this (this code was suggested by ChatGPT):
addHook({ name: 'http' }, http => {
shimmer.wrap(http.Server.prototype, 'emit', wrapEmit)
if (isGCPEnvironment()) {
require('./pubsub').patch(http)
}
return http
})
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't do this approach because then both files are hooking the same module:
http/server.js: addHook({ name: httpNames }, http => { shimmer.wrap(http.Server.prototype, 'emit', wrapEmit) })
gcp-pubsub-push.js: addHook({ name: ['http', 'node:http'] }, (http) => { shimmer.wrap(http.Server.prototype, 'emit', wrapEmitForGcp) })
Both hooks would execute and try to wrap emit on the same prototype. The second wrap overwrites the first and replaces the method, so only the last loaded hook would work. Instead of creating another addHook({ name: 'http' }, http => {
I can move the logic to another file that will still be referenced in server.js
addHook this will allow the wrapEmitForGcp
to extend the other shimmer wrapEmit
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes sense. We could still add the hook in http/server.js
, but delegate as much GCP specific logic as we can to a helper function in a separate file. Might be worth reaching out to #guild-dd-node-js to see if someone there could help with best practices.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I reached out in slack (apm-serverless too) for a review and http/server.js
has a hook already so i just added my code to extend that hook
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The way things typically work is that each integration is split into 2 components:
- The instrumentation, which is responsible for patching and collecting data, which it can then publish on channels. This part should be generic and contain no code specific to Datadog. I like to think of this as something that should be exposed by the library itself so that we wouldn't have to patch it in the first place, and Datadog specific code would never exist in the library itself (or a built-in module in this case)
- Plugins, which can then subscribe to the channels exposed by the instrumentation and handle Datadog specific things.
Usually there is only a single instrumentation since its responsibility is generic, and there can be multiple plugins (for example per product, or sometimes hierarchical for more complex integrations). So here, any information not already collected by the http
instrumentation should be added but only that. Any actual logic should be moved to a new plugin that handles everything else.
@@ -26,6 +26,10 @@ addHook({ | |||
versions: ['>=1.4.0 <1.20.0'] | |||
}, read => { | |||
return shimmer.wrapFunction(read, read => function (req, res, next) { | |||
// Skip body parsing if PubSub/Cloud Event middleware already parsed it | |||
if (req._pubsubBodyParsed) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this really need to know about PubSub specifically? Wouldn't generically checking if req.body
exists be enough?
function parseCloudEventMessage (json, req) { | ||
// Eventarc only uses Binary Content Mode with ce-specversion header | ||
// Payload structure: {"message": {...}, "subscription": "..."} | ||
const message = json.message || json | ||
const attrs = { ...message?.attributes } | ||
const subscription = json.subscription || req.headers['ce-subscription'] || 'cloud-event-subscription' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The way things typically work is that each integration is split into 2 components:
- The instrumentation, which is responsible for patching and collecting data, which it can then publish on channels. This part should be generic and contain no code specific to Datadog. I like to think of this as something that should be exposed by the library itself so that we wouldn't have to patch it in the first place, and Datadog specific code would never exist in the library itself (or a built-in module in this case)
- Plugins, which can then subscribe to the channels exposed by the instrumentation and handle Datadog specific things.
Usually there is only a single instrumentation since its responsibility is generic, and there can be multiple plugins (for example per product, or sometimes hierarchical for more complex integrations). So here, any information not already collected by the http
instrumentation should be added but only that. Any actual logic should be moved to a new plugin that handles everything else.
const spanTags = { | ||
component: 'google-cloud-pubsub', | ||
'span.kind': 'consumer', | ||
'gcloud.project_id': projectId || 'unknown', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't usually add a fallback when a value is not available. Leaving it undefined will simply not add the tag.
What does this PR do?
Extends the official PubSub plugin capabilities and optimizes Datadog tracing instrumentation for Google Cloud Pub/Sub push messages and Cloud Events across all Node.js web frameworks.
GCP sends push → Express server receives POST request
server.js detects → "This is a Pub/Sub push!" (via headers)
server.js creates → PubSub span with distributed tracing
express.js creates → Express span as child of PubSub span
main.js processes → Business logic with automatic tracing
This makes the spans have this hierarchy :
pubsub.receive (created by gcp-pubsub-push.js)
└── http.request (manually created as child)
└── express.request (inherits via scope activation)
└── express.middleware (inherits via scope activation)
└── your-business-logic (main.js)
Follow-up PR will be opened to create the synthetic span in the flame graph
Motivation
An inferred span for the HTTP push post to the Cloud Run service from a pub/sub topic
Example full Push Direct Span of a cloud run service triggering another service using a direct push subscription
Example full Eventarc pubsub trigger span of a cloud run service triggering another service using an Eventarc cloud event trigger
Plugin Checklist
Additional Notes
The changes had to be made in server.js because in Push Subscription, the Cloud Run Service receives HTTP Requests.
Additional information can be found in this doc