
Fix Observability: trace quality, cost reduction, and new metrics#1133

Open
nicacioliveira wants to merge 6 commits into main from fix/observability-improvements

Conversation

@nicacioliveira
Contributor

@nicacioliveira nicacioliveira commented Mar 20, 2026

Observability: trace quality, cost reduction, and new metrics

What was broken (and is now fixed)

The tracer was created via opentelemetry.trace.getTracer() (global API),
which returns a ProxyTracer that delegates to whatever TracerProvider is
currently global. Something overrides the global after our provider.register(),
causing all spans to bypass our span processor entirely — samplers, filters,
and BatchSpanProcessor were effectively ignored.

Fix: use provider.getTracer() directly so spans always go through our
NodeTracerProvider, regardless of global overrides.

Impact: the OTEL_SAMPLING_CONFIG in production (defaultRatio: 0.01)
was not being respected — the service was exporting far more traces than
configured. That is now fixed.
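The failure mode can be simulated without the real OTel API. The sketch below is illustrative only: a "global" provider slot that anyone can overwrite, a proxy-style lookup that resolves the global at call time (the ProxyTracer behavior), and a direct handle that stays pinned (the provider.getTracer() behavior).

```typescript
// Minimal simulation of the ProxyTracer pitfall (not the real OTel API).
interface Provider {
  name: string;
}

let globalProvider: Provider = { name: "ours" };

// Proxy-style lookup: resolves the global at call time, so a later
// override silently redirects every span away from our processors.
const proxyGetProvider = (): Provider => globalProvider;

// Direct handle: captured once, immune to later global overrides.
const ourProvider = globalProvider;
const directGetProvider = (): Provider => ourProvider;

// Something else registers its own provider after our register() call.
globalProvider = { name: "theirs" };

console.log(proxyGetProvider().name); // "theirs" — spans bypass our pipeline
console.log(directGetProvider().name); // "ours" — spans hit our processors
```

This is why the fix pins the tracer to our NodeTracerProvider instance instead of trusting the global registry.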


Trace quality: -64% span volume, zero noise

Added FilteringSpanProcessor that drops low-signal spans before export:

  • Framework internals: cache-match, router.ts, fresh.ts, Page.tsx,
    htmx.tsx, requestToParam.ts
  • Sub-5ms child spans
  • Internal loopback fetches (POST http://127.0.0.1/deco/render?...)
  • Error spans always kept regardless of name or duration
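The rules above could be sketched as a single keep/drop predicate. The names and the 5ms threshold mirror the PR text; the span shape and function name are assumptions, not the actual FilteringSpanProcessor code.

```typescript
// Hypothetical keep/drop decision for span export.
interface SpanLike {
  name: string;
  durationMs: number;
  isRoot: boolean;
  hasError: boolean;
}

const NOISE_NAMES = new Set([
  "cache-match",
  "router.ts",
  "fresh.ts",
  "Page.tsx",
  "htmx.tsx",
  "requestToParam.ts",
]);
const LOOPBACK_RE = /^(GET|POST) https?:\/\/(127\.0\.0\.1|localhost)/;
const MIN_CHILD_MS = 5;

function shouldExport(span: SpanLike): boolean {
  if (span.hasError) return true; // errors always kept
  if (NOISE_NAMES.has(span.name)) return false; // framework internals
  if (LOOPBACK_RE.test(span.name)) return false; // internal loopback fetches
  if (!span.isRoot && span.durationMs < MIN_CHILD_MS) return false; // fast children
  return true;
}
```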

Before vs after — same PDP request:

  • Spans per trace: ~50 → ~23
  • Noise spans: cache-match ×4, requestToParam ×6, router, fresh, Page.tsx, htmx → none
  • Useful spans: buried → all of them

Head sampler improvements

URLBasedSampler now drops at span creation time (zero memory, zero CPU):

  • Bots: cf-verified-bot: true header and UA regex (crawler, spider,
    monitoring, uptimerobot, etc.) → NOT_RECORD
  • Built-in exclusions: /_frsh/, _liveness, favicons, static assets
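A rough sketch of that head decision, assuming the regex and exclusion list paraphrased above (the real URLBasedSampler may differ in detail):

```typescript
// Hedged sketch of head sampling: bots and infrastructure paths get
// NOT_RECORD at span creation time, everything else falls through to
// ratio-based sampling.
const BOT_UA_RE = /crawler|spider|monitoring|uptimerobot/i;
const EXCLUDED_PATH_RE = /^\/_frsh\/|_liveness|favicon|\.(css|js|png|svg|ico)$/;

type Decision = "NOT_RECORD" | "SAMPLE_BY_RATIO";

function headDecision(headers: Map<string, string>, path: string): Decision {
  if (headers.get("cf-verified-bot") === "true") return "NOT_RECORD";
  const ua = headers.get("user-agent") ?? "";
  if (BOT_UA_RE.test(ua)) return "NOT_RECORD";
  if (EXCLUDED_PATH_RE.test(path)) return "NOT_RECORD";
  return "SAMPLE_BY_RATIO"; // defer to defaultRatio
}
```

NOT_RECORD means the span object is never created, so filtered traffic costs neither memory nor export bandwidth.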

New metrics

  • resolver_latency (histogram): loader p95 latency; was gated behind a flag, now always on
  • lru.fill_ratio (gauge): cache fullness (0–1); above 0.9 signals eviction pressure or a bot flood
  • lru.hits_total / lru.misses_total (counters): LRU hit rate at the cache layer
  • lru.evictions_total (counter): spikes signal cache churn (bot attacks)
  • lru.size_bytes / lru.item_count (gauges): current cache state
  • deno.memory_usage (gauge): RSS, heap used/total, external (was a no-op due to a circular-dependency bug)
  • redis.errors_total / redis.reconnections_total (counters): Redis health signals
  • redis.connected (gauge): 1/0 live connection state
  • instance_startup_duration_ms (histogram): cold-start time per isolate
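The computation behind a gauge like lru.fill_ratio is simple enough to state directly. This is an illustrative sketch, not the PR's code; registering it as an OTel observable-gauge callback is omitted.

```typescript
// Illustrative fill-ratio computation: observed size over capacity,
// clamped to [0, 1] so over-capacity transients don't exceed the scale.
function fillRatio(calculatedSizeBytes: number, maxSizeBytes: number): number {
  if (maxSizeBytes <= 0) return 0; // unbounded or misconfigured cache
  return Math.min(1, calculatedSizeBytes / maxSizeBytes);
}

// The PR treats anything above ~0.9 as eviction pressure / bot flood.
const underPressure = (ratio: number): boolean => ratio > 0.9;
```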

Bug fixes

  • LRU async dispose: lru-cache does not await async dispose callbacks.
    The previous async dispose caused silent unhandled promise rejections when
    the backing cache delete failed. Now synchronous with explicit .catch().
  • Redis silent errors: all three operations (match, put, delete) were
    swallowing errors with empty .catch(() => {}). Now logged via logger.warn.
  • cache-match orphaned span: was created without a parent context,
    appearing as a root span in traces. Fixed with context.active().
  • Circular dependency: config.ts → metrics.ts → config.ts caused a
    "resource before initialization" crash. Fixed by extracting resource.ts.
  • deno.memory_usage no-op: InstrumentationBase.setMeterProvider calls
    _updateMetricInstruments() (no-op) instead of enable(), so metrics were
    created on the global no-op meter. Rewritten to use meter directly.
  • Deno 2 double instrumentation: guard added — if OTEL_DENO=true, skip
    provider.register() to avoid duplicate spans, logs, and metrics.
  • Trace exporter protocol: switched from otlp-proto to otlp-http, because
    many collectors (including HyperDX) reject protobuf on port 4318.
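The LRU dispose fix deserves a sketch, since the failure is subtle: lru-cache invokes dispose synchronously and does not await a returned promise, so an `async` dispose that throws becomes an unhandled rejection. The names below (backingDelete, the logger) are stand-ins, not the PR's actual identifiers.

```typescript
// Sketch of the dispose fix: keep dispose synchronous and attach an
// explicit .catch() so a failed backing delete is logged, not lost.
const warnings: string[] = [];
const logger = { warn: (msg: string) => warnings.push(msg) };

function backingDelete(key: string): Promise<void> {
  return key === "bad"
    ? Promise.reject(new Error("delete failed"))
    : Promise.resolve();
}

// Before (buggy): `async dispose(key) { await backingDelete(key); }`
// — the caller never awaits, so a rejection is unhandled.
// After: fire-and-forget with an explicit error handler.
function dispose(key: string): void {
  backingDelete(key).catch((err) => {
    logger.warn(`cache delete error for ${key}: ${err}`);
  });
}
```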

What was already good

The OTel stack foundation was solid: ParentBasedSampler wrapping custom
samplers, BatchSpanProcessor, OTLP export for all three signals
(traces/metrics/logs), semantic conventions on resource attributes,
URL-based sampling config, and the debug sampler for x-trace-debug-id.
Those were left untouched.


Summary by cubic

Improves observability by cutting noisy spans, enforcing sampling, and adding key metrics. Fixes tracer routing/export, reduces log noise, and hardens caches and Redis.

  • New Features

    • FilteringSpanProcessor: drops framework noise, sub‑5ms child spans, and loopback fetches; keeps errors and slow spans; rate‑limits root spans.
    • Head sampler: bot detection and built‑in exclusions for /_frsh/, liveness, favicons, and static assets.
    • Metrics: resolver_latency (always on), instance_startup_duration_ms, deno.memory_usage, deno.open_resources, lru.* (evictions, size, items, fill_ratio, hits, misses), redis.errors_total, redis.reconnections_total, redis.connected.
  • Bug Fixes

    • Tracing now uses provider.getTracer() so spans hit our processors; respects OTEL_SAMPLING_CONFIG. Guard against Deno 2 native OTel (OTEL_DENO=true) and flush on shutdown.
    • DebugSampler only forces when x-trace-debug-id is set or debug mode is enabled.
    • Switched to @opentelemetry/exporter-trace-otlp-http; FetchInstrumentation ignores 127.0.0.1/localhost; reduced log noise with detectResources:false.
    • Loader metrics: always record resolver_latency; use loader file path when resolverId is obj; set status=bypass when revision ID is missing to avoid undefined status.
    • Fixed orphaned cache-match spans by attaching to context.active().
    • Broke circular dep via resource.ts; wired meter provider; Deno runtime metrics now use the meter directly.
    • LRU: dispose is sync with error logging; lru.evictions_total counts only real evictions.
    • Redis: match/put/delete now log and increment redis.errors_total on command errors; redis.connected uses a Set to track clients accurately; reconnections tracked.
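The maxExportPerSecond rate limiting mentioned above is a token bucket. A minimal sketch, assuming a full refill each second (the PR's refill policy may differ):

```typescript
// Hypothetical token bucket for root-span export: each export consumes
// one token; tokens refill to the per-second budget once a second.
class TokenBucket {
  private tokens: number;
  private lastRefillMs: number;

  constructor(private maxPerSecond: number, nowMs: number = Date.now()) {
    this.tokens = maxPerSecond;
    this.lastRefillMs = nowMs;
  }

  tryConsume(nowMs: number = Date.now()): boolean {
    if (nowMs - this.lastRefillMs >= 1000) {
      this.tokens = this.maxPerSecond; // full refill each second
      this.lastRefillMs = nowMs;
    }
    if (this.tokens > 0) {
      this.tokens--;
      return true; // within budget: export the span
    }
    return false; // over budget: drop instead of exporting
  }
}
```

Under a bot flood this caps export cost at maxPerSecond root spans regardless of request volume.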

Written for commit 36f17b8. Summary will update on new commits.

Summary by CodeRabbit

  • New Features

    • Improved observability: instance startup timing, LRU/Redis cache metrics, runtime resource metrics, and resolver/loader metrics (including consistent latency recording).
    • Smarter trace handling: span filtering, rate-limiting, and URL/bot-aware sampling; expanded sampling options.
  • Bug Fixes

    • Explicit cache bypass path when revision ID is unavailable and corresponding cache metric updates.
  • Chores

    • Reworked OpenTelemetry config/registration and switched OTLP HTTP exporter; various telemetry reliability updates.

decobot and others added 3 commits March 19, 2026 23:15
## Fixes
- fix(otel): use HTTP exporter for traces — proto was rejected by collectors
- fix(otel): flush traces and metrics on process shutdown (data loss on restart)
- fix(otel): guard against Deno 2 native OTel double instrumentation
- fix(otel): break circular dep between config.ts and metrics.ts
- fix(otel): reduce log noise — detectResources:false suppresses K_REVISION warns
- fix(metrics): always record resolver_latency, remove OTEL_ENABLE_EXTRA_METRICS gate
- fix(metrics): use loader file path as fallback when resolverId is 'obj'
- fix(metrics): wire meterProvider into registerInstrumentations (deno metrics were no-op)
- fix(trace): attach cache-match span to active request context (was orphaned root span)
- fix(cache): log Redis errors instead of silently swallowing them
- fix(cache): LRU async dispose bug — lru-cache does not await async callbacks

## New metrics
- feat(metrics): LRU cache observability — lru.evictions_total, lru.size_bytes,
  lru.item_count, lru.fill_ratio, lru.hits_total, lru.misses_total
- feat(metrics): Deno runtime memory — deno.memory_usage (rss/heap/external),
  deno.open_resources by type

## Trace quality & cost reduction
- feat(trace): FilteringSpanProcessor — drops framework noise (cache-match,
  router, fresh, Page.tsx, htmx) and sub-5ms child spans; keeps errors and
  slow spans; token-bucket rate limiter (maxExportPerSecond) prevents export
  explosion under bot traffic
- feat(sampler): bot detection in head sampler — cf-verified-bot header and
  UA regex; bots get NOT_RECORD so zero spans are created, zero cost
- feat(sampler): built-in exclusions for /_frsh/, _liveness, favicons, static
  assets — no spans created for infrastructure noise

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
… noise

- redis.errors_total: counter on connection errors
- redis.reconnections_total: counter on reconnect attempts
- redis.connected: gauge 1/0 tracking live connection state
- instance_startup_duration_ms: histogram measuring time from instance
  start to readyAt — key signal for slow cold start diagnosis
- Remove logger.warn for missing K_REVISION — fires on every cacheable
  request in non-Cloud Run envs, was pure noise with no action to take

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
… directly

Root cause: opentelemetry.trace.getTracer returns a ProxyTracer that
delegates to the *current* global TracerProvider. Another provider was
overriding the global after our provider.register(), causing all spans
to bypass FilteringSpanProcessor entirely.

Fix: use provider.getTracer("deco-tracer") directly when OTel is
enabled, bypassing the global API. FilteringSpanProcessor.onStart/onEnd
are now called for every span.

Also filter loopback fetch spans (POST/GET http://127.0.0.1 or localhost)
which FetchInstrumentation creates for internal /deco/render calls —
these produce high-cardinality span names and have no signal value.

Verified: cache-match, router.ts, fresh.ts, htmx.tsx, Page.tsx no longer
appear in exported traces.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
@github-actions
Contributor

Tagging Options

Should a new tag be published when this PR is merged?

  • 👍 for Patch 1.180.1 update
  • 🎉 for Minor 1.181.0 update
  • 🚀 for Major 2.0.0 update

@coderabbitai

coderabbitai bot commented Mar 20, 2026

📝 Walkthrough

Walkthrough

Expanded OpenTelemetry integration, added span filtering and tail-sampling controls, moved Deno runtime metrics to module-level callbacks, always record resolver latency, record manifest startup duration, and added OTEL metrics and error handling for LRU and Redis caches.

Changes

Cohort / File(s) Summary
Loader & Manifest
blocks/loader.ts, engine/manifest/manifest.ts
Always record resolver latency; wrapLoader accepts optional loaderKey and prefers ctx.resolverId (with "obj" guard) / loaderKey when deriving loader id; when revisionID is missing set status to bypass, increment cache counter, and return early; record instance_startup_duration_ms using startedAt → readyAt.
OpenTelemetry Core & Exports
deps.ts, observability/otel/config.ts, observability/otel/resource.ts, observability/otel/metrics.ts
Switched OTLP trace exporter to HTTP package export; moved resource/OTEL enablement to resource.ts; conditional diagnostics and Deno-native tracer handling; meterProvider is exported and shutdown on unload; sampling options centralized and passed into processors.
Span Processing
observability/otel/processors/filtering.ts
Added FilteringSpanProcessor wrapper that conditionally forwards spans: keeps errors, drops known framework noise and loopback fetches for root spans, applies token-bucket rate limit for root-span exports, and enforces duration-based rules for non-root spans.
Samplers & Debug
observability/otel/samplers/urlBased.ts, observability/otel/samplers/debug.ts
SamplingOptions extended for tail-sampling (alwaysExportErrors, slowThresholdMs, exportRatio, maxExportPerSecond); URLBasedSampler adds ALWAYS_EXCLUDE patterns and bot detection (header + UA) before sampling; DebugSampler requires state.debugEnabled to force RECORD_AND_SAMPLED via state.correlationId.
Deno Runtime Instrumentation
observability/otel/instrumentation/deno-runtime.ts
Replaced class-based instrumentation with module-level observable instruments and immediate callbacks for deno.memory_usage and deno.open_resources; retained an empty exported DenoRuntimeInstrumentation class for compatibility.
Cache Instrumentation (LRU)
runtime/caches/lrucache.ts, runtime/caches/common.ts
Added OTEL counters/gauges (hits/misses/evictions, size, item count, fill ratio), per-cache registry lruInstances, gauge callbacks, and non-blocking dispose behavior with eviction counting; cache match spans now start with context.active().
Cache Instrumentation (Redis)
runtime/caches/redis.ts
Added Redis metrics (errors, reconnections, connected gauge) backed by connectedClientSet; client event handlers update connectivity and counters; cache ops now log warnings and increment error counters on command failures instead of silently swallowing.
Filtering & Sampling Support Files
observability/otel/samplers/..., observability/otel/processors/filtering.ts
New filtering and sampling behaviors integrated across samplers/processors and config to apply export filtering and rate controls.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Suggested reviewers

  • guitavano
  • hugo-ccabral

Poem

🐰 I hop through spans and nibble trees of logs,

I count the caches, dodge the noisy frogs,
Filters tidy trails where traces play,
Redis hums and LRU saves the day,
A carrot bloom of metrics — hop hooray! 🥕

🚥 Pre-merge checks: 2 passed, 1 warning

❌ Failed checks (1 warning)

  • Docstring Coverage ⚠️ Warning: docstring coverage is 0.00%, below the required 80.00% threshold. Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

  • Description Check ✅ Passed: check skipped because CodeRabbit's high-level summary is enabled.
  • Title Check ✅ Passed: the title accurately reflects the main changes: observability improvements including trace-quality fixes, cost reduction through span filtering, and new metrics.


state.correlationId is always set (a UUID is generated per request), so the
?? fallback made DebugSampler return RECORD_AND_SAMPLED for every request,
completely nullifying the defaultRatio in URLBasedSampler.

In production with defaultRatio:0.01, all requests were being sampled —
this was masked before because the sampler itself was bypassed (the global
TracerProvider override bug, fixed in the previous commit).

Now only force RECORD_AND_SAMPLED when:
- x-trace-debug-id header is explicitly set (programmatic debug)
- state.debugEnabled is true (?__d= param or debug cookie)

The ?__d and x-trace-debug-id debug workflows are unchanged.
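The bug and fix can be simulated in a few lines. The field names below are illustrative; only the ?? behavior is the point.

```typescript
// Simulation of the DebugSampler bug: correlationId is always set, so
// `debugId ?? correlationId` was always truthy and forced sampling.
interface RequestState {
  correlationId: string; // always a fresh UUID
  debugId?: string; // x-trace-debug-id header, if sent
  debugEnabled: boolean; // ?__d= param or debug cookie
}

// Before: forced RECORD_AND_SAMPLED for every request,
// nullifying defaultRatio.
const buggyForcesSampling = (s: RequestState): boolean =>
  Boolean(s.debugId ?? s.correlationId);

// After: forced only on an explicit debug signal.
const fixedForcesSampling = (s: RequestState): boolean =>
  s.debugId !== undefined || s.debugEnabled;
```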

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Contributor

@cubic-dev-ai cubic-dev-ai bot left a comment


2 issues found across 12 files

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="runtime/caches/redis.ts">

<violation number="1" location="runtime/caches/redis.ts:135">
P2: Command errors are logged but never increment `redis.errors_total`, so the new error metric underreports failures.</violation>
</file>

<file name="runtime/caches/lrucache.ts">

<violation number="1" location="runtime/caches/lrucache.ts:95">
P2: `lru.evictions_total` is overcounted because `dispose` increments it for every removal reason, not just actual evictions.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 9

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
runtime/caches/redis.ts (1)

181-190: ⚠️ Potential issue | 🟡 Minor

Fix formatting to pass CI.

The pipeline reports deno fmt formatting differences around lines 184-185. Run deno fmt to fix.

#!/bin/bash
# Check the exact formatting issue
deno fmt --check runtime/caches/redis.ts 2>&1 || true
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@runtime/caches/redis.ts` around lines 181 - 190, The formatting in the
serialize promise chain around serialize(...) -> waitOrReject(...) is off and
failing deno fmt; run deno fmt to reformat runtime/caches/redis.ts (or manually
adjust formatting of the serialize(...).then(...).catch(...) block) so the arrow
function and chained calls align with Deno's formatter, ensuring the call to
waitOrReject<string | null>(() => redis?.set(cacheKey, data, "EX", TTL) ??
Promise.resolve(null), COMMAND_TIMEOUT) is properly indented and the catch((err)
=> { logger.warn(`redis cache put error: ${err}`); }) block is formatted; then
commit the formatted file.
🧹 Nitpick comments (4)
runtime/caches/redis.ts (1)

201-204: Consider logging the error details for debugging.

The error handler increments the counter but discards the actual error. Logging the error message would aid debugging Redis connection issues.

♻️ Suggested change to log error details
-      redis.on("error", () => {
+      redis.on("error", (err) => {
         redisConnectionState = 0;
         redisErrors.add(1);
+        logger.warn(`redis connection error: ${err}`);
       });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@runtime/caches/redis.ts` around lines 201 - 204, The current
redis.on("error", ...) handler drops the error object; update the handler to
accept the error parameter and log its details while preserving the existing
behavior (setting redisConnectionState = 0 and redisErrors.add(1)). In other
words, change the callback at redis.on("error", ...) to function (err) { /* log
err (use existing logger like processLogger.error or console.error) */;
redisConnectionState = 0; redisErrors.add(1); } so the actual Redis error
message is recorded for debugging.
observability/otel/instrumentation/deno-runtime.ts (2)

17-21: Consider using ValueType.INT for resource counts.

deno.open_resources represents a discrete count of resources. Using ValueType.DOUBLE works but ValueType.INT would be semantically more accurate for integer counts.

♻️ Proposed fix
 const openResources: ObservableUpDownCounter<Attributes> = meter
   .createObservableUpDownCounter("deno.open_resources", {
-    valueType: ValueType.DOUBLE,
+    valueType: ValueType.INT,
     description: "Number of open resources of a particular type.",
   });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@observability/otel/instrumentation/deno-runtime.ts` around lines 17 - 21, The
metric openResources created via
meter.createObservableUpDownCounter("deno.open_resources", ...) uses
ValueType.DOUBLE but represents an integer count; change the ValueType from
ValueType.DOUBLE to ValueType.INT in that createObservableUpDownCounter call so
deno.open_resources is recorded as an integer metric (update the ValueType enum
reference in the openResources initialization).

10-51: Metrics are registered unconditionally at module load.

The observable gauges and their callbacks are registered at module scope regardless of whether OpenTelemetry export is enabled (OTEL_IS_ENABLED). While this likely has minimal overhead when no metric reader is attached, consider guarding registration if this module is imported in environments where OTEL is disabled.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@observability/otel/instrumentation/deno-runtime.ts` around lines 10 - 51, The
module currently creates and registers metrics unconditionally (memoryUsage,
openResources and their callbacks gatherMemoryUsage, gatherOpenResources); wrap
the metric creation and the memoryUsage.addCallback(...) and
openResources.addCallback(...) calls in a guard that checks OTEL_IS_ENABLED (or
move that registration into an exported init/register function invoked only when
OTEL is enabled) so these ObservableGauge/ObservableUpDownCounter instruments
and callbacks are only created/registered when OTEL_IS_ENABLED is true.
engine/manifest/manifest.ts (1)

289-297: Move histogram creation outside the finally block.

Creating the histogram instrument inside the finally callback means a new instrument is registered on every context initialization. OpenTelemetry histograms should be created once at module scope and reused. While the SDK may deduplicate by name, this adds unnecessary overhead and is not idiomatic.

♻️ Proposed fix

Move the histogram creation to module scope:

// At module scope (e.g., near other imports)
const instanceStartupHistogram = meter.createHistogram("instance_startup_duration_ms", {
  description: "Time from instance start to first request readiness.",
  unit: "ms",
  valueType: ValueType.DOUBLE,
});

Then in the finally block:

  ctx.runtime = runtimePromise.finally(() => {
    const readyAt = new Date();
    ctx.instance.readyAt = readyAt;
-   meter.createHistogram("instance_startup_duration_ms", {
-     description: "Time from instance start to first request readiness.",
-     unit: "ms",
-     valueType: ValueType.DOUBLE,
-   }).record(readyAt.getTime() - startedAt.getTime());
+   instanceStartupHistogram.record(readyAt.getTime() - startedAt.getTime());
  });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@engine/manifest/manifest.ts` around lines 289 - 297, The histogram is being
created inside the runtimePromise.finally callback which registers a new
instrument on every init; move the meter.createHistogram call to module scope
(e.g., define const instanceStartupHistogram = meter.createHistogram(...) near
the top-level alongside other meter setup) and replace the in-block
createHistogram().record(...) with
instanceStartupHistogram.record(readyAt.getTime() - startedAt.getTime()) inside
the runtimePromise.finally where ctx.instance.readyAt is set; keep the
description/unit/valueType the same so the instrument is reused rather than
recreated.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@blocks/loader.ts`:
- Around line 339-342: Initialize the local variable status to a default like
"unknown" and ensure it is explicitly set before any early return (e.g., the
!revisionID branch) and inside a catch handler, so the finally block's
stats.latency.record(performance.now() - start, { loader, status }) and
ctx.monitoring?.currentSpan?.setDesc(status) never receive undefined; add a
try/catch around the main logic to assign a failure status on exceptions and
update the early-return paths to set the appropriate status value before
returning.

In `@observability/otel/config.ts`:
- Around line 109-131: Wrap the metrics initialization and its unload listener
in the same DENO_OTEL_ACTIVE guard used for tracing: change the block that
currently checks only OTEL_IS_ENABLED to require OTEL_IS_ENABLED &&
!DENO_OTEL_ACTIVE, and ensure DENO_OTEL_ACTIVE is available in that module
(either by computing Deno.env.get("OTEL_DENO") === "true" or importing the
constant from the tracing config). Specifically update the metrics
initialization block that registers the metric exporter / meter provider and the
addEventListener("unload", ...) so it runs only when OTEL_IS_ENABLED &&
!DENO_OTEL_ACTIVE.

In `@observability/otel/processors/filtering.ts`:
- Around line 19-22: The isLoopbackFetch arrow function's regex/test call is
misformatted for deno fmt; update the formatting so the regex literal and the
.test(name) call conform to deno's line-wrapping rules (e.g., place the regex
and .test(name) on the same line or wrap the expression with parentheses) in the
isLoopbackFetch implementation to eliminate the formatting CI error.
- Around line 76-94: The if-block formatting inside the shouldKeep function is
not formatted per deno fmt: reformat the isRoot(span) conditional block in
shouldKeep so its braces and inner statements follow the project's formatting
style (ensure the opening brace placement and the inner if statements for
ALWAYS_DROP_NAMES/isLoopbackFetch and the token consumption branch are each on
properly formatted lines), then run deno fmt; targets to edit: function
shouldKeep and the isRoot(span) conditional (references: shouldKeep, isRoot,
ALWAYS_DROP_NAMES, isLoopbackFetch, consumeToken).
- Around line 1-7: The import statements are misformatted and failing deno fmt;
consolidate and reformat the imports so types from the same module are grouped
and each import follows deno's style. Specifically, group ReadableSpan,
SpanProcessor, and Context into a single `import type { ReadableSpan,
SpanProcessor, Context } from "../../../deps.ts";` keep `SpanStatusCode` as a
regular import from "../../../deps.ts", and keep `SamplingOptions` imported as
`import type { SamplingOptions } from "../samplers/urlBased.ts";` then run `deno
fmt` to ensure CI passes; check the identifiers SpanStatusCode, ReadableSpan,
SpanProcessor, Context, and SamplingOptions when applying the change.

In `@observability/otel/resource.ts`:
- Around line 7-14: The version extraction in tryGetVersionOf is fragile for
scoped packages because splitting on "@" can leave path segments (e.g.,
"deco/apps/0.1.0/"); update tryGetVersionOf (which uses safeImportResolve) to
robustly extract the version by parsing the resolved string for the final path
segment or a semantic version pattern: after getting the resolved locator, take
the substring after the last "/" (or last "@") and trim any trailing slashes, or
match a semantic version regex (e.g., /\d+\.\d+\.\d+(-\S+)?/) and return that
match; keep the same try/catch and return undefined on failure.

In `@runtime/caches/lrucache.ts`:
- Around line 48-51: The description string for the observable gauge
lruFillRatioGauge (created via meter.createObservableGauge with
ValueType.DOUBLE) is misformatted for deno fmt; update the description to follow
project formatting conventions (e.g., wrap or reflow the string so it fits line
width, ensure consistent spacing and trailing comma placement) and run deno fmt
to verify; adjust the meter.createObservableGauge argument object so the keys
and values (valueType and description) are properly aligned and the description
text is a single well-formatted string.
- Around line 95-100: The CI failure is due to formatting around the dispose
handler's logger.warn call; update the dispose implementation (dispose: (_value:
boolean, key: string) => { ... }) to match deno fmt style by reformatting the
logger.warn invocation that logs the cache deletion error (the call referencing
logger.warn, cache.delete and cacheName) so the message string and the metadata
object are properly indented/line-broken (ensure trailing commas and closing
paren are placed as deno fmt expects) and then run deno fmt to confirm the
change.

In `@runtime/caches/redis.ts`:
- Around line 25-34: The module-level redisConnectionState used by
redisConnected.addCallback is overwritten by every Redis client created in
caches.open, so the gauge only reflects the last event; change to track state
per client (e.g., a Map keyed by client id/namespace set by caches.open) or
maintain an aggregate (e.g., any-connected or count-of-connected) and have the
redisConnected.addCallback observe that aggregate instead of the single
redisConnectionState; update the event handlers in caches.open that currently
assign redisConnectionState to update the per-client Map or aggregate and ensure
redisConnected.addCallback computes and observes the aggregated value.

---

Outside diff comments:
In `@runtime/caches/redis.ts`:
- Around line 181-190: The formatting in the serialize promise chain around
serialize(...) -> waitOrReject(...) is off and failing deno fmt; run deno fmt to
reformat runtime/caches/redis.ts (or manually adjust formatting of the
serialize(...).then(...).catch(...) block) so the arrow function and chained
calls align with Deno's formatter, ensuring the call to waitOrReject<string |
null>(() => redis?.set(cacheKey, data, "EX", TTL) ?? Promise.resolve(null),
COMMAND_TIMEOUT) is properly indented and the catch((err) => {
logger.warn(`redis cache put error: ${err}`); }) block is formatted; then commit
the formatted file.

---

Nitpick comments:
In `@engine/manifest/manifest.ts`:
- Around line 289-297: The histogram is being created inside the
runtimePromise.finally callback which registers a new instrument on every init;
move the meter.createHistogram call to module scope (e.g., define const
instanceStartupHistogram = meter.createHistogram(...) near the top-level
alongside other meter setup) and replace the in-block
createHistogram().record(...) with
instanceStartupHistogram.record(readyAt.getTime() - startedAt.getTime()) inside
the runtimePromise.finally where ctx.instance.readyAt is set; keep the
description/unit/valueType the same so the instrument is reused rather than
recreated.

In `@observability/otel/instrumentation/deno-runtime.ts`:
- Around line 17-21: The metric openResources created via
meter.createObservableUpDownCounter("deno.open_resources", ...) uses
ValueType.DOUBLE but represents an integer count; change the ValueType from
ValueType.DOUBLE to ValueType.INT in that createObservableUpDownCounter call so
deno.open_resources is recorded as an integer metric (update the ValueType enum
reference in the openResources initialization).
- Around line 10-51: The module currently creates and registers metrics
unconditionally (memoryUsage, openResources and their callbacks
gatherMemoryUsage, gatherOpenResources); wrap the metric creation and the
memoryUsage.addCallback(...) and openResources.addCallback(...) calls in a guard
that checks OTEL_IS_ENABLED (or move that registration into an exported
init/register function invoked only when OTEL is enabled) so these
ObservableGauge/ObservableUpDownCounter instruments and callbacks are only
created/registered when OTEL_IS_ENABLED is true.

In `@runtime/caches/redis.ts`:
- Around line 201-204: The current redis.on("error", ...) handler drops the
error object; update the handler to accept the error parameter and log its
details while preserving the existing behavior (setting redisConnectionState = 0
and redisErrors.add(1)). In other words, change the callback at
redis.on("error", ...) to function (err) { /* log err (use existing logger like
processLogger.error or console.error) */; redisConnectionState = 0;
redisErrors.add(1); } so the actual Redis error message is recorded for
debugging.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 0618fc5b-ff58-4d3b-971d-792074a12023

📥 Commits

Reviewing files that changed from the base of the PR and between ac02ba0 and 61ed616.

📒 Files selected for processing (12)
  • blocks/loader.ts
  • deps.ts
  • engine/manifest/manifest.ts
  • observability/otel/config.ts
  • observability/otel/instrumentation/deno-runtime.ts
  • observability/otel/metrics.ts
  • observability/otel/processors/filtering.ts
  • observability/otel/resource.ts
  • observability/otel/samplers/urlBased.ts
  • runtime/caches/common.ts
  • runtime/caches/lrucache.ts
  • runtime/caches/redis.ts

Comment on lines 339 to 342
  } finally {
    const dimension = { loader, status };
    if (OTEL_ENABLE_EXTRA_METRICS) {
      stats.latency.record(performance.now() - start, dimension);
    }
    stats.latency.record(performance.now() - start, dimension);
    ctx.monitoring?.currentSpan?.setDesc(status);
⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Scripts executed (repository: deco-cx/deco) to verify status assignments and latency emission context in blocks/loader.ts:

#!/bin/bash
rg -n -C3 'let status|status =|if \(!revisionID\)|stats\.latency\.record|setDesc\(status\)' blocks/loader.ts
rg -n -B 30 'let status:' blocks/loader.ts | head -50
rg -n -A 50 'let status:' blocks/loader.ts | tail -100
sed -n '223,343p' blocks/loader.ts


Avoid emitting resolver_latency with status: undefined.

The early return at line 271 (when !revisionID) does not assign a value to status before exiting, causing the finally block to unconditionally record latency metrics with status: undefined. Additionally, if any exception occurs before a status assignment, the same issue occurs since there is no catch handler.

Initialize status to a default value (e.g., "unknown") and explicitly set it before early returns and in a catch block to ensure all code paths emit consistent metric attributes.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@blocks/loader.ts` around lines 339 - 342, Initialize the local variable
status to a default like "unknown" and ensure it is explicitly set before any
early return (e.g., the !revisionID branch) and inside a catch handler, so the
finally block's stats.latency.record(performance.now() - start, { loader, status
}) and ctx.monitoring?.currentSpan?.setDesc(status) never receive undefined; add
a try/catch around the main logic to assign a failure status on exceptions and
update the early-return paths to set the appropriate status value before
returning.
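The suggested control flow can be sketched as follows (a simplified stand-in for the loader body; `runLoader`, `record`, and the status values other than "bypass" are illustrative, not the module's actual API):

```typescript
type Status = "success" | "error" | "bypass" | "unknown";

// Stand-in for stats.latency.record; collects emitted attributes.
const recorded: Array<{ status: Status }> = [];
const record = (attrs: { status: Status }) => recorded.push(attrs);

function runLoader(revisionID: string | undefined, fail = false): void {
  let status: Status = "unknown"; // default so `finally` never sees undefined
  try {
    if (!revisionID) {
      status = "bypass"; // set explicitly before the early return
      return;
    }
    if (fail) throw new Error("loader failed");
    status = "success";
  } catch {
    status = "error"; // catch handler assigns a failure status
  } finally {
    record({ status }); // every code path emits a defined status
  }
}

runLoader(undefined);    // early return  -> "bypass"
runLoader("rev1");       // happy path    -> "success"
runLoader("rev1", true); // thrown error  -> "error"
```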

Comment on lines +109 to 131
// Deno 2.2+ has built-in OTel support via OTEL_DENO=true.
// If enabled, it sets its own global TracerProvider and instruments fetch/console,
// which would conflict with our manual setup (double spans, double logs).
// Skip provider.register() when Deno's native OTel is active.
const DENO_OTEL_ACTIVE = Deno.env.get("OTEL_DENO") === "true";

if (OTEL_IS_ENABLED && !DENO_OTEL_ACTIVE) {
  const traceExporter = new OTLPTraceExporter();
  // @ts-ignore: no idea why this is failing, but it should work
  provider.addSpanProcessor(new BatchSpanProcessor(traceExporter));
  provider.addSpanProcessor(
    new FilteringSpanProcessor(
      // @ts-ignore: no idea why this is failing, but it should work
      new BatchSpanProcessor(traceExporter),
      samplingOptions,
    ),
  );

  provider.register();

  addEventListener("unload", () => {
    provider.shutdown().catch(() => {});
  });
}
⚠️ Potential issue | 🟡 Minor

Inconsistent DENO_OTEL_ACTIVE guard between config.ts and metrics.ts.

The tracer provider registration correctly skips when DENO_OTEL_ACTIVE is true, but metrics.ts (lines 45-60 per context snippet) registers its own unload listener unconditionally when OTEL_IS_ENABLED without checking DENO_OTEL_ACTIVE. This could cause conflicts when Deno's native OTel is active.

Consider applying the same guard in metrics.ts for consistency:

// In metrics.ts
if (OTEL_IS_ENABLED && !DENO_OTEL_ACTIVE) {
  // ... metric exporter setup and unload listener
}
#!/bin/bash
# Check if metrics.ts has DENO_OTEL_ACTIVE guard
rg -n "DENO_OTEL_ACTIVE|OTEL_DENO" observability/otel/metrics.ts
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@observability/otel/config.ts` around lines 109 - 131, Wrap the metrics
initialization and its unload listener in the same DENO_OTEL_ACTIVE guard used
for tracing: change the block that currently checks only OTEL_IS_ENABLED to
require OTEL_IS_ENABLED && !DENO_OTEL_ACTIVE, and ensure DENO_OTEL_ACTIVE is
available in that module (either by computing Deno.env.get("OTEL_DENO") ===
"true" or importing the constant from the tracing config). Specifically update
the metrics initialization block that registers the metric exporter / meter
provider and the addEventListener("unload", ...) so it runs only when
OTEL_IS_ENABLED && !DENO_OTEL_ACTIVE.
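The shared guard can be factored into a pure predicate so config.ts and metrics.ts stay consistent (a sketch; shouldRegisterOtel is a hypothetical name, not an existing export):

```typescript
// Returns true only when manual OTel setup should run: OTel is enabled
// and Deno's native exporter (OTEL_DENO=true) is not already active.
function shouldRegisterOtel(
  otelIsEnabled: boolean,
  otelDenoEnv: string | undefined,
): boolean {
  const denoOtelActive = otelDenoEnv === "true";
  return otelIsEnabled && !denoOtelActive;
}

// Both config.ts (tracing) and metrics.ts (metric exporter and its
// "unload" listener) would call the same predicate, e.g. with
// Deno.env.get("OTEL_DENO") as the second argument.
```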

Comment on lines +1 to +7
import { SpanStatusCode } from "../../../deps.ts";
import type {
  ReadableSpan,
  SpanProcessor,
} from "../../../deps.ts";
import type { Context } from "../../../deps.ts";
import type { SamplingOptions } from "../samplers/urlBased.ts";
⚠️ Potential issue | 🟡 Minor

Fix import formatting to pass CI.

Pipeline reports deno fmt differences around lines 2-5 for import formatting.

♻️ Suggested formatting fix
 import { SpanStatusCode } from "../../../deps.ts";
-import type {
-  ReadableSpan,
-  SpanProcessor,
-} from "../../../deps.ts";
-import type { Context } from "../../../deps.ts";
+import type { Context, ReadableSpan, SpanProcessor } from "../../../deps.ts";
 import type { SamplingOptions } from "../samplers/urlBased.ts";
🧰 Tools
🪛 GitHub Actions: ci

[error] 2-5: deno fmt reported formatting differences in observability/otel/processors/filtering.ts (import formatting around lines 2-5).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@observability/otel/processors/filtering.ts` around lines 1 - 7, The import
statements are misformatted and failing deno fmt; consolidate and reformat the
imports so types from the same module are grouped and each import follows deno's
style. Specifically, group ReadableSpan, SpanProcessor, and Context into a
single `import type { ReadableSpan, SpanProcessor, Context } from
"../../../deps.ts";` keep `SpanStatusCode` as a regular import from
"../../../deps.ts", and keep `SamplingOptions` imported as `import type {
SamplingOptions } from "../samplers/urlBased.ts";` then run `deno fmt` to ensure
CI passes; check the identifiers SpanStatusCode, ReadableSpan, SpanProcessor,
Context, and SamplingOptions when applying the change.

Comment on lines +7 to +14
const tryGetVersionOf = (pkg: string) => {
  try {
    const [_, ver] = safeImportResolve(pkg).split("@");
    return ver.substring(0, ver.length - 1);
  } catch {
    return undefined;
  }
};
⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed (repository: deco-cx/deco):

#!/bin/bash
# Check what safeImportResolve returns for these packages
rg -n "safeImportResolve" --type ts -A2 -B2


🌐 Web query:

JSR package resolution URL format scoped packages @

💡 Result:

The URL format for resolving individual modules (files) from scoped JSR packages via the JSR registry API is: https://jsr.io/@<scope>/<package-name>/<version>/<path>. For example, the /main.ts file from package @luca/flag@1.0.0 is at https://jsr.io/@luca/flag/1.0.0/main.ts. Scoped packages always include the @scope/ prefix in the path. Package metadata is at https://jsr.io/@<scope>/<package-name>/meta.json. For npm compatibility, packages are mapped to @jsr/<scope>__<package> at https://npm.jsr.io, but direct resolution uses jsr.io. The jsr.json config file specifies the package name (e.g., "@scope/pkg"), version, and exports, but resolution follows the above URL pattern post-publishing.


Version extraction is fragile for scoped packages.

The version parsing at line 9 splits by @ and takes the second element. For scoped JSR packages (e.g., `jsr:@deco/apps/0.1.0/`), split("@") produces `["jsr:", "deco/apps/0.1.0/"]`, so `ver` becomes `"deco/apps/0.1.0/"` rather than just the version. The approach doesn't reliably extract only the version number.

Consider using a more robust method to extract the version after the final / (or @ if present):

🔧 Proposed fix
 const tryGetVersionOf = (pkg: string) => {
   try {
-    const [_, ver] = safeImportResolve(pkg).split("@");
-    return ver.substring(0, ver.length - 1);
+    const resolved = safeImportResolve(pkg);
+    const lastSlash = resolved.lastIndexOf("/");
+    if (lastSlash === -1) return undefined;
+    const ver = resolved.substring(lastSlash + 1);
+    return ver || undefined;
   } catch {
     return undefined;
   }
 };
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@observability/otel/resource.ts` around lines 7 - 14, The version extraction
in tryGetVersionOf is fragile for scoped packages because splitting on "@" can
leave path segments (e.g., "deco/apps/0.1.0/"); update tryGetVersionOf (which
uses safeImportResolve) to robustly extract the version by parsing the resolved
string for the final path segment or a semantic version pattern: after getting
the resolved locator, take the substring after the last "/" (or last "@") and
trim any trailing slashes, or match a semantic version regex (e.g.,
/\d+\.\d+\.\d+(-\S+)?/) and return that match; keep the same try/catch and
return undefined on failure.
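A sketch of the regex-based variant the prompt suggests (extractVersion is a hypothetical name, and the example locator strings are illustrative, not actual resolver output):

```typescript
// Extract a semantic version from a resolved package locator.
// Matching a semver pattern works for both npm-style ("...@1.2.3/")
// and JSR-style ("jsr:@scope/pkg/1.2.3/...") strings.
function extractVersion(resolved: string): string | undefined {
  const match = resolved.match(/\d+\.\d+\.\d+(?:-[\w.]+)?/);
  return match?.[0];
}

extractVersion("jsr:@deco/apps/0.1.0/"); // "0.1.0"
extractVersion("npm:preact@10.19.2/");   // "10.19.2"
extractVersion("file:///local/mod.ts");  // undefined
```

Matching a version pattern sidesteps the scoped-package ambiguity entirely, since it does not depend on where the "@" or "/" separators fall.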

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@observability/otel/samplers/debug.ts`:
- Around line 32-33: The current assignment to correlationId uses the nullish
coalescing operator so an empty string from req.headers.get("x-trace-debug-id")
will block fallback to state.debugEnabled; change the logic in debug.ts so
correlationId treats an empty header as absent (e.g., check for a truthy
non-empty header value from req.headers.get("x-trace-debug-id") before falling
back to state?.debugEnabled ? state?.correlationId : undefined) to ensure forced
debug sampling via state.debugEnabled isn't suppressed.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 5d65cbb9-8df4-4610-90ac-24c9353ce5ed

📥 Commits

Reviewing files that changed from the base of the PR and between 61ed616 and 3b6c792.

📒 Files selected for processing (3)
  • observability/otel/samplers/debug.ts
  • runtime/caches/lrucache.ts
  • runtime/caches/redis.ts
🚧 Files skipped from review as they are similar to previous changes (2)
  • runtime/caches/lrucache.ts
  • runtime/caches/redis.ts

Comment on lines 32 to +33
 const correlationId = req?.headers?.get?.("x-trace-debug-id") ??
-  state?.correlationId;
+  (state?.debugEnabled ? state?.correlationId : undefined);
⚠️ Potential issue | 🟡 Minor

Prevent empty debug header from suppressing forced debug sampling.

Line 32 currently uses ??, so an empty x-trace-debug-id value blocks fallback to state.debugEnabled and may skip intended forced sampling.

Suggested fix
-    const correlationId = req?.headers?.get?.("x-trace-debug-id") ??
-      (state?.debugEnabled ? state?.correlationId : undefined);
+    const debugHeader = req?.headers?.get?.("x-trace-debug-id")?.trim();
+    const correlationId = debugHeader ||
+      (state?.debugEnabled ? state?.correlationId : undefined);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@observability/otel/samplers/debug.ts` around lines 32 - 33, The current
assignment to correlationId uses the nullish coalescing operator so an empty
string from req.headers.get("x-trace-debug-id") will block fallback to
state.debugEnabled; change the logic in debug.ts so correlationId treats an
empty header as absent (e.g., check for a truthy non-empty header value from
req.headers.get("x-trace-debug-id") before falling back to state?.debugEnabled ?
state?.correlationId : undefined) to ensure forced debug sampling via
state.debugEnabled isn't suppressed.
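The difference between ?? and a truthiness check can be sketched as a pure function (pickCorrelationId is a hypothetical name mirroring the logic above):

```typescript
// With ??, an empty header string ("") would be kept, suppressing the
// state-based fallback; with a truthiness check, "" falls through.
function pickCorrelationId(
  header: string | null,
  debugEnabled: boolean,
  stateId: string | undefined,
): string | undefined {
  const trimmed = header?.trim();
  return trimmed || (debugEnabled ? stateId : undefined);
}

pickCorrelationId("abc-123", false, "state-id"); // "abc-123"
pickCorrelationId("", true, "state-id");         // "state-id" (empty header treated as absent)
pickCorrelationId(null, true, "state-id");       // "state-id"
pickCorrelationId(null, false, "state-id");      // undefined
```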

- loader.ts: set status="bypass" on !revisionID early return so finally
  block never emits resolver_latency with status:undefined
- redis.ts: increment redis.errors_total on command errors (delete/match/put);
  track connected clients via Set so redis.connected gauge is correct
  when multiple namespaces open separate clients
- lrucache.ts: only count evict reason in lru.evictions_total (not TTL/delete)
- filtering.ts, lrucache.ts: deno fmt

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@observability/otel/processors/filtering.ts`:
- Around line 51-61: The constructor currently trusts options.slowThresholdMs
and options.maxExportPerSecond directly which can be NaN, negative, or strings
from decoded JSON; validate and sanitize both before assigning to
this.slowThresholdMs, this.maxExportPerSecond, and this.tokens: coerce values to
numbers (e.g., Number(...)), check isFinite and non-negative, fall back to the
existing defaults (2000 for slowThresholdMs, Infinity for maxExportPerSecond)
when invalid, and ensure maxExportPerSecond is at least 0 (or treated as
Infinity when unspecified) so tokens initialization (this.tokens =
this.maxExportPerSecond) cannot be set to NaN or a negative value; update the
constructor logic around SamplingOptions, slowThresholdMs, maxExportPerSecond,
and tokens accordingly.
- Around line 63-67: The onStart method currently declares its parameters as
(span: ReadableSpan, ctx: Context) and uses `@ts-ignore`; change it to use the
SpanProcessor contract by replacing the signature with onStart(...args:
Parameters<SpanProcessor["onStart"]>) and call this.inner.onStart(...args)
(remove the `@ts-ignore`); ensure SpanProcessor is imported/available in the
module so the type reference resolves (this preserves version-safety and matches
the inner.onStart call).

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: dde4c9f2-b621-40b3-a9dd-bc8793727a57

📥 Commits

Reviewing files that changed from the base of the PR and between 3b6c792 and 36f17b8.

📒 Files selected for processing (4)
  • blocks/loader.ts
  • observability/otel/processors/filtering.ts
  • runtime/caches/lrucache.ts
  • runtime/caches/redis.ts
🚧 Files skipped from review as they are similar to previous changes (3)
  • blocks/loader.ts
  • runtime/caches/redis.ts
  • runtime/caches/lrucache.ts

Comment on lines +51 to +61
  constructor(
    private readonly inner: SpanProcessor,
    options: Pick<
      SamplingOptions,
      "slowThresholdMs" | "maxExportPerSecond"
    > = {},
  ) {
    this.slowThresholdMs = options.slowThresholdMs ?? 2000;
    this.maxExportPerSecond = options.maxExportPerSecond ?? Infinity;
    this.tokens = this.maxExportPerSecond;
  }
⚠️ Potential issue | 🟠 Major

Validate/sanitize runtime sampling numbers before using them.

These values come from decoded JSON and can be malformed at runtime (NaN, negative, string), which can silently disable throttling or drop most roots. Add numeric guards in the constructor.

🛡️ Proposed fix
   ) {
-    this.slowThresholdMs = options.slowThresholdMs ?? 2000;
-    this.maxExportPerSecond = options.maxExportPerSecond ?? Infinity;
+    const slow = options.slowThresholdMs;
+    const perSec = options.maxExportPerSecond;
+
+    this.slowThresholdMs =
+      typeof slow === "number" && Number.isFinite(slow) && slow >= 0
+        ? slow
+        : 2000;
+    this.maxExportPerSecond =
+      typeof perSec === "number" && Number.isFinite(perSec) && perSec >= 0
+        ? perSec
+        : Infinity;
     this.tokens = this.maxExportPerSecond;
   }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@observability/otel/processors/filtering.ts` around lines 51 - 61, The
constructor currently trusts options.slowThresholdMs and
options.maxExportPerSecond directly which can be NaN, negative, or strings from
decoded JSON; validate and sanitize both before assigning to
this.slowThresholdMs, this.maxExportPerSecond, and this.tokens: coerce values to
numbers (e.g., Number(...)), check isFinite and non-negative, fall back to the
existing defaults (2000 for slowThresholdMs, Infinity for maxExportPerSecond)
when invalid, and ensure maxExportPerSecond is at least 0 (or treated as
Infinity when unspecified) so tokens initialization (this.tokens =
this.maxExportPerSecond) cannot be set to NaN or a negative value; update the
constructor logic around SamplingOptions, slowThresholdMs, maxExportPerSecond,
and tokens accordingly.
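The validation can be factored into a single helper applied to both options (a sketch; sanitizeNonNegative is a hypothetical name):

```typescript
// Coerce a decoded-JSON value to a finite, non-negative number,
// falling back to a default when it is NaN, negative, or a non-number.
function sanitizeNonNegative(value: unknown, fallback: number): number {
  const n = Number(value);
  return Number.isFinite(n) && n >= 0 ? n : fallback;
}

sanitizeNonNegative(500, 2000);       // 500
sanitizeNonNegative("abc", 2000);     // 2000 (NaN -> fallback)
sanitizeNonNegative(-1, Infinity);    // Infinity (negative -> fallback)
sanitizeNonNegative(undefined, 2000); // 2000

// In the constructor this would give, e.g.:
//   this.slowThresholdMs = sanitizeNonNegative(options.slowThresholdMs, 2000);
//   this.maxExportPerSecond = sanitizeNonNegative(options.maxExportPerSecond, Infinity);
//   this.tokens = this.maxExportPerSecond;
```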

Comment on lines +63 to +67
  // @ts-ignore: onStart signature varies across SDK versions
  onStart(span: ReadableSpan, ctx: Context): void {
    // @ts-ignore
    this.inner.onStart(span, ctx);
  }
⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Scripts executed (repository: deco-cx/deco) to verify the SpanProcessor contract, the deps.ts re-exports, and the current onStart implementation:

#!/bin/bash
set -euo pipefail

# Locate the SpanProcessor interface and the current onStart signatures
rg -n -C2 'interface\s+SpanProcessor|onStart\s*\('
rg -n -C2 'class\s+FilteringSpanProcessor|onStart\s*\(' observability/otel/processors/filtering.ts

# Check what deps.ts re-exports for SpanProcessor/ReadableSpan
rg -n "SpanProcessor|ReadableSpan|@opentelemetry" deps.ts | head -20

# Inspect the processor's imports and method implementations
head -30 observability/otel/processors/filtering.ts
sed -n '40,80p' observability/otel/processors/filtering.ts
rg -n "onEnd|shutdown|forceFlush" observability/otel/processors/filtering.ts -B 1 -A 2

🌐 Web query:

OpenTelemetry sdk-trace-base 1.25.1 SpanProcessor onStart signature TypeScript

💡 Result:

The TypeScript signature for the onStart method in the SpanProcessor interface from @opentelemetry/sdk-trace-base (including version 1.25.1) is:

onStart(span: Span, parentContext: Context): void;

This is called when a span is started, if span.isRecording() returns true. The span parameter is the span that just started (writable/read-write), and parentContext is the parent Context of the span.


Use the SpanProcessor contract directly for onStart (remove @ts-ignore).

The current signature uses ReadableSpan (read-only), but the SpanProcessor interface expects Span (writable). This breaks the SDK contract and is masked by @ts-ignore. Use Parameters<SpanProcessor["onStart"]> to stay version-safe and aligned with the interface.

♻️ Proposed fix
-  // @ts-ignore: onStart signature varies across SDK versions
-  onStart(span: ReadableSpan, ctx: Context): void {
-    // @ts-ignore
-    this.inner.onStart(span, ctx);
+  onStart(...args: Parameters<SpanProcessor["onStart"]>): void {
+    this.inner.onStart(...args);
   }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@observability/otel/processors/filtering.ts` around lines 63 - 67, The onStart
method currently declares its parameters as (span: ReadableSpan, ctx: Context)
and uses `@ts-ignore`; change it to use the SpanProcessor contract by replacing
the signature with onStart(...args: Parameters<SpanProcessor["onStart"]>) and
call this.inner.onStart(...args) (remove the `@ts-ignore`); ensure SpanProcessor
is imported/available in the module so the type reference resolves (this
preserves version-safety and matches the inner.onStart call).
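A minimal, self-contained illustration of the Parameters<...> delegation pattern (the local SpanProcessor interface here is a simplified stand-in for the SDK's, just to keep the sketch runnable):

```typescript
// Simplified stand-in for the SDK interface; the real one comes from deps.ts.
interface SpanProcessor {
  onStart(span: { name: string }, parentContext: object): void;
}

// Delegating via Parameters<...> keeps the wrapper's signature identical
// to the wrapped interface, even if the SDK changes it across versions,
// with no @ts-ignore needed.
class ForwardingProcessor implements SpanProcessor {
  constructor(private readonly inner: SpanProcessor) {}
  onStart(...args: Parameters<SpanProcessor["onStart"]>): void {
    this.inner.onStart(...args);
  }
}

const seen: string[] = [];
const inner: SpanProcessor = {
  onStart: (span) => seen.push(span.name),
};
new ForwardingProcessor(inner).onStart({ name: "root" }, {});
// seen -> ["root"]
```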
