feat(e001-s003): trace store with TTL/capacity and tracing metrics by Dumidu1212 · Pull Request #8 · Dumidu1212/Self-Optimizing-Agent-Coordination-Response-System

Dumidu1212 · 2025-10-14T15:24:30Z

Summary by CodeRabbit

New Features
- Plan responses now include a trace ID.
- Added a new endpoint to retrieve execution traces by ID.
- Enhanced planning flow with candidate scoring, better error handling, timeouts, and clearer failure outcomes.
Documentation
- OpenAPI specification version updated to 0.3.0.

coderabbitai · 2025-10-14T15:24:38Z

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 60.00% which is insufficient. The required threshold is 80.00%.	You can run `@coderabbitai generate docstrings` to improve docstring coverage.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check	✅ Passed	The title clearly describes the main enhancement—introducing a TTL/capacity‐bounded trace store along with tracing metrics—and uses a conventional commit style. It is concise, avoids vague terms or file lists, and accurately reflects the core feature added across multiple modules. This phrasing enables teammates to quickly understand the primary change at a glance.

✨ Finishing touches

📝 Generate docstrings

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch feat/e001-s003-tracing

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

…tests

coderabbitai

Actionable comments posted: 3

🧹 Nitpick comments (1)

src/tracing/traceStore.ts (1)
91-92: Consider removing the module anchor.

The __traceStoreModule export is described as an "anchor to keep the file as a module in all bundlers," but this is typically unnecessary in modern TypeScript/JavaScript projects. The presence of export statements (interfaces and class) already makes this a module.

If you're not experiencing specific bundler issues, consider removing this export:
-// Anchor to keep the file as a module in all bundlers.
-export const __traceStoreModule = true;
If you do need it for specific bundler compatibility, please document which bundler requires this and why.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 082b669 and 045258c.

📒 Files selected for processing (3)

src/metrics/metrics.ts (1 hunks)
src/tracing/traceStore.ts (2 hunks)
tests/unit/trace.store.test.ts (1 hunks)

🧰 Additional context used

🧬 Code graph analysis (1)

tests/unit/trace.store.test.ts (1)

src/tracing/traceStore.ts (1)

TraceStore (24-89)

🔇 Additional comments (9)

src/metrics/metrics.ts (1)

58-59: LGTM!

The metrics are correctly registered with the Prometheus registry.

tests/unit/trace.store.test.ts (2)

13-21: LGTM!

The capacity eviction test correctly verifies that when maxTraces=2 and three traces are created, the oldest trace (a) is evicted while b and c remain.

23-29: LGTM!

The TTL expiration test correctly verifies that traces expire after the configured TTL period. The use of async/await with setTimeout is appropriate for this time-based test.
src/tracing/traceStore.ts (6)
1-4: LGTM!

The file header comment clearly describes the purpose of the trace store with TTL and max-size bounds, and references the explainability requirement (S003).

5-22: LGTM!

The interfaces are well-structured with clear comments. The addition of createdAt to the Trace interface and the TraceStoreOptions interface with maxTraces and ttlMs fields properly support TTL and capacity management.

30-34: LGTM!

The constructor properly initializes bounds with sensible defaults (1000 traces, 15 minutes TTL) and guards against invalid values using Math.max(1, ...).

36-43: Verify ID uniqueness under high concurrency.

The ID generation uses Math.random() and Date.now(), which provides reasonable uniqueness for typical use cases. However, under high concurrency (multiple traces created in rapid succession), there's a small risk of collisions since Date.now() has millisecond precision.

Consider whether your trace creation rate could exceed 1000 traces/millisecond. If so, consider adding a sequence counter or using a more robust ID generation library like uuid.
// Alternative with uuid (if needed):
import { v4 as uuidv4 } from 'uuid';

create(): string {
  const id = `tr_${uuidv4()}`;
  // ... rest of implementation
}
51-59: LGTM!

The get() method correctly checks for expiration and deletes expired traces before returning undefined. This lazy expiration approach is efficient and ensures stale traces are cleaned up on access.

61-63: LGTM!

The helper methods (size(), delete(), isExpired()) are simple, correct, and well-implemented.

Also applies to: 80-88

coderabbitai · 2025-10-14T15:28:43Z

src/metrics/metrics.ts

+/** Tracing metrics */
+export const traceCreatedTotal = new client.Counter({
+  name: 'trace_created_total',
+  help: 'Number of traces created'
+});
+export const traceEventsTotal = new client.Counter({
+  name: 'trace_events_total',
+  help: 'Number of trace events recorded'
+});


⚠️ Potential issue | 🔴 Critical

Metrics are defined but never incremented.

The traceCreatedTotal and traceEventsTotal counters are properly defined but are not being incremented anywhere in the codebase. Specifically:

traceCreatedTotal should be incremented in TraceStore.create() (line 36 of traceStore.ts)

traceEventsTotal should be incremented in TraceStore.record() (line 45 of traceStore.ts)

Without these increments, the metrics will always report zero, defeating their purpose.

Apply these changes to increment the metrics:

In src/tracing/traceStore.ts, import the metrics:

+import { traceCreatedTotal, traceEventsTotal } from '../metrics/metrics';

Then increment in create():

create(): string { const id = `tr_${Math.random().toString(36).slice(2)}${Date.now().toString(36)}`; const t: Trace = { id, createdAt: Date.now(), events: [] }; this.traces.set(id, t); this.order.push(id); + traceCreatedTotal.inc(); this.prune(); return id; }

And increment in record():

record(id: string, type: string, data: unknown): void { const t = this.traces.get(id); if (!t) return; t.events.push({ ts: Date.now(), type, data }); + traceEventsTotal.inc(); }

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents

In src/tracing/traceStore.ts around line 36 and line 45, import the counters from src/metrics/metrics.ts at the top of the file, then call traceCreatedTotal.inc() inside TraceStore.create() at ~line 36 to increment the created-trace metric, and call traceEventsTotal.inc() inside TraceStore.record() at ~line 45 to increment the trace-events metric (use .inc(1) or .inc() as appropriate for your metrics client).

coderabbitai · 2025-10-14T15:28:44Z

src/tracing/traceStore.ts

+  /** Remove expired and overflow traces. */
+  prune(): void {
+    // TTL prune
+    for (const id of [...this.order]) {
+      const t = this.traces.get(id);
+      if (!t) continue;
+      if (this.isExpired(t)) this.delete(id);
+    }
+    // Capacity prune (remove oldest)
+    while (this.order.length > this.maxTraces) {
+      const oldest = this.order.shift();
+      if (oldest) this.traces.delete(oldest);
+    }
+  }


⚠️ Potential issue | 🟠 Major

Fix inconsistent deletion logic in capacity pruning.

The prune() method has an inconsistency between TTL pruning and capacity pruning:

TTL pruning (line 71): calls this.delete(id), which removes the trace from both traces map AND order array

Capacity pruning (lines 75-76): only shifts from order and deletes from traces directly, without using the delete() helper

While this currently works, it's inconsistent and error-prone. If delete() logic changes in the future (e.g., adding metrics, logging, or cleanup), capacity pruning would miss those changes.

Apply this diff to use the consistent delete() helper:

// Capacity prune (remove oldest) while (this.order.length > this.maxTraces) { const oldest = this.order.shift(); - if (oldest) this.traces.delete(oldest); + if (oldest) this.delete(oldest); }

Note: You'll also need to adjust delete() to not call splice when the id is no longer in the order array (since shift() already removed it):

private delete(id: string): void { this.traces.delete(id); const idx = this.order.indexOf(id); if (idx >= 0) this.order.splice(idx, 1); }

Or better yet, restructure capacity pruning to not shift first:

// Capacity prune (remove oldest) while (this.order.length > this.maxTraces) { - const oldest = this.order.shift(); - if (oldest) this.delete(oldest); + const oldest = this.order[0]; + if (oldest) this.delete(oldest); }

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

/** Remove expired and overflow traces. */

prune(): void {

// TTL prune

for (const id of [...this.order]) {

const t = this.traces.get(id);

if (!t) continue;

if (this.isExpired(t)) this.delete(id);

}

// Capacity prune (remove oldest)

while (this.order.length > this.maxTraces) {

const oldest = this.order.shift();

if (oldest) this.traces.delete(oldest);

}

}

/** Remove expired and overflow traces. */

prune(): void {

// TTL prune

for (const id of [...this.order]) {

const t = this.traces.get(id);

if (!t) continue;

if (this.isExpired(t)) this.delete(id);

}

// Capacity prune (remove oldest)

while (this.order.length > this.maxTraces) {

const oldest = this.order.shift();

if (oldest) this.delete(oldest);

}

}

🤖 Prompt for AI Agents

In src/tracing/traceStore.ts around lines 65 to 78, capacity pruning is inconsistent with TTL pruning because it shifts the order array and directly deletes from the traces map instead of calling the delete() helper; change the capacity loop to repeatedly read the oldest id without mutating first (e.g., const oldest = this.order[0]; if (oldest) this.delete(oldest); ) so delete() runs the centralized cleanup, and update delete() to tolerate missing ids in the order array (check for index >= 0 before splicing) so calling delete() is safe when the id might already have been removed.

coderabbitai · 2025-10-14T15:28:44Z

tests/unit/trace.store.test.ts

+  it('creates, records, and fetches traces', () => {
+    const store = new TraceStore({ maxTraces: 2, ttlMs: 60_000 });
+    const id = store.create();
+    store.record(id, 'request', { a: 1 });
+    const t = store.get(id);
+    expect(t?.id).toBe(id);
+    expect(t?.events.length).toBe(1);
+  });


⚠️ Potential issue | 🟠 Major

Add verification that metrics are incremented.

Since this PR introduces tracing metrics (traceCreatedTotal and traceEventsTotal), the tests should verify that these metrics are incremented correctly when traces are created and events are recorded.

Add this test to verify metrics:

import { traceCreatedTotal, traceEventsTotal } from '../../src/metrics/metrics'; it('increments metrics on create and record', () => { const store = new TraceStore({ maxTraces: 10, ttlMs: 60_000 }); const beforeCreate = traceCreatedTotal['hashMap'][''].value; const beforeEvents = traceEventsTotal['hashMap'][''].value; const id = store.create(); store.record(id, 'request', { a: 1 }); store.record(id, 'response', { b: 2 }); const afterCreate = traceCreatedTotal['hashMap'][''].value; const afterEvents = traceEventsTotal['hashMap'][''].value; expect(afterCreate).toBe(beforeCreate + 1); expect(afterEvents).toBe(beforeEvents + 2); });

🤖 Prompt for AI Agents

In tests/unit/trace.store.test.ts around lines 4 to 11, the test exercises TraceStore but does not assert the new tracing metrics; add a new test that imports traceCreatedTotal and traceEventsTotal from src/metrics/metrics, captures the current metric values, performs create() and two record() calls, then re-reads the metric values and asserts traceCreatedTotal increased by 1 and traceEventsTotal increased by 2 (use the same internal value access pattern used elsewhere, e.g. reading the metric hashMap value key).

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (13)

src/metrics/metrics.ts (1)
42-50: Add an eviction counter to observe TTL vs capacity removals

To make the new store observable, add a counter with a reason label and register it.
 /** Tracing metrics */
 export const traceCreatedTotal = new client.Counter({
   name: 'trace_created_total',
   help: 'Number of traces created'
 });
 export const traceEventsTotal = new client.Counter({
   name: 'trace_events_total',
   help: 'Number of trace events recorded'
 });
+export const traceEvictionsTotal = new client.Counter({
+  name: 'trace_evictions_total',
+  help: 'Number of traces evicted',
+  labelNames: ['reason'] as const // 'ttl' | 'capacity'
+});

 register.registerMetric(toolsLoaded);
 ...
 register.registerMetric(plannerExecutionLatencyMs);
 register.registerMetric(traceCreatedTotal);
 register.registerMetric(traceEventsTotal);
+register.registerMetric(traceEvictionsTotal);
Also applies to: 58-59
src/tracing/traceStore.ts (5)
30-34: Config semantics: consider allowing “unbounded/disabled” values

Clamping to 1 forbids “no TTL” or “unlimited capacity”. Consider:

maxTraces: 0 → unlimited (Infinity)

ttlMs: 0 or negative → no expiry

This gives you dev/test flexibility and clearer intent.

36-43: Use crypto.randomUUID for IDs

Stronger uniqueness and unpredictability than Math.random + Date.now.
-  create(): string {
-    const id = `tr_${Math.random().toString(36).slice(2)}${Date.now().toString(36)}`;
+  create(): string {
+    const id = `tr_${randomUUID()}`;
     const t: Trace = { id, createdAt: Date.now(), events: [] };
     this.traces.set(id, t);
     this.order.push(id);
     this.prune();
     return id;
   }
Add at top of file:
import { randomUUID } from 'node:crypto';
45-49: Don’t record to expired traces; cheap cleanup

Guard against appending to expired traces to keep invariants and memory bounded between create() calls.
  record(id: string, type: string, data: unknown): void {
-    const t = this.traces.get(id);
-    if (!t) return;
-    t.events.push({ ts: Date.now(), type, data });
+    const t = this.traces.get(id);
+    if (!t) return;
+    if (this.isExpired(t)) {
+      this.delete(id);
+      return;
+    }
+    t.events.push({ ts: Date.now(), type, data });
  }
51-59: Return an immutable copy to avoid external mutation

get() returns internal state by reference; callers can mutate the store. Consider returning a copy.
-    return t;
+    return {
+      id: t.id,
+      createdAt: t.createdAt,
+      events: t.events.map((e) => ({ ...e })),
+    };
65-78: Instrument evictions and prefer delete() for consistency

Increment eviction metrics for TTL vs capacity.

Use delete() in capacity prune to keep invariants consistent.
   prune(): void {
     // TTL prune
     for (const id of [...this.order]) {
       const t = this.traces.get(id);
       if (!t) continue;
-      if (this.isExpired(t)) this.delete(id);
+      if (this.isExpired(t)) {
+        this.delete(id);
+        // traceEvictionsTotal.labels({ reason: 'ttl' }).inc();
+      }
     }
     // Capacity prune (remove oldest)
     while (this.order.length > this.maxTraces) {
       const oldest = this.order.shift();
-      if (oldest) this.traces.delete(oldest);
+      if (oldest) {
+        this.delete(oldest);
+        // traceEvictionsTotal.labels({ reason: 'capacity' }).inc();
+      }
     }
   }
Add (outside these lines) an import of the metric if you adopt instrumentation:
import { traceEvictionsTotal } from '../metrics/metrics';
tests/unit/trace.store.test.ts (1)
23-29: Reduce timing flake in TTL test

Sleep margin is tight (15ms for 10ms TTL). CI jitter can cause flakes. Increase the wait or use fake timers.
-    await new Promise((r) => setTimeout(r, 15));
+    await new Promise((r) => setTimeout(r, 30));
src/server.ts (1)
27-30: Make trace TTL and capacity configurable

Hard‑coding makes ops tuning harder. Read from config/env with sane defaults.
-  const traces = new TraceStore({ maxTraces: 2000, ttlMs: 15 * 60_000 }); // 15 minutes TTL
+  const traces = new TraceStore({
+    maxTraces: Number(process.env.TRACE_MAX_TRACES ?? 2000),
+    ttlMs: Number(process.env.TRACE_TTL_MS ?? 15 * 60_000),
+  }); // 15 minutes default
src/routes/trace.ts (1)
30-33: Broaden event.data schema to accept any JSON value

record() accepts unknown, but response schema only allows objects. Broaden to avoid schema/serialization mismatches.
-                    data: { type: 'object' }
+                    data: {
+                      anyOf: [
+                        { type: 'object' },
+                        { type: 'array' },
+                        { type: 'string' },
+                        { type: 'number' },
+                        { type: 'boolean' },
+                        { type: 'null' }
+                      ]
+                    }
tests/e2e/trace.e2e.test.ts (1)
18-22: Align StubExec signature with interface to avoid config-specific TS errors.

Some tsconfigs enforce strict method compatibility. Add the unused overallAbort parameter.
-class StubExec implements IToolExecutor {
-  async execute(_tool: Tool, _input: JsonRecord): Promise<ExecutionResult> {
+class StubExec implements IToolExecutor {
+  async execute(_tool: Tool, _input: JsonRecord, _overallAbort: AbortSignal): Promise<ExecutionResult> {
     return { status: 'success', latency_ms: 12, output: { ok: true } };
   }
 }
src/planner/planner.ts (3)
34-34: Import TraceStore as a type-only import.

Avoids unnecessary runtime dependency.
-import { TraceStore } from '../tracing/traceStore';
+import type { TraceStore } from '../tracing/traceStore';
72-76: Consider sanitizing the recorded request to reduce PII risk.

Recording the entire ctx may capture sensitive fields in traces. Record only whitelisted keys (capability, execute, timeout_ms) or scrub context/input per policy.

210-214: Tighten rec data type to align with route schema (data is an object).

Route schema declares event.data as an object. Restricting data to JsonRecord helps prevent accidental non-object payloads.
-  /** Record a trace event and increment the trace event counter. */
-  private rec(traceId: string, type: string, data: unknown): void {
+  /** Record a trace event and increment the trace event counter. */
+  private rec(traceId: string, type: string, data: JsonRecord): void {
     this.traces.record(traceId, type, data);
     traceEventsTotal.inc();
   }
If desired, also update TraceStore.record(id, type, data) to accept JsonRecord for full type safety.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 082b669 and c1922e4.

📒 Files selected for processing (8)

src/app.ts (1 hunks)
src/metrics/metrics.ts (1 hunks)
src/planner/planner.ts (2 hunks)
src/routes/trace.ts (1 hunks)
src/server.ts (1 hunks)
src/tracing/traceStore.ts (2 hunks)
tests/e2e/trace.e2e.test.ts (1 hunks)
tests/unit/trace.store.test.ts (1 hunks)

🧰 Additional context used

🧬 Code graph analysis (6)

src/planner/planner.ts (3)

src/planner/contracts.ts (4)

PlanContext (10-23)

PlanResult (40-46)

ExecutionResult (32-37)

JsonRecord (7-7)

src/executors/httpExecutor.ts (1)

execute (9-59)

src/metrics/metrics.ts (6)

traceCreatedTotal (43-46)

plannerBidsTotal (14-18)

plannerSelectionTotal (21-25)

plannerExecutionLatencyMs (35-40)

plannerFallbacksTotal (28-32)

traceEventsTotal (47-50)

tests/unit/trace.store.test.ts (1)

src/tracing/traceStore.ts (1)

TraceStore (24-89)

tests/e2e/trace.e2e.test.ts (4)

src/planner/contracts.ts (3)

IToolExecutor (54-60)

JsonRecord (7-7)

ExecutionResult (32-37)

src/app.ts (1)

buildApp (14-31)

src/tracing/traceStore.ts (1)

TraceStore (24-89)

src/planner/planner.ts (1)

Planner (53-215)

src/routes/trace.ts (1)

src/tracing/traceStore.ts (1)

TraceStore (24-89)

src/server.ts (3)

src/tracing/traceStore.ts (1)

TraceStore (24-89)

src/planner/planner.ts (1)

Planner (53-215)

src/app.ts (1)

buildApp (14-31)

src/app.ts (2)

src/tracing/traceStore.ts (1)

TraceStore (24-89)

src/routes/trace.ts (1)

traceRoutes (6-55)

🪛 GitHub Actions: CI

src/app.ts

[error] 14-14: buildApp signature mismatch: 'traces: TraceStore' parameter is required but not provided in the test invocation.

🔇 Additional comments (4)

src/metrics/metrics.ts (1)

42-50: Tracing counters look good

Counters are well‑named and registered; integrates cleanly with existing registry.

src/app.ts (2)

26-26: Trace route wiring looks correct.

traceRoutes(instance, { traces: deps.traces }) ensures /trace/:id uses the same store as the planner.

8-8: Imports and swagger config

Type-only import for TraceStore is correct; avoids runtime import.

traceRoutes import and swagger info version bump are fine.

Also applies to: 11-11, 20-20

tests/e2e/trace.e2e.test.ts (1)

43-58: E2E flow validates tracing end-to-end.

Good assertions: 200s, matching traceId, and presence of key event types.

coderabbitai · 2025-10-14T15:37:49Z

src/app.ts

+/** Build the Fastify application with schemas, routes, and Swagger UI. */
+export function buildApp(deps: { registry: IRegistryService; planner: IPlanner; traces: TraceStore }): FastifyInstance {
  const app = Fastify({ logger: true });


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Breaking change: buildApp now requires traces; update all call sites (pipeline failing).

Tests/consumers still call buildApp({ registry, planner }), causing CI failure. Pass the same TraceStore instance used by Planner into buildApp to keep routes and planner in sync.

Run to find outdated usages:

🏁 Script executed:

#!/bin/bash # Find buildApp calls and show nearby object literal without 'traces' rg -nP -C3 '\bbuildApp\s*\(\s*\{' \ | awk ' BEGIN { file=""; } /^[-]{2}/ { print; next } /^[^:]+:\d+:/ { split($0, a, ":"); curfile=a[1]; print $0; next } { print } ' \ | rg -nP -C0 '(?s)^.*$' # Display as-is; visually inspect blocks for missing "traces:"

Length of output: 1943

Include traces in buildApp call sites
tests/e2e/tools.e2e.test.ts:29 and tests/e2e/plan.e2e.test.ts:45 still call buildApp({ registry, planner }). Update to pass the same traces instance used when constructing the planner, e.g.:

const traces = new TraceStore(/*…*/); const planner = new Planner(registry, scorer, executor, traces); app = buildApp({ registry, planner, traces });

🧰 Tools

🪛 GitHub Actions: CI

[error] 14-14: buildApp signature mismatch: 'traces: TraceStore' parameter is required but not provided in the test invocation.

🤖 Prompt for AI Agents

In src/app.ts around lines 13 to 15, buildApp now requires a traces parameter but existing E2E test call sites still pass only { registry, planner }; update tests/e2e/tools.e2e.test.ts (line ~29) and tests/e2e/plan.e2e.test.ts (line ~45) to create or reuse the same TraceStore instance used to construct the Planner and pass it into buildApp, e.g. instantiate const traces = new TraceStore(...) before constructing Planner, use that traces when new Planner(..., traces), and call buildApp({ registry, planner, traces }) so the traces instance is consistent across planner and app.

coderabbitai · 2025-10-14T15:37:50Z

src/routes/trace.ts

+export async function traceRoutes(app: FastifyInstance, deps: { traces: TraceStore }): Promise<void> {
+  app.get(
+    '/trace/:id',
+    {
+      schema: {
+        description: 'Fetch a full decision/execution trace by ID',
+        params: {
+          type: 'object',
+          required: ['id'],
+          properties: { id: { type: 'string' } }
+        },
+        response: {
+          200: {
+            type: 'object',
+            required: ['id', 'createdAt', 'events'],
+            properties: {
+              id: { type: 'string' },
+              createdAt: { type: 'number' },
+              events: {
+                type: 'array',
+                items: {
+                  type: 'object',
+                  required: ['ts', 'type'],
+                  properties: {
+                    ts: { type: 'number' },
+                    type: { type: 'string' },
+                    data: { type: 'object' }
+                  },
+                  additionalProperties: false
+                }
+              }
+            },
+            additionalProperties: false
+          },
+          404: {
+            type: 'object',
+            required: ['message'],
+            properties: { message: { type: 'string' } },
+            additionalProperties: false
+          }
+        }
+      }
+    },
+    async (req: FastifyRequest<{ Params: Params }>, reply) => {
+      const t = deps.traces.get(req.params.id);
+      if (!t) return reply.code(404).send({ message: 'Trace not found or expired' });
+      return reply.code(200).send(t);
+    }
+  );
+}


🛠️ Refactor suggestion | 🟠 Major

Secure the trace endpoint (auth/gating + no-store + redaction)

Trace payloads often include request inputs and errors; exposing them unauthenticated risks PII/secret leakage. At minimum:

Gate behind a feature flag or require auth (API key/JWT).

Add Cache-Control: no-store on responses.

Consider redacting sensitive fields before returning.

- async (req: FastifyRequest<{ Params: Params }>, reply) => { + async (req: FastifyRequest<{ Params: Params }>, reply) => { + // Optional: enforce a simple guard; replace with your auth as needed. + // if (process.env.TRACE_ROUTE_ENABLED !== '1') return reply.code(404).send({ message: 'Not found' }); const t = deps.traces.get(req.params.id); - if (!t) return reply.code(404).send({ message: 'Trace not found or expired' }); - return reply.code(200).send(t); + if (!t) return reply.code(404).send({ message: 'Trace not found or expired' }); + reply.header('Cache-Control', 'no-store'); + return reply.code(200).send(t); }

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

export async function traceRoutes(app: FastifyInstance, deps: { traces: TraceStore }): Promise<void> {

app.get(

'/trace/:id',

{

schema: {

description: 'Fetch a full decision/execution trace by ID',

params: {

type: 'object',

required: ['id'],

properties: { id: { type: 'string' } }

},

response: {

200: {

type: 'object',

required: ['id', 'createdAt', 'events'],

properties: {

id: { type: 'string' },

createdAt: { type: 'number' },

events: {

type: 'array',

items: {

type: 'object',

required: ['ts', 'type'],

properties: {

ts: { type: 'number' },

type: { type: 'string' },

data: { type: 'object' }

},

additionalProperties: false

}

}

},

additionalProperties: false

},

404: {

type: 'object',

required: ['message'],

properties: { message: { type: 'string' } },

additionalProperties: false

}

}

}

},

async (req: FastifyRequest<{ Params: Params }>, reply) => {

const t = deps.traces.get(req.params.id);

if (!t) return reply.code(404).send({ message: 'Trace not found or expired' });

return reply.code(200).send(t);

}

);

}

export async function traceRoutes(app: FastifyInstance, deps: { traces: TraceStore }): Promise<void> {

app.get(

'/trace/:id',

{

schema: {

description: 'Fetch a full decision/execution trace by ID',

params: {

type: 'object',

required: ['id'],

properties: { id: { type: 'string' } }

},

response: {

200: {

type: 'object',

required: ['id', 'createdAt', 'events'],

properties: {

id: { type: 'string' },

createdAt: { type: 'number' },

events: {

type: 'array',

items: {

type: 'object',

required: ['ts', 'type'],

properties: {

ts: { type: 'number' },

type: { type: 'string' },

data: { type: 'object' }

},

additionalProperties: false

}

}

},

additionalProperties: false

},

404: {

type: 'object',

required: ['message'],

properties: { message: { type: 'string' } },

additionalProperties: false

}

}

}

},

async (req: FastifyRequest<{ Params: Params }>, reply) => {

// Optional: enforce a simple guard; replace with your auth as needed.

// if (process.env.TRACE_ROUTE_ENABLED !== '1') return reply.code(404).send({ message: 'Not found' });

const t = deps.traces.get(req.params.id);

if (!t) return reply.code(404).send({ message: 'Trace not found or expired' });

reply.header('Cache-Control', 'no-store');

return reply.code(200).send(t);

}

);

}

feat(e001-s003): trace store with TTL/capacity and tracing metrics

045258c

feat(e001-s003): GET /trace/:id with planner event recording and e2e …

c1922e4

…tests

coderabbitai bot reviewed Oct 14, 2025

View reviewed changes

Dumidu1212 merged commit a729cc3 into main Oct 14, 2025
1 of 3 checks passed

coderabbitai bot mentioned this pull request Oct 14, 2025

feat(e001-s003): tracing store, /trace/:id route, planner events, met… #9

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(e001-s003): trace store with TTL/capacity and tracing metrics#8

feat(e001-s003): trace store with TTL/capacity and tracing metrics#8
Dumidu1212 merged 2 commits intomainfrom
feat/e001-s003-tracing

Dumidu1212 commented Oct 14, 2025 •

edited by coderabbitai bot

Loading

Uh oh!

coderabbitai bot commented Oct 14, 2025 •

edited

Loading

Uh oh!

coderabbitai bot left a comment

Uh oh!

coderabbitai bot Oct 14, 2025

Uh oh!

coderabbitai bot Oct 14, 2025

Uh oh!

coderabbitai bot Oct 14, 2025

Uh oh!

coderabbitai bot left a comment

Uh oh!

coderabbitai bot Oct 14, 2025

Uh oh!

coderabbitai bot Oct 14, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

Dumidu1212 commented Oct 14, 2025 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Oct 14, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Pre-merge checks and finishing touches

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Oct 14, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Oct 14, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Oct 14, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Oct 14, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Oct 14, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Dumidu1212 commented Oct 14, 2025 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Oct 14, 2025 •

edited

Loading