V4 to V5 signoz migration by shagun-singh-inkeep · Pull Request #2774 · inkeep/agents

shagun-singh-inkeep · 2026-03-19T15:09:21Z

No description provided.

vercel · 2026-03-19T15:09:29Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
agents-api	Ready	Preview, Comment	Mar 19, 2026 10:43pm
agents-docs	Ready	Preview, Comment	Mar 19, 2026 10:43pm
agents-manage-ui	Ready	Preview, Comment	Mar 19, 2026 10:43pm

changeset-bot · 2026-03-19T15:09:29Z

🦋 Changeset detected

Latest commit: 74b0622

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 10 packages

Name	Type
@inkeep/agents-core	Patch
@inkeep/agents-manage-ui	Patch
@inkeep/agents-api	Patch
@inkeep/agents-cli	Patch
@inkeep/agents-sdk	Patch
@inkeep/agents-work-apps	Patch
@inkeep/ai-sdk-provider	Patch
@inkeep/create-agents	Patch
@inkeep/agents-email	Patch
@inkeep/agents-mcp	Patch

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

pullfrog · 2026-03-19T15:11:48Z

TL;DR — Migrates all SigNoz integration from the v4 API (/api/v4/query_range with builderQueries object maps) to the v5 API (/api/v5/query_range with expression-based filters and queries arrays). This is a net-negative ~1,600 lines change that replaces verbose filter-object trees with concise string expressions, merges related queries to reduce round-trips, and adds parallel batch execution for conversation detail loads.

Key changes

signoz-queries.ts — Replaces v4 constants (DATA_TYPES, FIELD_TYPES, QUERY_FIELD_CONFIGS, PANEL_TYPES, DATA_SOURCES, AGGREGATE_OPERATORS) with v5 equivalents (REQUEST_TYPES, QUERY_TYPES, FIELD_CONTEXTS, FIELD_DATA_TYPES, SIGNALS) and adds a buildFilterExpression() helper for converting operator/value pairs to v5 filter strings.
signoz.ts (API proxy) — Extracts shared helpers (getSignozConfig, authorizeProject, handleSignozError, extractResults), switches the endpoint to /api/v5/query_range, and rewrites query-batch to parse v5 columnar responses and narrow step-2 time ranges from step-1 timestamps.
signozHelpers.ts — Rewrites enforceSecurityFilters from v4 filter-object injection to v5 expression string concatenation, reducing the function from 70 lines to 24.
conversations/[conversationId]/route.ts — Merges 15+ individual builderQueries into 3 parallel batches (core, context, events), consolidates separate query names (e.g. contextResolution/contextHandle → contextResolutionAndHandle, aiGenerations/aiStreamingText → aiLlmCalls, three toolApproval* → toolApprovals), adds memoized ancestor lookups, and includes per-batch timing logs.
traces/route.ts — Updates the validation schema from {queryType, panelType, builderQueries} to {queries[], requestType} to match the v5 payload format.
spans/[spanId]/route.ts — Converts the ClickHouse SQL query envelope and response parsing from v4 series[0].labels to v5 columnar columns + data format.
conversation-list-item.tsx — Adds a startTime > 0 guard to prevent rendering invalid dates from zero-valued timestamps.

_{Summary ｜ 9 files ｜ 5 commits ｜ base: main ← v5}

`signoz-queries.ts` — v4-to-v5 constant vocabulary

Before: v4 vocabulary (DATA_TYPES, FIELD_TYPES, QUERY_FIELD_CONFIGS, PANEL_TYPES, DATA_SOURCES, AGGREGATE_OPERATORS) with isColumn/isJSON metadata objects used to describe every field.
After: v5 vocabulary (REQUEST_TYPES, QUERY_TYPES, FIELD_CONTEXTS, FIELD_DATA_TYPES, SIGNALS) plus a buildFilterExpression() utility that converts filter items to SQL-like expression strings.

The v4 API required structured objects with key, dataType, type, isColumn, and isJSON for every filter item. v5 uses simple string expressions (key = 'value'), eliminating 60+ lines of config objects. Several query expression names are consolidated:

v4	v5
`contextResolution` + `contextHandle`	`contextResolutionAndHandle`
`aiGenerations` + `aiStreamingText`	`aiLlmCalls`
`toolApprovalRequested/Approved/Denied`	`toolApprovals`

signoz-queries.ts

`signoz.ts` — Proxy route refactor and v5 endpoint

Before: Inline auth checks, config reads, and error handling duplicated across /query, /query-batch, and /health handlers; endpoint pointed at /api/v4/query_range.
After: Shared getSignozConfig(), authorizeProject(), handleSignozError(), and extractResults() helpers; endpoint is /api/v5/query_range; query-batch parses columnar v5 responses and narrows step-2 time windows using min/max timestamps from step-1 results.

The injectConversationIdFilter function (which built structured filter objects) is replaced by inline expression string concatenation. The query-batch step-2 optimization reads columns + data arrays instead of series[].labels to extract conversation IDs, and computes a tighter time range from aggregation column values.

How does the step-2 time range narrowing work?
After step 1 returns paginated conversation IDs with timestamps, the code scans the aggregation column to find the min and max timestamps, then sets `detailPayloadTemplate.start = minTs - 1h` and `end = maxTs + 1h`. This avoids querying the full original time range for the detail step.

signoz.ts · signozHelpers.ts

`conversations/[conversationId]/route.ts` — Parallel batched queries and v5 payloads

Before: A single monolithic SigNoz payload with 15+ builderQueries entries, each using verbose filter objects; sequential execution; no timing instrumentation.
After: 3 parallel payloads (core, context, events) using buildQueryEnvelope() with expression-based filters; parseListByName() to split merged query results by name field; memoized findAncestorActivity / findAncestorAgentGeneration with depth guards; per-batch and total timing logs.

This is the largest change in the PR. The buildConversationListPayload function (returning a single payload) is replaced by buildConversationPayloads (returning 3 payloads executed via Promise.all). Field selection uses a compact sf(name, type, context) helper instead of 5-line objects. Post-query filtering via parseListByName handles the merged queries (e.g., splitting contextResolutionAndHandle rows by span name). The ancestor-lookup functions now use Map-based caches and depth limits to prevent stack overflows on deep span trees.

[conversationId]/route.ts · traces/route.ts

`spans/[spanId]/route.ts` — v5 ClickHouse SQL response format

Before: Response parsed as json.data.result[0].series[0].labels (v4 series-based format); payload used step, chQueries map, queryType: 'clickhouse_sql', panelType: 'table'.
After: Response parsed as json.data.data.results[0].columns + data (v5 columnar format); payload uses requestType: 'scalar', queries array with {type, spec} envelopes, typed variables with {type: 'custom', value}.

spans/[spanId]/route.ts

^{｜ View workflow run ｜ Triggered by Pullfrog ｜ pullfrog.com ｜ 𝕏}

pullfrog · 2026-03-19T15:14:07Z

TL;DR — Migrates the entire SigNoz observability integration from v4 to v5 API. The v5 format replaces structured JSON filter objects with string-based filter expressions, swaps builderQueries maps for queries[] arrays, and returns columnar response data instead of series/labels. Also splits the monolithic conversation-detail query into 3 parallel batches, adds resilient retry logic for missing SigNoz attributes, and includes performance optimizations (memoized ancestor lookups, time-range narrowing).

Key changes

/api/v5/query_range — All SigNoz proxy and client-side query endpoints target the v5 API instead of v4.
String-based filter expressions — Filters switch from structured { key, dataType, type, op, value } JSON objects to SQL-like string expressions (e.g., serviceName IN ('...') AND tenant.id = '...').
compositeQuery.queries[] array — Replaces the named builderQueries map with [{ type, spec }] envelopes carrying new fields (signal, selectFields, fieldContext).
Columnar response parsing — Response handling switches from data.result[].list[].data to data.results[].rows[] / columns[] + data[][].
Parallel query batching — The conversation-detail endpoint splits queries into 3 payloads (core, context, events) fired via Promise.all.
Query merging — Similar span types (contextResolution/contextHandle, tool approvals, AI generations/streaming) are merged into single queries with client-side parseListByName() splitting by span name.
queryWithRetry() for missing attributes — New retry logic detects SigNoz 400 errors caused by key not found attributes, strips those queries, and retries — gracefully handles attributes not yet ingested.
enforceSecurityFilters() rewritten — Appends AND serviceName IN (...) AND tenant.id = '...' strings instead of mutating structured filter item arrays. Now also rejects non-builder query types to prevent tenant isolation bypass via raw SQL.
/span-lookup endpoint — New server-side route replaces client-side ClickHouse SQL construction for span detail — the query is built with buildSpanLookupPayload() which bakes tenant filtering into the SQL WHERE clause.
Performance optimizations — Map-based memoization caches with depth limits for findAncestorActivity()/findAncestorAgentGeneration(), findLast() replaces [...arr].reverse().find(), multiple .filter().length calls consolidated into a single loop.
totalErrors bug fix — Was hardcoded to 0; now reflects the actual count of error-status activities.
Shared constants overhauled — Removes v4 constructs (QUERY_FIELD_CONFIGS, DATA_TYPES, FIELD_TYPES, PANEL_TYPES, etc.) and adds v5 equivalents (REQUEST_TYPES, QUERY_TYPES, FIELD_CONTEXTS, SIGNALS, buildFilterExpression()). OPERATORS expanded with full coverage (CONTAINS, NOT_CONTAINS, REGEX, NOT_REGEX, NOT_IN, LESS_THAN, GREATER_THAN).
Proxy code refactored — getSignozConfig(), handleSignozError(), authorizeProject(), extractResults() extracted as shared helpers to eliminate duplication across /query, /query-batch, /span-lookup, and /health.

_{Summary ｜ 9 files ｜ 10 commits ｜ base: main ← v5}

SigNoz v4 → v5 query format migration

Before: Queries used compositeQuery.builderQueries — a named map of structured objects with filters.items[] arrays, selectColumns[] with { key, dataType, type, isColumn }, and panelType/queryType at the top level.
After: Queries use compositeQuery.queries[] — an array of { type, spec } envelopes where spec carries signal, filter.expression (string-based), selectFields[] with { name, fieldDataType, fieldContext }, and top-level requestType.

The v5 format is more concise — the sf(name, dataType, fieldContext) helper replaces verbose QUERY_FIELD_CONFIGS presets. Security enforcement now appends AND clauses to expression strings rather than deep-cloning payloads and splicing structured filter items. Non-builder query types are explicitly rejected to prevent tenant isolation bypass via raw SQL.

How does the new buildFilterExpression() work?

A general-purpose utility exported from agents-core that converts { key, op, value } items to string expressions. Handles IN, NOT IN, EXISTS, NOT EXISTS, LIKE, CONTAINS, REGEX, and comparison operators with proper quoting. The OP_MAP translates shorthand operators (e.g., nin → NOT IN, nregex → NOT REGEX) into their SQL-like v5 equivalents.

signoz.ts · signozHelpers.ts · signoz-queries.ts

`queryWithRetry()` — resilient handling of missing SigNoz attributes

Before: Any SigNoz 400 error was returned to the client as a generic "Invalid query parameters" response.
After: queryWithRetry() detects key not found errors, identifies which queries reference the missing keys, strips them, and retries. Queries referencing only known attributes succeed even when new attributes haven't been ingested yet.

This prevents the entire query from failing when a newly-instrumented span attribute hasn't been indexed by SigNoz. getMissingKeys() parses the error response, and queryReferencesKeys() scans filter expressions and select fields to determine which queries to remove.

signoz.ts

Parallel query batching and query merging for conversation detail

Before: A single payload with ~16 builderQueries was sent to SigNoz for conversation detail; similar span types each had their own query.
After: Queries are split into 3 logical batches (core, context, events) sent via Promise.all. Similar span types — contextResolution/contextHandle, all tool approvals, aiGenerations/aiStreamingText — are merged into single queries with client-side parseListByName() splitting by SPAN_KEYS.NAME.

This reduces the total query count and enables server-side parallelism. The handler also narrows the step-2 time range to min(timestamp) - 1h .. max(timestamp) + 1h from step-1 results, and now logs timing breakdowns (signozMs, processingMs, totalMs).

[conversationId]/route.ts · signoz-queries.ts

`/span-lookup` endpoint and columnar response parsing

Before: The span-detail client route constructed a raw ClickHouse SQL payload directly and sent it through the generic /query proxy. Responses were parsed from result[0].series[0].labels.
After: A dedicated /span-lookup server-side route builds the SQL payload via buildSpanLookupPayload() with tenant filtering baked into the WHERE clause. Responses are parsed from columnar columns[] + data[][] format using index-based lookups via data.results[].

Moving SQL construction server-side ensures the client never sends arbitrary ClickHouse SQL through the proxy (which now rejects non-builder queries). The extractResults() helper centralizes result extraction from the v5 response envelope.

spans/[spanId]/route.ts · signozHelpers.ts · signoz.ts

Proxy code refactoring and performance optimizations

Before: Each proxy route (/query, /query-batch, /health) duplicated auth, config lookup, and error handling logic. Ancestor lookups used linear scans with unbounded recursion. totalErrors was hardcoded to 0.
After: Shared helpers getSignozConfig(), authorizeProject(), handleSignozError(), and extractResults() eliminate duplication. Ancestor functions use Map-based lookup with memoization caches and a depth limit of 200. totalErrors reflects finalErrorCount. Multiple .filter().length calls replaced with a single loop.

Memoization caches use ?? null instead of the non-null assertion operator for type-safe Map.get() returns.

signoz.ts · [conversationId]/route.ts · conversation-list-item.tsx

^{｜ View workflow run ｜ Triggered by Pullfrog ｜ pullfrog.com ｜ 𝕏}

pullfrog

Solid migration from SigNoz v4 to v5 — good refactoring (consolidated queries, parallel batching, memoized ancestor lookups). Two high-severity items need attention: an expression injection vector in signoz.ts:174 and a null-safety gap in enforceSecurityFilters. Lower-severity items noted inline.

^{｜ Fix all ➔ ｜ Fix 👍s ➔ ｜ View workflow run ｜ pullfrog.com ｜ 𝕏}

agents-api/src/domains/manage/routes/signoz.ts

pullfrog · 2026-03-19T15:14:23Z

agents-api/src/utils/signozHelpers.ts

+    const securityExpr = buildSecurityExpression(tenantId, projectId);
+    for (const { type, spec } of payload.compositeQuery.queries) {
+      if (type !== QUERY_TYPES.BUILDER_QUERY) continue;
+      spec.filter = { expression: `(${spec.filter.expression}) AND ${securityExpr}` };


High: Null-safety gap. If any query arrives without spec.filter or spec.filter.expression (e.g. a new query type or a client-side bug), this line will throw Cannot read properties of undefined (reading 'expression'). The old v4 code was resilient to missing filters — it would create them from scratch.

Suggestion: default to an empty expression:

const existing = spec.filter?.expression ?? 'true'; spec.filter = { expression: `(${existing}) AND ${securityExpr}` };

agents-api/src/domains/manage/routes/signoz.ts

+
+    const convIdExpr = `${SPAN_KEYS.CONVERSATION_ID} IN (${conversationIds.map((id) => `'${id}'`).join(', ')})`;
+    for (const { spec } of detailPayloadTemplate.compositeQuery.queries) {
+      spec.filter = { expression: `(${spec.filter.expression}) AND ${convIdExpr}` };


agents-api/src/domains/manage/routes/signoz.ts

pullfrog · 2026-03-19T15:14:24Z

agents-manage-ui/src/app/api/traces/conversations/[conversationId]/route.ts

-    logger.debug({ responseData }, 'SigNoz response (truncated)');
-    return json;
+    const json = response.data;
+    const results = json?.data?.data?.results ?? json?.data?.results ?? [];


Same concern as extractResults. The dual fallback json?.data?.data?.results ?? json?.data?.results suggests uncertainty about the response shape. If the v5 shape is known, pin to one path and drop the fallback — the silent fallback makes debugging harder when the response shape changes.

pullfrog · 2026-03-19T15:14:24Z

agents-manage-ui/src/app/api/traces/conversations/[conversationId]/route.ts

+const float64 = FIELD_DATA_TYPES.FLOAT64;
+const bool = FIELD_DATA_TYPES.BOOL;
+
+function buildBaseExpression(conversationId: string, projectId?: string): string {


Same expression injection pattern as the server-side route. conversationId comes from URL path params (line 461) and projectId from query params (line 469). Both are user-controlled and interpolated into filter expressions without escaping. A value like test' OR 1=1 -- would inject arbitrary filter logic.

This is lower risk since the payload goes through enforceSecurityFilters on the server side (which scopes to the correct tenant), but it still allows a user to broaden their own query scope within their tenant — e.g. fetch data for a different conversation.

pullfrog · 2026-03-19T15:14:24Z

agents-manage-ui/src/app/api/traces/conversations/[conversationId]/route.ts

+    const payloads = buildConversationPayloads(conversationId, start, end, projectId);
+    const batchLabels = ['core', 'context', 'events'] as const;
+
+    const batchResults = await Promise.all(


Good improvement. Splitting the single massive query into three parallel batches (core, context, events) with Promise.all is a meaningful performance win — the SigNoz backend can process them concurrently. The timing logs are also helpful for diagnosing production latency.

claude

PR Review Summary

(8) Total Issues | Risk: High

🔴❗ Critical (2) ❗🔴

Inline Comments:

🔴 Critical: signozHelpers.ts:6 SQL injection via unescaped tenantId/projectId interpolation
🔴 Critical: signoz.ts:174 SQL injection via unescaped conversation IDs from query results

🟠⚠️ Major (3) 🟠⚠️

🟠 1) signozHelpers.ts:19 Security filters not applied to non-builder query types

Issue: The enforceSecurityFilters function only applies tenant/project filters to queries where type === QUERY_TYPES.BUILDER_QUERY. If a ClickHouse SQL query or other query type is passed through, it will bypass tenant isolation entirely.

Why: The spans/[spanId]/route.ts uses clickhouse_sql queries which would not be filtered. This creates a potential tenant isolation bypass if the proxy endpoint accepts non-builder queries.

Fix: Either:

Reject non-builder query types in the proxy endpoint with a 400 error
Implement security filtering for clickhouse_sql queries by injecting WHERE clauses
Document that only builder queries are supported and validate the query type before proxying

Refs:

Inline Comments:

🟠 Major: signozHelpers.ts:20 Null pointer when spec.filter is undefined
🟠 Major: signoz-queries.ts:52 Incorrect SQL escaping (backslash vs standard SQL '')

🟡 Minor (3) 🟡

🟡 1) signoz.ts:57-80 Missing 429 rate limit handling in SigNoz proxy

Issue: The handleSignozError function does not check for HTTP 429 responses from SigNoz. When SigNoz is rate-limiting requests, the proxy returns a generic 500 error instead of propagating the 429 status and Retry-After header.

Why: This masks the actual issue from clients and prevents them from implementing proper backoff. The conversation route already handles 429 (lines 116-117), but the agents-api proxy does not.

Fix: Add 429 handling in handleSignozError:

if (error.response?.status === 429) {
  logger.warn({ status: 429 }, 'SigNoz rate limit exceeded');
  return { 
    body: { error: 'Too Many Requests', message: 'SigNoz rate limit exceeded', retryAfter: error.response.headers['retry-after'] }, 
    status: 429 as const 
  };
}

Refs:

conversation route 429 handling

🟡 2) signoz-queries.ts:76-86 OPERATORS constant missing NOT_IN

Issue: The internal OP_MAP at line 48 includes 'nin' -> 'NOT IN' mapping, but the exported OPERATORS constant does not include NOT_IN. The old v4 OPERATORS had NOT_IN: 'nin'.

Why: Consumer code that previously used OPERATORS.NOT_IN will break after this migration.

Fix: Add NOT_IN: 'nin' to the OPERATORS constant for parity with OP_MAP.

Refs:

OP_MAP includes nin

Inline Comments:

🟡 Minor: conversations/[conversationId]/route.ts:164 Hardcoded 'builder_query', 'desc', 60, false instead of constants

💭 Consider (1) 💭

💭 1) signoz-stats.ts (multiple methods) Silent fallback to empty data on API errors

Issue: Multiple methods in SigNozStatsAPI catch all errors and return empty arrays/default values without surfacing the error condition to callers. Users see empty dashboards without knowing whether it's due to no data or a service failure.

Why: This is a pre-existing pattern, but worth addressing for better user experience. The pattern appears in 15+ methods throughout the file.

Fix: Consider returning a discriminated union: { status: 'success' | 'error', data?: T, error?: Error } to distinguish between "no data" and "service error".

Refs:

Pattern appears throughout signoz-stats.ts

🧹 While You're Here (1) 🧹

🧹 1) signoz-links.ts Still uses v4 SigNoz query format

Issue: This PR migrates SigNoz queries from v4 to v5 format, but signoz-links.ts was not touched and still uses the old v4 compositeQuery structure with queryType/builder/dataSource pattern.

Why: This creates split-world inconsistency where future maintainers may not know which pattern to follow.

Fix: Either migrate signoz-links.ts to v5 format in this PR, or create a tracking issue documenting this as intentional legacy code with a migration plan.

Refs:

signoz-links.ts

🚫 REQUEST CHANGES

Summary: The v4→v5 SigNoz migration is well-structured with significant code simplification (-1600 lines net). However, there are critical SQL injection vulnerabilities in the new string-interpolation-based filter building that must be addressed before merge. The tenantId, projectId, and conversationIds values are interpolated directly into SQL filter expressions without escaping single quotes, creating a potential security and data integrity risk. Additionally, the security filter enforcement skips non-builder query types, which could allow tenant isolation bypass.

Discarded (5)

Location	Issue	Reason Discarded
`signoz-stats.ts:160`	Global axios retry configuration may interfere with other HTTP clients	Pre-existing pattern, out of scope for this PR
`conversations/[conversationId]/route.ts:116`	Rate limit handling does not respect Retry-After header	Already handles 429 correctly, forwarding the header is a nice-to-have not a bug
`spans/[spanId]/route.ts:127`	Generic error message loses diagnostic context	Pre-existing pattern, not introduced by this PR
`signoz-stats.ts:127`	Filter value parsing error uses console.warn but continues	Defensive pattern, appropriate behavior
`conversations/[conversationId]/route.ts:807`	Empty catch blocks for JSON parsing	Appropriate defensive handling for optional data

Reviewers (4)

Reviewer	Returned	Main Findings	Consider	While You're Here	Inline Comments	Discarded
`pr-review-standards`	4	1	0	0	3	0
`pr-review-sre`	7	1	0	0	1	3
`pr-review-errors`	4	0	1	0	0	2
`pr-review-consistency`	5	1	0	1	1	0
Total	20	3	1	1	5	5

claude · 2026-03-19T15:16:02Z

packages/agents-core/src/constants/signoz-queries.ts

+};
+
+function quoteValue(value: unknown): string {
+  if (typeof value === 'string') return `'${value.replace(/'/g, "\\'")}'`;


🟠 MAJOR Incorrect SQL escaping: backslash instead of standard SQL

Issue: The quoteValue function escapes single quotes by replacing ' with \' (backslash-escape). Standard SQL uses '' (doubled single quotes). ClickHouse supports both syntaxes, but backslash escaping can behave unexpectedly depending on settings.

Why: Using non-standard escaping may cause issues in edge cases and is inconsistent with the escaping recommended for ClickHouse.

Fix:

Suggested change

if (typeof value === 'string') return `'${value.replace(/'/g, "\\'")}'`;

if (typeof value === 'string') return `'${value.replace(/'/g, "''")}'`;

Refs:

ClickHouse SQL String Literal Syntax

claude · 2026-03-19T15:16:15Z

agents-manage-ui/src/app/api/traces/conversations/[conversationId]/route.ts

+  limit = 10000
+): any {
+  return {
+    type: 'builder_query',


🟡 Minor Hardcoded string instead of QUERY_TYPES.BUILDER_QUERY constant

Issue: This file uses the hardcoded string 'builder_query' while signoz-stats.ts consistently uses QUERY_TYPES.BUILDER_QUERY. The same applies to 'desc' (line 170), 60 (line 172), and false (line 173) which have corresponding constants.

Why: Using magic strings instead of shared constants creates maintenance burden and risks typos. The peer file signoz-stats.ts consistently uses QUERY_TYPES.BUILDER_QUERY, ORDER_DIRECTIONS.DESC, QUERY_DEFAULTS.STEP_INTERVAL, and QUERY_DEFAULTS.DISABLED.

Fix: Import and use the constants:

import { QUERY_TYPES, ORDER_DIRECTIONS, QUERY_DEFAULTS } from '@/constants/signoz'; // ... return { type: QUERY_TYPES.BUILDER_QUERY, spec: { // ... order: [{ key: { name: SPAN_KEYS.TIMESTAMP }, direction: ORDER_DIRECTIONS.DESC }], stepInterval: QUERY_DEFAULTS.STEP_INTERVAL, disabled: QUERY_DEFAULTS.DISABLED, }, };

Refs:

signoz-stats.ts:1060 — example of consistent constant usage

itoqa · 2026-03-19T16:38:34Z

Ito Test Report ✅

6 test cases ran. 6 passed.

✅ The included trace workflow checks passed: filter mode switching behaved as expected, timestamp badge suppression was validated, date clamping worked, invalid conversation handling remained controlled, and rapid interaction flows converged to stable UI states.

✅ Passed (6)

Test Case	Summary	Timestamp
ROUTE-2	SigNoz configuration warning banner was shown clearly and the traces page remained usable, matching unconfigured-health behavior.	0:00
LOGIC-2	Confirmed request URL switched from mode=batch to non-batch after entering search text; API corroboration returned controlled 400 response without crash.	11:03
LOGIC-3	Using mocked batch API responses due local traces backend 503s, the overview rendered a relative-time badge for a valid startTime row and correctly suppressed the badge for the row with startTime=0.	18:30
EDGE-1	Verified custom calendar prevented future end-date selection and applying the range did not trigger hard validation failure.	11:03
EDGE-4	Invalid conversation route displayed a controlled error state and Back to Overview recovery returned to traces overview successfully.	2:35
ADV-3	After aggressive local API mocking fallback, rapid pagination plus repeated Errors-only and agent-filter toggles converged to a stable final overview state without duplicate rows or persistent spinner.	21:23

📋 View Recording

Screen Recording

claude

PR Review Summary

(4) Total Issues | Risk: Medium

Re-review scope: This review covers the delta since the last automated review (3 commits, 4 files). Prior findings are referenced in Pending Recommendations.

✅ Issues Addressed in This Delta

Two critical issues from the prior review have been fixed:

SQL escaping fixed — quoteValue() now uses standard SQL '' escaping instead of backslash \'
Non-builder query bypass fixed — enforceSecurityFilters() now throws on non-builder query types, preventing tenant isolation bypass via raw SQL

🟠⚠️ Major (1) 🟠⚠️

Inline Comments:

🟠 Major: signoz.ts:72-80 queryReferencesKeys doesn't check orderBy fields

🟡 Minor (2) 🟡

Inline Comments:

🟡 Minor: signoz.ts:108 Silent empty response when all queries stripped
🟡 Minor: signoz.ts:291 Magic number for 180-day lookback period

🕐 Pending Recommendations (3)

These issues were raised in the prior review and remain unaddressed:

🔴 signozHelpers.ts:8 SQL injection via unescaped tenantId/projectId interpolation in buildSecurityExpression
🔴 signoz.ts:262 SQL injection via unescaped conversation IDs from query results
🟠 signozHelpers.ts:28-30 Null pointer exception when spec.filter is undefined

🚫 REQUEST CHANGES

Summary: Good progress on this delta — the SQL escaping and non-builder query bypass issues are now fixed. However, the critical SQL injection vulnerabilities flagged in the prior review (unescaped tenantId/projectId and conversationIds interpolation) remain unaddressed. These should be fixed before merging to ensure tenant data isolation.

Quick wins:

For buildSecurityExpression: escape tenantId and projectId with the same quoteValue() helper that was just fixed, or use parameterized queries
For conversation IDs at line 262: conversationIds.map((id) => quoteValue(id)).join(', ') (import quoteValue from signoz-queries.ts)
For spec.filter null check: if (!spec?.filter?.expression) throw new Error('Query must have filter.expression');

Discarded (4)

Location	Issue	Reason Discarded
`signoz.ts:85-117`	Retry logic lacks backoff delay	Since the retry modifies the payload (strips queries), immediate retry is reasonable — it's not retrying the same request
`signoz.ts:275-310`	Span-lookup doesn't use queryWithRetry	ClickHouse SQL queries have different semantics; the retry logic is designed for builder queries with multiple sub-queries
`signoz.ts:103-106`	`retried` flag not used at call sites	The logging at line 103-106 already provides the key information; adding more logging is marginal value
`signoz.ts:111-115`	Retry failure logging missing	The outer `handleSignozError` provides adequate error context

Reviewers (4)

Reviewer	Returned	Inline Comments	Discarded
`pr-review-security-iam`	0	0	0
`pr-review-sre`	6	2	2
`pr-review-errors`	3	1	2
`pr-review-standards`	3	1	0
Total	12	3	4

Note: Security reviewer returned 0 findings for delta — prior critical issues are pre-existing and tracked in Pending Recommendations.

claude · 2026-03-19T22:29:39Z

agents-api/src/domains/manage/routes/signoz.ts

+function queryReferencesKeys(query: any, keys: string[]): boolean {
+  const spec = query?.spec;
+  const searchable = [
+    spec?.filter?.expression ?? '',
+    ...(spec?.selectFields ?? []).map((f: any) => f.name),
+    ...(spec?.groupBy ?? []).map((g: any) => g.name),
+  ];
+  return keys.some((k) => searchable.some((t: string) => t.includes(k)));
+}


🟠 MAJOR queryReferencesKeys doesn't check orderBy fields

Issue: The function only checks filter.expression, selectFields, and groupBy for missing key references. If a missing attribute is referenced only in orderBy, the query won't be filtered out and the retry will fail with the same error.

Why: SigNoz builder queries can reference attributes in order clauses. If the missing key error originates from an orderBy reference, the retry logic will fail to identify and remove the problematic query, causing an unnecessary second failure.

Fix: Add orderBy to the searchable fields:

const searchable = [ spec?.filter?.expression ?? '', ...(spec?.selectFields ?? []).map((f: any) => f.name), ...(spec?.groupBy ?? []).map((g: any) => g.name), ...(spec?.order ?? []).map((o: any) => o.key?.name ?? o.columnName ?? ''), ];

Refs:

SigNoz v5 builder query spec

claude · 2026-03-19T22:29:45Z

agents-api/src/domains/manage/routes/signoz.ts

+      'Retrying SigNoz query without queries referencing missing keys'
+    );
+
+    if (kept.length === 0) return { data: EMPTY_RESPONSE, retried: true };


🟡 Minor Silent empty response when all queries stripped

Issue: When all queries reference missing keys, the function returns EMPTY_RESPONSE silently. The retried: true flag is returned but never surfaced to users. Clients receive empty results with no indication their query attributes aren't available.

Why: Users may assume "no data exists" rather than "query attributes not ingested yet." This makes debugging observability issues difficult during SigNoz schema migrations.

Fix: Log at a higher visibility level when returning empty due to stripped queries:

if (kept.length === 0) { logger.warn( { missingKeys: missing, originalQueryCount: queries.length }, 'All queries stripped due to missing attributes - returning empty response' ); return { data: EMPTY_RESPONSE, retried: true }; }

Refs:

Line 103-106 already logs successful partial strips; this extends to full strip case

claude · 2026-03-19T22:29:50Z

agents-api/src/domains/manage/routes/signoz.ts

+    return c.json({ error: 'Service Unavailable', message: 'SigNoz is not configured' }, 500);
+
+  const now = Date.now();
+  const lookbackMs = 180 * 24 * 60 * 60 * 1000;


🟡 Minor Magic number for lookback period

Issue: The 180-day lookback period is hardcoded without explanation. This wide time range may affect query performance on large ClickHouse tables.

Why: Without documentation, operators can't tune this value during performance issues. The 180-day window seems generous for a span lookup.

Fix: Extract to a named constant:

Suggested change

const lookbackMs = 180 * 24 * 60 * 60 * 1000;

const SPAN_LOOKUP_LOOKBACK_MS = 180 * 24 * 60 * 60 * 1000; // 180 days: covers full trace retention period

claude

PR Review Summary

(0) New Issues | (3) Pending from Prior Review | Risk: High

Re-review scope: This review covers the delta since the last automated review (1 commit, 3 files, +11/-3 lines).

✅ Delta Assessment

The latest commit 74b0622d6 ("comments and fixes") contains no new issues. All changes are improvements:

signoz-queries.ts — Added missing operators to OPERATORS constant (NOT_IN, REGEX, NOT_REGEX, LESS_THAN, GREATER_THAN, CONTAINS, NOT_CONTAINS). This restores parity with the internal OP_MAP and fixes a breaking change flagged in the prior review.
conversations/[conversationId]/route.ts:1285 — Fixed null-safety in cache lookup:
```
- if (agentGenCache.has(activityId)) return agentGenCache.get(activityId)!;
+ if (agentGenCache.has(activityId)) return agentGenCache.get(activityId) ?? null;
```
The non-null assertion (!) was unsafe since the cache stores null values. Now correctly returns null when the cached value is null.
spans/[spanId]/route.ts:51 — Simplified response path:
```
- const results = json?.data?.data?.results ?? json?.data?.results;
+ const results = json?.data?.results ?? [];
```
Removes uncertain dual-path fallback in favor of the correct v5 response shape.

🕐 Pending Recommendations (3)

These critical and major issues were raised in the prior review and remain unaddressed:

🔴 signozHelpers.ts:8 — SQL injection via unescaped tenantId/projectId interpolation in buildSecurityExpression. Values are interpolated directly into filter expressions without escaping single quotes.
🔴 signoz.ts:262 — SQL injection via unescaped conversation IDs. The conversationIds array from step-1 results is interpolated directly: conversationIds.map((id) => \'${id}'`).join(', ')` without escaping.
🟠 signozHelpers.ts:29 — Null pointer exception when spec.filter is undefined. Line 29 accesses spec.filter.expression without null check.

Quick fixes for the SQL injection issues:

// signozHelpers.ts:8 — use quoteValue from signoz-queries.ts
import { quoteValue } from '@inkeep/agents-core';
let expr = `${SERVICE_NAME_FILTER} AND ${SPAN_KEYS.TENANT_ID} = ${quoteValue(tenantId)}`;
if (projectId) expr += ` AND ${SPAN_KEYS.PROJECT_ID} = ${quoteValue(projectId)}`;

// signoz.ts:262 — escape conversation IDs
const convIdExpr = `${SPAN_KEYS.CONVERSATION_ID} IN (${conversationIds.map((id) => quoteValue(id)).join(', ')})`;

// signozHelpers.ts:29 — add null check
const existing = spec.filter?.expression ?? 'true';
spec.filter = { expression: `(${existing}) AND ${securityExpr}` };

🚫 REQUEST CHANGES

Summary: The delta itself is clean — all 3 changes are improvements with no new issues introduced. However, the critical SQL injection vulnerabilities flagged in the prior review remain unaddressed in the unchanged files. These must be fixed before merge to ensure tenant data isolation. The fix is straightforward: import and use the quoteValue() helper (which now correctly uses standard SQL '' escaping) for all user-controlled string interpolation.

Discarded (0)

No findings discarded in this delta review.

Reviewers (1)

Reviewer	Returned	Main Findings	Consider	While You're Here	Inline Comments	Pending Recs	Discarded
`orchestrator`	3	0	0	0	0	3	0
Total	3	0	0	0	0	3	0

Note: Delta-scoped review — sub-reviewers not dispatched for minimal changes. Prior critical issues tracked in Pending Recommendations.

itoqa · 2026-03-20T00:03:19Z

Ito Test Report ❌

20 test cases ran. 1 failed, 19 passed.

Across 8 report shards covering 20 test cases, 19 passed and the traces platform was largely stable: core overview/detail routes, pagination/search fallback, span lookup side pane, analytics pages (AI/Tool Calls), auth and tenant/project scoping protections, rapid-interaction stress behavior, mobile usability, and multiple API validation guardrails all behaved as expected without crashes or data leakage. The single significant finding was a Medium-severity defect in GET /api/traces/conversations/[conversationId], where a whitespace-only conversationId (%20) is not trimmed/rejected and is incorrectly forwarded into downstream time-range/query logic (likely introduced by the PR), weakening endpoint contract reliability and adding avoidable operational noise.

❌ Failed (1)

Category	Summary	Screenshot
Edge	🟠 Conversation details route accepts a whitespace-only `conversationId` path value and continues processing instead of rejecting it.

🟠 Conversation details endpoint accepts whitespace-only conversation IDs

What failed: The route only rejects falsy conversationId values, so a whitespace path segment (%20) is treated as valid and forwarded into time-range lookup and trace query construction instead of returning a parameter-validation error.
Impact: Invalid path input can trigger unnecessary downstream requests and inconsistent error behavior instead of a clean client-facing validation response. This weakens endpoint contract reliability for callers and increases operational noise.
Introduced by this PR: Yes – this PR modified the relevant code
Steps to reproduce:
1. Call GET /api/traces/conversations/%20?tenantId=default&projectId= with valid authentication.
2. Observe that the request is processed through time-range lookup and downstream query logic instead of being rejected immediately.
3. Confirm the route does not return a direct invalid-parameter response for the whitespace-only path ID.
Code analysis: I reviewed the conversation details route and its time-range helper. The handler checks only if (!conversationId), then immediately uses the raw value in both bounds lookup and query filter expression construction; no trim/sanitization step exists before downstream execution.
Why this is likely a bug: The production route accepts syntactically invalid whitespace-only IDs due to missing normalization/validation, and code paths clearly show the invalid value is consumed as if it were a real identifier.

Relevant code:

agents-manage-ui/src/app/api/traces/conversations/[conversationId]/route.ts (lines 461-464)

const { conversationId } = await context.params;
if (!conversationId) {
  return NextResponse.json({ error: 'Conversation ID is required' }, { status: 400 });
}

agents-manage-ui/src/app/api/traces/conversations/[conversationId]/route.ts (lines 151-155)

function buildBaseExpression(conversationId: string, projectId?: string): string {
  const parts = [`${SPAN_KEYS.CONVERSATION_ID} = '${conversationId}'`];
  if (projectId) parts.push(`${SPAN_KEYS.PROJECT_ID} = '${projectId}'`);
  return parts.join(' AND ');
}

agents-manage-ui/src/lib/api/signoz-conversation-time-range.ts (lines 32-35)

const bounds = await makeManagementApiRequest<{
  data: { createdAt: string; updatedAt: string };
}>(`tenants/${tenantId}/projects/${projectId}/conversations/${conversationId}/bounds`);

✅ Passed (19)

Category	Summary	Screenshot
Adversarial	Generic traces proxy rejected non-builder query type (`clickhouse_sql`) with explicit HTTP 400 error.
Adversarial	Injection-like span filter payload did not crash the UI or expose out-of-scope data in the validated scenario.
Adversarial	Unauthenticated traces access is blocked; invalid POST payload is rejected by schema validation and valid payload is denied with auth error.
Adversarial	Tenant/project tampering attempts were denied or returned safe scoped errors with no foreign trace payload exposure.
Adversarial	Rapid toggles, search, and time-range interactions converged to a stable final traces state with no trace-request or trace-console errors after environment remediation.
Adversarial	Mobile traces overview and conversation detail remained usable at 390x844 under this run's dev fallback setup.
Edge	Inverted and future time-range requests were rejected with explicit HTTP 400 validation errors.
Edge	Batch mode rejected malformed payloads missing required sections with deterministic HTTP 400 responses.
Edge	Span details endpoint correctly required `conversationId` and returned HTTP 400 when omitted.
Edge	Conversation row with zero startTime rendered safely and omitted relative-time chip without exceptions.
Edge	Deep-link refresh and back-forward navigation returned to a stable Conversation Details state without error.
Happy-path	GET /api/traces?tenantId=default returned HTTP 200 with structured status/configured fields and no stack trace exposure.
Happy-path	Traces overview rendered aggregate cards and conversation list container for the project without runtime crash.
Happy-path	Pagination advanced and returned correctly with stable conversation IDs and no duplication in the validated flow.
Happy-path	Search flow used fallback behavior and clearing search restored the expected list state.
Happy-path	Deep-link conversation details view rendered summary cards and timeline for the target conversation route.
Happy-path	Selecting a timeline activity loaded span details and displayed structured side-pane content.
Happy-path	AI Calls page remained functional with deterministic traces responses, including agent filter interaction and rendered cards/charts.
Happy-path	Tool Calls page rendered correctly with deterministic traces responses, including date-range change and stable breakdown cards.

Commit: 74b0622

View Full Run

Tell us how we did: Give Ito Feedback

shagun-singh-inkeep added 5 commits March 18, 2026 15:40

v5

e126d60

additional

936fbef

updates

9a78087

clean up

aad2362

changeset

13a4b0d

vercel bot deployed to Preview – agents-docs March 19, 2026 15:09 View deployment

style: auto-format with biome

a53b25d

vercel bot temporarily deployed to Preview – agents-docs March 19, 2026 15:10 Inactive

vercel bot deployed to Preview – agents-api March 19, 2026 15:12 View deployment

vercel bot deployed to Preview – agents-manage-ui March 19, 2026 15:13 View deployment

pullfrog bot reviewed Mar 19, 2026

View reviewed changes

claude bot reviewed Mar 19, 2026

View reviewed changes

github-actions bot deleted a comment from claude bot Mar 19, 2026

inkeep deleted a comment from claude bot Mar 19, 2026

shagun-singh-inkeep added 3 commits March 19, 2026 18:25

retry

720bcea

more updates

8dd0d5f

more changes

47b28f3

vercel bot deployed to Preview – agents-manage-ui March 19, 2026 22:27 View deployment

vercel bot deployed to Preview – agents-api March 19, 2026 22:27 View deployment

vercel bot deployed to Preview – agents-docs March 19, 2026 22:27 View deployment

claude bot reviewed Mar 19, 2026

View reviewed changes

github-actions bot deleted a comment from claude bot Mar 19, 2026

comments and fixes

74b0622

vercel bot deployed to Preview – agents-manage-ui March 19, 2026 22:41 View deployment

vercel bot deployed to Preview – agents-docs March 19, 2026 22:41 View deployment

claude bot reviewed Mar 19, 2026

View reviewed changes

github-actions bot deleted a comment from claude bot Mar 19, 2026

vercel bot deployed to Preview – agents-api March 19, 2026 22:43 View deployment

	if (typeof value === 'string') return `'${value.replace(/'/g, "\\'")}'`;
	if (typeof value === 'string') return `'${value.replace(/'/g, "''")}'`;

	const lookbackMs = 180 * 24 * 60 * 60 * 1000;
	const SPAN_LOOKUP_LOOKBACK_MS = 180 * 24 * 60 * 60 * 1000; // 180 days: covers full trace retention period

Conversation

shagun-singh-inkeep commented Mar 19, 2026

Uh oh!

vercel bot commented Mar 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

changeset-bot bot commented Mar 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

🦋 Changeset detected

Uh oh!

pullfrog bot commented Mar 19, 2026

Key changes

signoz-queries.ts — v4-to-v5 constant vocabulary

signoz.ts — Proxy route refactor and v5 endpoint

conversations/[conversationId]/route.ts — Parallel batched queries and v5 payloads

spans/[spanId]/route.ts — v5 ClickHouse SQL response format

Uh oh!

pullfrog bot commented Mar 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Key changes

SigNoz v4 → v5 query format migration

queryWithRetry() — resilient handling of missing SigNoz attributes

Parallel query batching and query merging for conversation detail

/span-lookup endpoint and columnar response parsing

Proxy code refactoring and performance optimizations

Uh oh!

pullfrog bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

pullfrog bot Mar 19, 2026

Choose a reason for hiding this comment

Uh oh!

This comment was marked as outdated.

Uh oh!

Uh oh!

pullfrog bot Mar 19, 2026

Choose a reason for hiding this comment

Uh oh!

pullfrog bot Mar 19, 2026

Choose a reason for hiding this comment

Uh oh!

pullfrog bot Mar 19, 2026

Choose a reason for hiding this comment

Uh oh!

claude bot left a comment

Choose a reason for hiding this comment

PR Review Summary

🔴❗ Critical (2) ❗🔴

🟠⚠️ Major (3) 🟠⚠️

🟡 Minor (3) 🟡

💭 Consider (1) 💭

🧹 While You're Here (1) 🧹

🚫 REQUEST CHANGES

Uh oh!

claude bot Mar 19, 2026

Choose a reason for hiding this comment

Uh oh!

claude bot Mar 19, 2026

Choose a reason for hiding this comment

Uh oh!

itoqa bot commented Mar 19, 2026

Ito Test Report ✅

Uh oh!

claude bot left a comment

Choose a reason for hiding this comment

PR Review Summary

✅ Issues Addressed in This Delta

🟠⚠️ Major (1) 🟠⚠️

🟡 Minor (2) 🟡

🕐 Pending Recommendations (3)

🚫 REQUEST CHANGES

Uh oh!

claude bot Mar 19, 2026

Choose a reason for hiding this comment

Uh oh!

claude bot Mar 19, 2026

Choose a reason for hiding this comment

Uh oh!

claude bot Mar 19, 2026

vercel bot commented Mar 19, 2026 •

edited

Loading

changeset-bot bot commented Mar 19, 2026 •

edited

Loading

`signoz-queries.ts` — v4-to-v5 constant vocabulary

`signoz.ts` — Proxy route refactor and v5 endpoint

`conversations/[conversationId]/route.ts` — Parallel batched queries and v5 payloads

`spans/[spanId]/route.ts` — v5 ClickHouse SQL response format

pullfrog bot commented Mar 19, 2026 •

edited

Loading

`queryWithRetry()` — resilient handling of missing SigNoz attributes

`/span-lookup` endpoint and columnar response parsing