-
Notifications
You must be signed in to change notification settings - Fork 343
Description
HyperDX GitHub Issue: Service Map Span Kind Expression not working in WHERE clause
Issue Summary
Title: Service Map: Span Kind Expression not applied in WHERE clause for span kind filtering
Labels: service-map, beta, expression-support
Preliminary Note
This issue may be resolved once ClickStack Cloud endpoint becomes available. ClickStack is designed to handle OpenTelemetry data ingestion with proper schema transformations. However, until ClickStack Cloud endpoint is generally available, users relying on direct OTEL Collector → ClickHouse integration encounter this issue.
Description
The Service Map feature (beta, released November 2025) does not properly apply the Span Kind Expression when filtering spans in the WHERE clause. This causes Service Maps to fail with "No services found" when using OpenTelemetry standard span kind values (SPAN_KIND_CLIENT, SPAN_KIND_SERVER).
Environment
- HyperDX Version: ClickHouse Cloud hosted (https://hyperdx.clickhouse.cloud/)
- ClickHouse Version: ClickHouse Cloud
- Data Source: OpenTelemetry Collector (otel/opentelemetry-collector-contrib:0.93.0) with ClickHouse exporter
Steps to Reproduce
-
Ingest OpenTelemetry traces using OTEL Collector with ClickHouse exporter
- SpanKind values stored in database:
SPAN_KIND_CLIENT,SPAN_KIND_SERVER,SPAN_KIND_INTERNAL(OpenTelemetry standard format)
- SpanKind values stored in database:
-
Create a Trace Source in HyperDX with the following configuration:
- Span Kind Expression:
replaceAll(SpanKind, 'SPAN_KIND_', '') - Other standard expressions (TraceId, SpanId, ServiceName, etc.)
- Span Kind Expression:
-
Navigate to Service Map page
-
Select the configured Trace Source
-
Set time range to include data
Expected Behavior
Service Map should display services and their connections because:
-
The
Span Kind Expressionshould transform SpanKind values to the format HyperDX expects:SPAN_KIND_CLIENT→Client(title case)SPAN_KIND_SERVER→Server(title case)SPAN_KIND_INTERNAL→Internal(title case)
-
Service Map should then use these transformed values to find Client-Server relationships
Note: Testing revealed that HyperDX expects title case (Client, Server) not uppercase (CLIENT, SERVER). The expression replaceAll(SpanKind, 'SPAN_KIND_', '') produces uppercase, which doesn't match HyperDX's expectations even if expressions were properly applied.
Actual Behavior
Service Map displays:
No services found. The Service Map shows links between services with related Client- and Server-kind spans.
Evidence
1. Data exists with CLIENT-SERVER relationships
Direct ClickHouse query using the expression transformation works correctly:
WITH service_calls AS (
SELECT
client.ServiceName as source_service,
server.ServiceName as target_service,
COUNT(*) as call_count
FROM otel_traces client
INNER JOIN otel_traces server
ON client.TraceId = server.TraceId
AND client.SpanId = server.ParentSpanId
WHERE replaceAll(client.SpanKind, 'SPAN_KIND_', '') = 'CLIENT'
AND replaceAll(server.SpanKind, 'SPAN_KIND_', '') = 'SERVER'
GROUP BY source_service, target_service
)
SELECT * FROM service_calls
ORDER BY call_count DESC;Result: Multiple service-to-service relationships found, proving the data exists.
2. Browser DevTools error
When inspecting network requests in Service Map, HyperDX returns errors suggesting expressions are not being applied in WHERE clauses.
Root Cause Analysis
Verified Issues
-
Primary Issue: Span Kind Expression not applied in WHERE clauses
- Configured expressions work in direct ClickHouse queries
- Same expressions fail in HyperDX Service Map
- Evidence suggests WHERE clauses use literal values instead of applying configured expressions
-
Secondary Issue: Case sensitivity requirement
- Testing revealed HyperDX expects title case (
Client,Server,Internal) - OpenTelemetry standard produces:
SPAN_KIND_CLIENT,SPAN_KIND_SERVER - Simple
replaceAll(SpanKind, 'SPAN_KIND_', '')produces uppercase:CLIENT,SERVER - This mismatch means users need CASE expressions, not simple replaceAll
- Testing revealed HyperDX expects title case (
Expected SQL Pattern
-- What should work but doesn't:
WHERE <configured_span_kind_expression> = 'Client'
AND <configured_span_kind_expression> = 'Server'
-- Example with proper expression and case handling:
WHERE CASE
WHEN SpanKind = 'SPAN_KIND_CLIENT' THEN 'Client'
WHEN SpanKind = 'SPAN_KIND_SERVER' THEN 'Server'
ELSE SpanKind
END = 'Client'Proposed Solution
Apply the configured Span Kind Expression in WHERE clauses when filtering for CLIENT and SERVER spans in Service Map queries.
Why Expression Support Matters
Proper Expression support would eliminate the need for workarounds:
- Storage: Eliminates ~2x storage overhead from duplicate tables
- Simplicity: No manual materialized view creation or maintenance
- Consistency: Expression fields work as documented in the UI
- Compatibility: Native support for OpenTelemetry standard format (
SPAN_KIND_*)
Suggested Implementation
-
Apply expressions consistently in all SQL clauses:
- ✅ SELECT clauses (currently working)
⚠️ WHERE clauses (not working)- JOIN, GROUP BY clauses (if applicable)
-
Consider case-insensitive SpanKind comparison or document the title case requirement
Workaround
✅ Verified Working Solution: Materialized View with Proper Case Transformation
Status: Successfully tested and confirmed working with HyperDX Service Map.
Important Discovery: HyperDX Service Map expects SpanKind values with title case (Client, Server, Internal), not uppercase (CLIENT, SERVER, INTERNAL).
Implementation
-- 1. Create target table
CREATE TABLE o11y.otel_traces_conv
(
Timestamp DateTime64(9),
TraceId String,
SpanId String,
ParentSpanId String,
TraceState String,
SpanName LowCardinality(String),
SpanKind LowCardinality(String),
ServiceName LowCardinality(String),
ResourceAttributes Map(LowCardinality(String), String),
ScopeName String,
ScopeVersion String,
SpanAttributes Map(LowCardinality(String), String),
Duration UInt64,
StatusCode LowCardinality(String),
StatusMessage String,
`Events.Timestamp` Array(DateTime64(9)),
`Events.Name` Array(LowCardinality(String)),
`Events.Attributes` Array(Map(LowCardinality(String), String)),
`Links.TraceId` Array(String),
`Links.SpanId` Array(String),
`Links.TraceState` Array(String),
`Links.Attributes` Array(Map(LowCardinality(String), String))
)
ENGINE = MergeTree()
ORDER BY (ServiceName, Timestamp);
-- 2. Create materialized view with proper case transformation
CREATE MATERIALIZED VIEW o11y.otel_traces_conv_mv
TO o11y.otel_traces_conv
AS SELECT
Timestamp,
TraceId,
SpanId,
ParentSpanId,
TraceState,
SpanName,
CASE
WHEN SpanKind = 'SPAN_KIND_CLIENT' THEN 'Client'
WHEN SpanKind = 'SPAN_KIND_SERVER' THEN 'Server'
WHEN SpanKind = 'SPAN_KIND_INTERNAL' THEN 'Internal'
WHEN SpanKind = 'SPAN_KIND_PRODUCER' THEN 'Producer'
WHEN SpanKind = 'SPAN_KIND_CONSUMER' THEN 'Consumer'
ELSE SpanKind
END AS SpanKind,
ServiceName,
ResourceAttributes,
ScopeName,
ScopeVersion,
SpanAttributes,
Duration,
StatusCode,
StatusMessage,
`Events.Timestamp`,
`Events.Name`,
`Events.Attributes`,
`Links.TraceId`,
`Links.SpanId`,
`Links.TraceState`,
`Links.Attributes`
FROM o11y.otel_traces;
-- 3. Populate with existing data
INSERT INTO o11y.otel_traces_conv
SELECT * FROM o11y.otel_traces_conv_mv;HyperDX Configuration for Workaround
Table: otel_traces_conv (the converted table)
Span Kind Expression: SpanKind (no transformation needed!)
Test Results
✅ Service Map Successfully Displays: Multiple microservice connections with call counts and latency metrics
Key Learnings
- Case Sensitivity: HyperDX requires title case (
Client/Server/Internal) - Expression Not Applied: Span Kind Expression is ignored in WHERE clauses
- Workaround Effective: Pre-transforming data via materialized view bypasses the expression issue
- Storage Cost: Requires ~2x storage (original + transformed table)
Alternative Workarounds
Use ClickHouse queries directly:
- Bypass HyperDX UI and query ClickHouse directly for Service Map data
- Apply transformations in SQL queries manually
Additional Context
- Service Map is a beta feature (released November 2025)
- This issue affects any OpenTelemetry Collector setup using the ClickHouse exporter
- The ClickHouse exporter always uses OpenTelemetry standard format (
SPAN_KIND_*) - Other HyperDX features (App Trace, Search) work correctly with the same data
Code Investigation
Bug Location Confirmed
File: packages/app/src/hooks/useServiceMap.tsx
Function: getServiceMapQuery
The bug exists in the Service Map query construction code:
async function getServiceMapQuery({
source,
dateRange,
traceId,
metadata,
samplingFactor,
}: {
source: TSource;
dateRange: [Date, Date];
traceId?: string;
metadata: Metadata;
samplingFactor: number;
}) {
// ... other code ...
const [serverCTE, clientCTE] = await Promise.all([
renderChartConfig(
{
...baseCTEConfig,
filters: [
...baseCTEConfig.filters,
{
type: 'sql',
condition: `${source.spanKindExpression} IN ('Server', 'Consumer')`, // ⚠️ Expression not applied
},
],
where: '',
},
metadata,
),
renderChartConfig(
{
...baseCTEConfig,
filters: [
...baseCTEConfig.filters,
{
type: 'sql',
condition: `${source.spanKindExpression} IN ('Client', 'Producer')`, // ⚠️ Expression not applied
},
],
where: '',
},
metadata,
),
]);The Problem
When source.spanKindExpression contains a ClickHouse expression like replaceAll(SpanKind, 'SPAN_KIND_', ''), the code correctly inserts it into the SQL string template:
condition: `${source.spanKindExpression} IN ('Server', 'Consumer')`However, somewhere in the query rendering pipeline (likely in renderChartConfig), the expression is not being properly applied or is being replaced with just the column name.
Expected Behavior
The generated SQL should be:
WHERE replaceAll(SpanKind, 'SPAN_KIND_', '') IN ('Server', 'Consumer')Actual Behavior
Testing suggests the expression transformation is not applied in WHERE clauses, causing queries to match no rows when SpanKind values are in OpenTelemetry format (SPAN_KIND_SERVER, SPAN_KIND_CONSUMER).
Analysis
The renderChartConfig function (or its downstream processing) may not be preserving the full expression when building WHERE clauses.
Sample Data
OpenTelemetry Standard SpanKind Format
When using OTEL Collector with ClickHouse exporter, SpanKind values follow the OpenTelemetry specification:
SPAN_KIND_CLIENT
SPAN_KIND_SERVER
SPAN_KIND_INTERNAL
SPAN_KIND_PRODUCER
SPAN_KIND_CONSUMER
HyperDX Expected Format
Testing revealed Service Map expects title case:
Client
Server
Internal
Producer
Consumer
Impact
Current Impact
This issue prevents users from using Service Map with standard OpenTelemetry data collection pipelines. Since the OTEL Collector's ClickHouse exporter is the recommended way to send traces to ClickHouse, this affects most users attempting to use Service Map with OTEL data.
Workaround Status
✅ Workaround Available: Materialized view solution successfully tested and documented.
However, the workaround:
- Requires ~2x storage (original + transformed tables)
- Adds operational complexity (manual MV creation and maintenance)
- Needs ClickHouse SQL expertise
- Doesn't scale well for large deployments
Impact Summary
- Workaround requires ~2x storage and manual maintenance
- Affects standard OpenTelemetry Collector setups
- Expression fields in UI don't work as expected
Priority
Medium-High - Service Map requires workaround for standard OpenTelemetry setups. Expression support would improve user experience and eliminate storage overhead.