This document outlines the development approach for the otel-query-server project, designed for iterative development with frequent checkpointing and open-source collaboration. Updated to incorporate the emerging MCP OpenTelemetry Trace Support proposal.
Our otel-query-server sits at the intersection of two complementary initiatives:
- Core Mission: Provide MCP tools for querying observability data from various backends
- MCP OTel Integration: Support the proposed
notifications/otel/tracecapability for enhanced observability
The MCP OTel Tracing Proposal introduces mechanisms for MCP servers to emit trace data back to clients. Our server can both consume this capability (by implementing the OTel notification support) and provide it (by emitting traces of its own query operations).
otel-query-server/
├── src/
│ ├── otel_query_server/
│ │ ├── __init__.py
│ │ ├── server.py # FastMCP server with OTel capability
│ │ ├── config.py # Configuration + OTel settings
│ │ ├── models.py # Pydantic models + OTel trace models
│ │ ├── otel/ # NEW: OpenTelemetry integration
│ │ │ ├── __init__.py
│ │ │ ├── tracer.py # OTel tracer setup
│ │ │ ├── notifications.py # OTel notification handling
│ │ │ └── correlation.py # Trace correlation utilities
│ │ ├── tools/ # MCP tool implementations
│ │ │ ├── __init__.py
│ │ │ ├── search_traces.py # Enhanced with OTel emission
│ │ │ ├── search_logs.py
│ │ │ ├── query_metrics.py
│ │ │ ├── get_service_health.py
│ │ │ └── correlate_trace_logs.py
│ │ └── drivers/ # Backend integrations
│ │ ├── __init__.py
│ │ ├── base.py # Base driver with OTel instrumentation
│ │ ├── otel_collector.py
│ │ ├── grafana.py
│ │ ├── elastic_cloud.py
│ │ └── opensearch.py
├── tests/
│ ├── unit/
│ ├── integration/
│ ├── otel/ # NEW: OTel-specific tests
│ └── fixtures/
├── examples/
│ ├── config.yaml # Updated with OTel config
│ ├── otel-config.yaml # NEW: OTel-specific config
│ └── docker-compose.yaml
├── docs/
│ ├── api.md
│ ├── otel-integration.md # NEW: OTel integration docs
│ └── troubleshooting.md
├── Dockerfile
├── pyproject.toml # Updated with OTel dependencies
└── README.md
Goal: Establish basic project structure and prepare for OTel integration
mkdir -p otel-query-server/{src/otel_query_server/{tools,drivers,otel},tests/{unit,integration,otel,fixtures},examples,docs}
cd otel-query-server
git init
git add .
git commit -m "feat: initial project structure with OTel integration support"git add pyproject.toml
git commit -m "feat: add dependencies including opentelemetry-api, opentelemetry-sdk"Key additions to pyproject.toml:
[project]
dependencies = [
"fastmcp>=1.0.0",
"opentelemetry-api>=1.20.0",
"opentelemetry-sdk>=1.20.0",
"opentelemetry-exporter-otlp>=1.20.0",
"opentelemetry-instrumentation>=0.41b0",
# ... existing dependencies
]git add src/otel_query_server/models.py
git commit -m "feat: add Pydantic models with OTel ResourceSpans support"git add src/otel_query_server/config.py examples/config.yaml
git commit -m "feat: implement configuration with OTel settings and traceToken support"git add src/otel_query_server/otel/tracer.py
git commit -m "feat: implement OpenTelemetry tracer setup and configuration"git add src/otel_query_server/drivers/base.py
git commit -m "feat: create base driver with automatic OTel span creation"git add src/otel_query_server/server.py
git commit -m "feat: implement FastMCP server with otel.traces capability"Goal: Complete OpenTelemetry integration before tool implementation
git add src/otel_query_server/otel/notifications.py
git commit -m "feat: implement notifications/otel/trace emission"git add src/otel_query_server/otel/correlation.py
git commit -m "feat: add traceToken correlation and span stitching"git add examples/otel-config.yaml
git commit -m "feat: add OTel-specific configuration examples"git add tests/otel/test_tracer.py tests/otel/test_notifications.py
git commit -m "test: add comprehensive OTel integration tests"git add src/otel_query_server/server.py
git commit -m "feat: implement MCP server capability declaration for otel.traces"git add src/otel_query_server/otel/middleware.py
git commit -m "feat: add OTel middleware for automatic span creation per MCP request"git add docs/otel-integration.md
git commit -m "docs: add comprehensive OTel integration documentation"git add tests/integration/test_otel_integration.py
git commit -m "test: add integration tests for OTel notification emission"Goal: Implement one complete tool with full OTel integration
git add src/otel_query_server/drivers/otel_collector.py
git commit -m "feat: implement OTEL collector driver with automatic span emission"git add src/otel_query_server/tools/search_traces.py
git commit -m "feat: implement search_traces with traceToken handling and span emission"git add src/otel_query_server/cache.py
git commit -m "feat: add LRU cache with OTel span tracking"git add tests/unit/test_search_traces.py
git commit -m "test: add unit tests for search_traces with OTel validation"git add tests/integration/test_otel_e2e.py
git commit -m "test: add end-to-end OTel trace emission and correlation tests"Goal: Implement all backend drivers with consistent OTel integration
git commit -m "feat: implement Grafana driver with OTel span emission"
git commit -m "feat: extend search_traces to support Grafana with OTel"
git commit -m "test: add OTel-aware tests for Grafana driver"git commit -m "feat: implement Elastic Cloud driver with OTel instrumentation"
git commit -m "feat: extend search_traces to support Elastic Cloud with OTel"
git commit -m "test: add OTel-aware tests for Elastic Cloud driver"git commit -m "feat: implement OpenSearch driver with OTel spans"
git commit -m "feat: extend search_traces to support OpenSearch with OTel"
git commit -m "test: add OTel-aware tests for OpenSearch driver"
git commit -m "test: add cross-backend OTel correlation tests"# In server.py
capabilities = {
"otel": {
"traces": True # Declares support for notifications/otel/trace
},
"tools": {...},
"resources": {...}
}# In tools/search_traces.py
async def search_traces(
service: str,
time_range: str,
_meta: Optional[Dict[str, Any]] = None
) -> List[Trace]:
"""Search traces with OTel emission support."""
# Extract traceToken if provided
trace_token = _meta.get("traceToken") if _meta else None
# Create parent span
with tracer.start_as_current_span(
"search_traces",
attributes={
"mcp.tool": "search_traces",
"mcp.service": service,
"mcp.time_range": time_range
}
) as span:
# Perform search with child spans
results = await _perform_search(service, time_range)
# Emit OTel notification if traceToken provided
if trace_token:
await emit_otel_notification(
trace_token=trace_token,
spans=get_current_spans()
)
return results# examples/otel-config.yaml
otel:
traces:
enabled: true
emit_to_client: true # Emit via notifications/otel/trace
export_to_collector: false # Don't duplicate to external collector
service_name: "otel-query-server"
attributes:
deployment.environment: "production"
service.version: "1.0.0"
instrumentation:
auto_instrument: true
include_sql_queries: false # Security consideration
include_http_headers: false # Security consideration
max_span_attributes: 100Following the GitHub discussion concerns:
- Span Sanitization: Implement configurable span attribute filtering
- Sensitive Data: Never include API keys, SQL queries, or PII in spans
- Optional Emission: OTel notification emission is opt-in via traceToken
- Conditional Tracing: Only create detailed spans when traceToken is provided
- Async Emission: OTel notifications sent asynchronously to avoid blocking
- Span Limiting: Configurable limits on span count and attribute size
- Backward Compatibility: Non-OTel clients work unchanged
- Progressive Enhancement: OTel features enhance but don't break basic functionality
- Standards Compliance: Follow OpenTelemetry semantic conventions
# tests/otel/test_notifications.py
async def test_otel_notification_emission():
"""Test that OTel notifications are emitted correctly."""
# Mock MCP client expecting OTel notifications
mock_client = Mock()
# Call tool with traceToken
await search_traces(
service="test-service",
time_range="1h",
_meta={"traceToken": "test-token-123"}
)
# Verify notification was sent
mock_client.send_notification.assert_called_once()
notification = mock_client.send_notification.call_args[^0][^0]
assert notification["method"] == "notifications/otel/trace"
assert notification["params"]["traceToken"] == "test-token-123"
assert "resourceSpans" in notification["params"]# tests/integration/test_otel_e2e.py
async def test_end_to_end_otel_correlation():
"""Test complete OTel trace correlation flow."""
# Start with client span
with tracer.start_as_current_span("client_operation") as client_span:
# Call MCP tool with traceToken
trace_token = f"client-{client_span.get_span_context().trace_id}"
results = await search_traces(
service="test-service",
time_range="1h",
_meta={"traceToken": trace_token}
)
# Verify correlation
assert len(results) > 0
# Additional correlation assertions...- Basic MCP server with 5 tools
- Backend drivers for all 4 systems
- Standard MCP capabilities
- OpenTelemetry trace emission
- MCP OTel capability declaration
- TraceToken correlation support
- Security-focused span sanitization
- Metrics emission (if MCP spec supports)
- Enhanced correlation capabilities
- Performance optimizations
- Enterprise security features
- Track the GitHub discussion closely
- Implement according to emerging spec
- Contribute back to the specification process
- Maintain compatibility with reference implementations
- Follow OTel semantic conventions
- Use standard span attributes
- Implement proper context propagation
- Maintain compatibility with OTel ecosystem
This updated approach ensures our otel-query-server not only provides excellent observability querying capabilities but also serves as a reference implementation for the emerging MCP OpenTelemetry integration standards.