
OpenTelemetry Instrumentation Not Working for Apache Spark Structured Streaming Consumers #14372

@patpatpat123

Description


Is your feature request related to a problem? Please describe.

📋 Issue Summary

OpenTelemetry instrumentation does not currently work for Apache Spark Structured Streaming consumers, particularly when processing Kafka messages that carry trace context in their headers. This creates a significant observability gap for distributed tracing in Spark-based streaming applications.

https://spark.apache.org/docs/latest/streaming/index.html#:~:text=Structured%20Streaming%20is%20a%20scalable%20and%20fault-tolerant%20stream,would%20express%20a%20batch%20computation%20on%20static%20data.

🔍 Problem Description

Current Behavior

  • Spark Structured Streaming consumers cannot automatically propagate OpenTelemetry trace context
  • Trace context carried in Kafka message headers is not extracted or propagated
  • No built-in OpenTelemetry instrumentation exists for Apache Spark in streaming contexts
  • Manual trace context extraction and span creation is required, and it is not straightforward (a sketch of that workaround follows this list)
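
For illustration, a minimal sketch of the manual workaround this currently forces, assuming the record headers can be obtained as a `Map<String, byte[]>` (e.g. via the Kafka source's `includeHeaders` option). The class and method names are hypothetical; only the OpenTelemetry API calls are real:

```java
import io.opentelemetry.api.GlobalOpenTelemetry;
import io.opentelemetry.api.trace.Span;
import io.opentelemetry.api.trace.SpanKind;
import io.opentelemetry.api.trace.Tracer;
import io.opentelemetry.context.Context;
import io.opentelemetry.context.Scope;
import io.opentelemetry.context.propagation.TextMapGetter;

import java.nio.charset.StandardCharsets;
import java.util.Map;

final class ManualKafkaTracePropagation {

  // Adapter letting the configured propagator read trace headers from a Map.
  private static final TextMapGetter<Map<String, byte[]>> HEADER_GETTER =
      new TextMapGetter<Map<String, byte[]>>() {
        @Override
        public Iterable<String> keys(Map<String, byte[]> carrier) {
          return carrier.keySet();
        }

        @Override
        public String get(Map<String, byte[]> carrier, String key) {
          byte[] value = carrier == null ? null : carrier.get(key);
          return value == null ? null : new String(value, StandardCharsets.UTF_8);
        }
      };

  static void processRecord(Map<String, byte[]> headers, byte[] value) {
    // Pull the upstream context (e.g. a `traceparent` header) out by hand.
    Context extracted = GlobalOpenTelemetry.getPropagators()
        .getTextMapPropagator()
        .extract(Context.current(), headers, HEADER_GETTER);

    Tracer tracer = GlobalOpenTelemetry.getTracer("spark-streaming-manual");
    Span span = tracer.spanBuilder("process-kafka-record")
        .setSpanKind(SpanKind.CONSUMER)
        .setParent(extracted)
        .startSpan();
    try (Scope ignored = span.makeCurrent()) {
      // ... user processing logic for `value` goes here ...
    } finally {
      span.end();
    }
  }
}
```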

Impact

  • Observability Gap: Loss of distributed tracing across Spark streaming pipelines
  • Debugging Difficulty: Inability to trace message flow through Spark transformations
  • Performance Monitoring: Missing insights into processing latency and bottlenecks
  • Compliance Issues: Difficulty meeting observability requirements in production environments

🛠️ Root Cause Analysis

Technical Challenges

  1. Header Access Complexity: Kafka headers carrying trace context must be parsed and extracted by hand
  2. Span Lifecycle Management: spans must be created, linked, and cleaned up manually in streaming contexts
  3. Resource Management: the OpenTelemetry SDK must be initialized and shut down correctly in Spark executor environments (sketched after this list)
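
On point 3, a hypothetical sketch of what per-executor lifecycle handling implies, assuming the `opentelemetry-sdk-extension-autoconfigure` artifact is on the executor classpath (the holder class itself is illustrative):

```java
import io.opentelemetry.api.OpenTelemetry;
import io.opentelemetry.sdk.OpenTelemetrySdk;
import io.opentelemetry.sdk.autoconfigure.AutoConfiguredOpenTelemetrySdk;

final class ExecutorTelemetry {

  // Holder idiom: the SDK is built on first use in each executor JVM.
  private static final class Holder {
    static final OpenTelemetrySdk SDK =
        AutoConfiguredOpenTelemetrySdk.initialize().getOpenTelemetrySdk();

    static {
      // Executors are separate JVMs that Spark stops outside user code,
      // so flush pending spans when this JVM shuts down.
      Runtime.getRuntime().addShutdownHook(new Thread(SDK::close));
    }
  }

  static OpenTelemetry get() {
    return Holder.SDK;
  }
}
```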

Current Limitations

  • No native OpenTelemetry support in Apache Spark Structured Streaming
  • Trace context must be extracted from Kafka headers manually
  • Span links and parent-child relationships must be managed by hand (see the sketch after this list)
  • No automatic instrumentation for Spark transformations
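
To illustrate the span-linking point: a micro-batch aggregates records from many producers, so there is usually no single parent span, and each extracted upstream context must become a span link instead. A hypothetical sketch; only the `SpanBuilder.addLink` API is real:

```java
import io.opentelemetry.api.GlobalOpenTelemetry;
import io.opentelemetry.api.trace.Span;
import io.opentelemetry.api.trace.SpanBuilder;
import io.opentelemetry.api.trace.SpanContext;
import io.opentelemetry.api.trace.SpanKind;

import java.util.List;

final class MicroBatchSpanLinking {

  static Span startBatchSpan(List<SpanContext> upstreamContexts) {
    SpanBuilder builder = GlobalOpenTelemetry.getTracer("spark-streaming-manual")
        .spanBuilder("process-micro-batch")
        .setSpanKind(SpanKind.CONSUMER);
    for (SpanContext upstream : upstreamContexts) {
      builder.addLink(upstream); // link, not parent: many upstream producers
    }
    return builder.startSpan();
  }
}
```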

Describe the solution you'd like

Hoping for a solution that requires no manual propagation.

Hoping for a solution where users can just run ./spark-submit [...] with the OpenTelemetry javaagent attached, and propagation will work. (And this is achievable!)
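
A sketch of what that invocation could look like today, using existing spark-submit options to pass the standard `-javaagent` JVM flag (the paths and application class are placeholders):

```sh
# Hypothetical invocation: attach the OpenTelemetry Java agent to driver and
# executors. The agent jar must be readable at these paths on every node
# (e.g. shipped via --files); that distribution step is omitted here.
./bin/spark-submit \
  --driver-java-options "-javaagent:/path/to/opentelemetry-javaagent.jar" \
  --conf "spark.executor.extraJavaOptions=-javaagent:/path/to/opentelemetry-javaagent.jar" \
  --class com.example.StreamingJob \
  streaming-job.jar
```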

Describe alternatives you've considered

No response

Additional context

No response


Metadata

Labels

  • contribution welcome: Request makes sense, maintainers probably won't have time, contribution would be welcome
  • enhancement: New feature or request
