DataDog
diff --git a/‎dd-java-agent/agent-profiling/profiling-otel/build.gradle.kts‎
Lines changed: 1 addition & 1 deletion b/‎dd-java-agent/agent-profiling/profiling-otel/build.gradle.kts‎
Lines changed: 1 addition & 1 deletion
diff --git a/‎dd-java-agent/agent-profiling/profiling-otel/doc/ARCHITECTURE.md‎
Lines changed: 187 additions & 1 deletion b/‎dd-java-agent/agent-profiling/profiling-otel/doc/ARCHITECTURE.md‎
Lines changed: 187 additions & 1 deletion
@@ -76,7 +76,7 @@ tasks.named<JavaCompile>("compileJmhJava") {
 }
 
 dependencies {
-  implementation("io.btrace", "jafar-parser", "0.0.1-SNAPSHOT")
+  implementation(libs.jafar.parser)
   implementation(project(":internal-api"))
   implementation(project(":components:json"))
 
 
@@ -508,7 +508,193 @@ Potential improvements if cache effectiveness needs to be increased:
 
 This optimization targets the real bottleneck (redundant frame processing) rather than micro-optimizing already-efficient dictionary operations, resulting in measurable improvements for production workloads with realistic stack duplication patterns.
 
-### Phase 6: OTLP Compatibility Testing & Validation (Completed)
+### Phase 6: Original Payload Support (Completed)
+
+#### Objective
+
+Implement support for OTLP profiles `original_payload` and `original_payload_format` fields (fields 9 and 10) to include the source JFR recording(s) in OTLP output for debugging and compliance purposes.
+
+#### OTLP Specification Context
+
+Per [OTLP profiles.proto v1development](https://github.com/open-telemetry/opentelemetry-proto/blob/main/opentelemetry/proto/profiles/v1development/profiles.proto#L337):
+
+- `original_payload_format` (field 9): String indicating the format of the original recording (e.g., "jfr", "pprof")
+- `original_payload` (field 10): Raw bytes of the original profiling data
+
+**Note**: The OTLP spec recommends this feature be **disabled by default** due to payload size considerations. It is intended for debugging OTLP content and compliance verification, not routine production use.
+
+#### Implementation Details
+
+**API Design:**
+
+```java
+// Disabled by default per OTLP spec recommendation
+converter.setIncludeOriginalPayload(true)
+  .addFile(jfrFile, start, end)
+  .convert();
+```
+
+**Key Features:**
+
+1. **Zero-Copy Streaming** - JFR recordings are streamed directly into protobuf output without memory allocation:
+   - Single file: Direct `FileInputStream`
+   - Multiple files: `SequenceInputStream` chains files together
+   - Protobuf encoder streams data in 8KB chunks
+
+2. **Uber-JFR Concatenation** - Multiple JFR recordings are automatically concatenated:
+   - JFR format supports concatenation natively (multiple chunks in sequence)
+   - `SequenceInputStream` chains file streams using `Enumeration<InputStream>` wrapper
+   - Protobuf length-delimited encoding preserves total byte count
+
+3. **Enhanced ProtobufEncoder** - New streaming method for large payloads:
+   ```java
+   public void writeBytesField(int fieldNumber, InputStream inputStream, long length)
+       throws IOException
+   ```
+   - Properly encodes protobuf wire format (tag + varint length + data)
+   - Reads in chunks to avoid loading entire payload into memory
+   - Automatically closes InputStream when done
+
+4. **Profile Encoding Integration** - Modified `encodeProfile()` in JfrToOtlpConverter:
+   ```java
+   if (includeOriginalPayload && !pathEntries.isEmpty()) {
+     encoder.writeStringField(
+         OtlpProtoFields.Profile.ORIGINAL_PAYLOAD_FORMAT, "jfr");
+
+     // Calculate total size across all JFR files
+     long totalSize = 0;
+     for (PathEntry entry : pathEntries) {
+       totalSize += Files.size(entry.path);
+     }
+
+     // Stream concatenated JFR data directly into protobuf
+     encoder.writeBytesField(
+         OtlpProtoFields.Profile.ORIGINAL_PAYLOAD,
+         createJfrPayloadStream(),
+         totalSize);
+   }
+   ```
+
+5. **IOException Propagation** - Added IOException to method signatures:
+   - `encodeProfile()` throws IOException
+   - Wrapped in RuntimeException where called from lambdas (MessageWriter interface)
+
+#### Usage Examples
+
+**Single JFR File:**
+```java
+JfrToOtlpConverter converter = new JfrToOtlpConverter();
+byte[] otlpData = converter
+    .setIncludeOriginalPayload(true)
+    .addFile(Paths.get("profile.jfr"), startTime, endTime)
+    .convert();
+
+// Output includes:
+// - OTLP profile data (samples, dictionary, etc.)
+// - original_payload_format = "jfr"
+// - original_payload = <raw bytes of profile.jfr>
+```
+
+**Multiple JFR Files (Uber-JFR):**
+```java
+byte[] otlpData = converter
+    .setIncludeOriginalPayload(true)
+    .addFile(Paths.get("recording1.jfr"), start1, end1)
+    .addFile(Paths.get("recording2.jfr"), start2, end2)
+    .addFile(Paths.get("recording3.jfr"), start3, end3)
+    .convert();
+
+// original_payload contains concatenated bytes:
+// [recording1.jfr bytes][recording2.jfr bytes][recording3.jfr bytes]
+// This forms a valid JFR file with multiple chunks
+```
+
+**Converter Reuse:**
+```java
+// Setting persists across conversions until changed
+converter.setIncludeOriginalPayload(true);
+
+byte[] otlp1 = converter.addFile(file1, start1, end1).convert(); // includes payload
+byte[] otlp2 = converter.addFile(file2, start2, end2).convert(); // includes payload
+
+converter.setIncludeOriginalPayload(false);
+byte[] otlp3 = converter.addFile(file3, start3, end3).convert(); // no payload
+```
+
+#### Test Coverage
+
+Four comprehensive tests in `JfrToOtlpConverterSmokeTest.java`:
+
+1. **`convertWithOriginalPayloadDisabledByDefault()`**
+   - Verifies default behavior (payload not included)
+   - Baseline for size comparison
+
+2. **`convertWithOriginalPayloadEnabled()`**
+   - Single JFR file with payload enabled
+   - Validates: `resultSize >= jfrFileSize` (output contains at least the JFR bytes)
+
+3. **`convertMultipleRecordingsWithOriginalPayload()`**
+   - Three separate JFR files concatenated
+   - Validates: `resultSize >= (size1 + size2 + size3)` (uber-JFR concatenation)
+
+4. **`converterResetsOriginalPayloadSetting()`**
+   - Tests setting persistence across multiple `convert()` calls
+   - Verifies fluent API behavior and converter reuse
+
+**Size Validation Strategy**: Since we cannot easily parse protobuf bytes in tests, we validate by comparing output size. When `original_payload` is included, the total output size must be >= source JFR file size(s), as it contains both OTLP profile data AND the raw JFR bytes.
+
+#### Performance Characteristics
+
+**Memory Efficiency:**
+- **Streaming I/O**: No memory allocation for JFR content
+- Single-file optimization: Direct `FileInputStream` (no wrapper overhead)
+- Multi-file: `SequenceInputStream` chains streams (minimal overhead)
+- Chunk size: 8KB for streaming reads (balance between syscalls and memory)
+
+**Size Impact:**
+- Typical JFR file: 1-10 MB (compressed)
+- OTLP profile overhead: ~5-10% of JFR size (dictionary tables, samples)
+- Total output size: JFR size + OTLP overhead + protobuf framing (~3-5 bytes per field)
+
+**When to Enable:**
+- ✅ Debugging OTLP conversion issues
+- ✅ Compliance verification with external tools
+- ✅ Round-trip validation workflows (OTLP → JFR → OTLP)
+- ❌ Production profiling (unnecessary size overhead)
+- ❌ High-frequency uploads (bandwidth concerns)
+
+#### Design Decisions
+
+**Why SequenceInputStream?**
+- Standard library, no external dependencies
+- Designed specifically for chaining multiple streams
+- Lazy evaluation (only reads when data is consumed)
+- Zero memory overhead for stream chaining
+
+**Why not ByteArrayOutputStream concatenation?**
+- Would require loading all JFR files into memory
+- For 10 MB JFR files, this would allocate 10 MB per file
+- Streaming approach has O(1) memory regardless of JFR size
+
+**Why disabled by default?**
+- Per OTLP spec recommendation (size considerations)
+- Most use cases don't need the original payload
+- Opt-in design prevents accidental size bloat
+
+**Why calculate total size upfront?**
+- Protobuf length-delimited encoding requires size before data
+- `Files.size()` is fast (reads filesystem metadata, not content)
+- Alternative would require reading entire files twice (inefficient)
+
+#### Future Enhancements
+
+Potential improvements if needed:
+1. **Compression**: Gzip original_payload before encoding (OTLP allows this)
+2. **Selective inclusion**: Only include payload for certain profile types
+3. **Size limits**: Warn or skip if payload exceeds threshold
+4. **Format validation**: Verify JFR magic bytes before inclusion
+
+### Phase 7: OTLP Compatibility Testing & Validation (Completed)
 
 #### Objective
Original file line number	Diff line number	Diff line change
`@@ -76,7 +76,7 @@ tasks.named<JavaCompile>("compileJmhJava") {`
`76`	`76`	`}`
`77`	`77`
`78`	`78`	`dependencies {`
`79`		`- implementation("io.btrace", "jafar-parser", "0.0.1-SNAPSHOT")`
	`79`	`+ implementation(libs.jafar.parser)`
`80`	`80`	`implementation(project(":internal-api"))`
`81`	`81`	`implementation(project(":components:json"))`
`82`	`82`