[WIP] Adapt to the latest input and output semantic specifications #14613

Cirilla-zmh · 2025-09-08T14:17:58Z

No description provided.

otelbot-java-instrumentation · 2025-09-08T14:22:13Z

🔧 The result from spotlessApply was committed to the PR branch.

Cirilla-zmh · 2025-09-08T14:19:46Z

.../java/io/opentelemetry/instrumentation/api/incubator/semconv/genai/messages/GenericPart.java

+ */
+@AutoValue
+@JsonClassDescription("Generic part")
+public abstract class GenericPart implements MessagePart {


Should we use Jackson annotations here, or io.opentelemetry.api.common.Value?

Cirilla-zmh · 2025-09-08T14:25:31Z

...va/io/opentelemetry/instrumentation/api/incubator/semconv/genai/messages/OutputMessages.java

+   * @param chunkMessage the chunk message to append
+   * @return a new OutputMessages instance with the merged content
+   */
+  public OutputMessages merge(int index, OutputMessage chunkMessage) {


If possible, abstracting all StreamListener types into a single interface might be a better choice. It would have an onChunk(OutputMessages) method, an onEnd() method, and an onError(Throwable) method. We could then implement the common collection logic here in one place, such as aggregating OutputMessages and capturing token usage and the time per output chunk.

What do you think? cc @anuraaga
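A minimal sketch of what such a unified listener could look like. The names GenAiStreamListener and AggregatingStreamListener are hypothetical and not part of this PR, and a plain String stands in for OutputMessages so the example stays self-contained:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical unified listener; C stands in for the chunk type
// (OutputMessages in the proposal above).
interface GenAiStreamListener<C> {
  void onChunk(C chunk);

  void onEnd();

  void onError(Throwable error);
}

// Illustrative implementation that aggregates text chunks and records when
// each chunk arrived. A String stands in for OutputMessages here.
final class AggregatingStreamListener implements GenAiStreamListener<String> {
  private final StringBuilder content = new StringBuilder();
  private final List<Long> chunkArrivalNanos = new ArrayList<>();
  private Throwable error;
  private boolean ended;

  @Override
  public void onChunk(String chunk) {
    // Record the arrival time first so per-chunk latency can be derived later.
    chunkArrivalNanos.add(System.nanoTime());
    content.append(chunk);
  }

  @Override
  public void onEnd() {
    ended = true;
  }

  @Override
  public void onError(Throwable error) {
    this.error = error;
    ended = true;
  }

  String content() {
    return content.toString();
  }

  int chunkCount() {
    return chunkArrivalNanos.size();
  }

  boolean isEnded() {
    return ended;
  }

  Throwable error() {
    return error;
  }
}
```

Each SDK-specific stream wrapper would then only translate its own chunk type and delegate to the shared listener, so the aggregation and timing logic lives in one class.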

Cirilla-zmh · 2025-09-08T14:34:30Z

.../main/java/io/opentelemetry/instrumentation/openai/v1_1/ChatCompletionMessagesConverter.java

+import javax.annotation.Nullable;
+
+// as a field of Attributes Extractor
+// replace the 'ChatCompletionEventsHelper'


By making this instance part of the GenAiAttributesExtractor, we can manage the various collection switches in one place in the AttributesExtractor: whether to collect chat history, whether to record it as span attributes or as logs, etc.

However, this also means that I might need to emit events in the AttributesExtractor, and I'm not sure this is the right approach. cc @trask. According to the current semantic conventions, though, we need to record some required attributes in events, and these attributes can almost only be obtained in the AttributesExtractor or a SpanProcessor.

If we don't do this in the AttributesExtractor, the best approach might be to define an EventsExtractor in the Instrumenter and allow it to read the current attributes, like:

    public interface EventsExtractor {
      public Context onStart(Attributes startAttributes, Context context);
      public void onEnd(Attributes endAttributes, Context context);
    }

OperationListener may be another good option.

Cirilla-zmh · 2025-09-08T14:36:59Z

.../main/java/io/opentelemetry/instrumentation/openai/v1_1/ChatCompletionMessagesConverter.java

+          OutputMessage.create(
+              Role.ASSISTANT,
+              messageParts,
+              choice.finishReason().asString()));


Each choice has a finish_reason, so which one should be recorded in the span's gen_ai.response.finish_reasons at this point?
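One possible reading, given that the attribute name is plural, is to record one entry per choice instead of picking a single one. A minimal sketch under that assumption; Choice here is a hypothetical stand-in with only the fields needed, not the openai SDK type:

```java
import java.util.ArrayList;
import java.util.List;

final class FinishReasonsSketch {
  // Hypothetical stand-in for an SDK choice object.
  static final class Choice {
    final long index;
    final String finishReason;

    Choice(long index, String finishReason) {
      this.index = index;
      this.finishReason = finishReason;
    }
  }

  // Returns one finish reason per choice, in the order the choices are given.
  static List<String> collectFinishReasons(List<Choice> choices) {
    List<String> reasons = new ArrayList<>(choices.size());
    for (Choice choice : choices) {
      reasons.add(choice.finishReason);
    }
    return reasons;
  }
}
```

The resulting list could then be set as a string-array attribute on the span, while each OutputMessage keeps its own finish reason.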

Cirilla-zmh · 2025-09-08T14:39:38Z

.../main/java/io/opentelemetry/instrumentation/openai/v1_1/ChatCompletionMessagesConverter.java

+      } else if (msg.isUser()) {
+        inputMessages.append(InputMessage.create(Role.USER, contentToMessageParts(msg.asUser().content())));
+      } else if (msg.isAssistant()) {
+        ChatCompletionAssistantMessageParam assistantMsg = msg.asAssistant();


Should these parts be recorded in the same ChatMessage?

In multi-turn conversation scenarios, there may be multiple 'assistant messages' in the 'input messages' - should all of them be recorded? (I think we should truthfully record the input of each call, regardless of whether it's a multi-turn conversation or not.)

Cirilla-zmh · 2025-09-08T14:55:52Z

spring-ai already provides proxies for many model service SDKs, such as openai and gemini. Some of these may already have dedicated instrumentation, so if spring-ai instrumentation is also enabled, each inference call will produce two duplicate spans. Therefore, we have two choices:

1. Do not implement spring-ai instrumentation at the model layer, and rely entirely on the SDK's native instrumentation. However, some additional information may not be accessible (such as metadata defined in spring-ai).
2. Similar to spring-kafka, implement the instrumentation anyway and add a ProcessTracingState, suppressing the SDK's native instrumentation through context:

opentelemetry-java-instrumentation/instrumentation/kafka/kafka-clients/kafka-clients-0.11/bootstrap/src/main/java/io/opentelemetry/javaagent/bootstrap/kafka/KafkaClientsConsumerProcessTracing.java

Lines 13 to 31 in 5307e20

    
    // have separate copies of helper classes.
    public final class KafkaClientsConsumerProcessTracing {
      private static final ThreadLocal<Boolean> wrappingEnabled = ThreadLocal.withInitial(() -> true);

      private KafkaClientsConsumerProcessTracing() {}

      public static boolean setEnabled(boolean enabled) {
        boolean previous = wrappingEnabled.get();
        wrappingEnabled.set(enabled);
        return previous;
      }

      public static boolean wrappingEnabled() {
        return wrappingEnabled.get();
      }

      public static BooleanSupplier wrappingEnabledSupplier() {
        return KafkaClientsConsumerProcessTracing::wrappingEnabled;
      }
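
A rough sketch of how the same thread-local switch could be applied to the second option. The class name GenAiClientSuppression and its methods are hypothetical and only mirror the Kafka helper above; nothing like this exists in the PR:

```java
import java.util.function.Supplier;

// Hypothetical helper modeled on KafkaClientsConsumerProcessTracing.
final class GenAiClientSuppression {
  private static final ThreadLocal<Boolean> sdkEnabled = ThreadLocal.withInitial(() -> true);

  private GenAiClientSuppression() {}

  // Sets the flag for the current thread and returns the previous value.
  static boolean setEnabled(boolean enabled) {
    boolean previous = sdkEnabled.get();
    sdkEnabled.set(enabled);
    return previous;
  }

  // The SDK-level instrumentation would check this before starting a span.
  static boolean sdkInstrumentationEnabled() {
    return sdkEnabled.get();
  }

  // The spring-ai layer would wrap the delegated SDK call with this, so the
  // nested SDK instrumentation is skipped and the flag is always restored.
  static <T> T callSuppressed(Supplier<T> call) {
    boolean previous = setEnabled(false);
    try {
      return call.get();
    } finally {
      setEnabled(previous);
    }
  }
}
```

A thread-local flag only covers calls made on the same thread, so asynchronous or streaming calls that hop threads would need the flag carried in the Context instead.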

Cirilla-zmh · 2025-10-22T07:49:00Z

see #15064

anuraaga

Really sorry for the delay, the notification made me realize I had a pending comment that never got sent. Basically the approach looks fine as long as Jackson is acceptable in the instrumentation API.

anuraaga · 2025-09-09T02:40:40Z

instrumentation-api-incubator/build.gradle.kts

  api("io.opentelemetry:opentelemetry-api-incubator")

  compileOnly("com.google.auto.value:auto-value-annotations")
+  compileOnly("com.fasterxml.jackson.core:jackson-databind")


I believe this would need to be implementation because the API doesn't imply using the javaagent. I guess the extra dependency would be a concern with doing JSON marshaling within the instrumentation API itself, though when the conventions rely on JSON, maybe it's unavoidable.

Cirilla-zmh · 2025-10-22

Yes. Defining these structured messages with JSON properties may simplify their construction, as can also be seen in python-instrumentation: https://github.com/open-telemetry/opentelemetry-python-contrib/blob/main/util/opentelemetry-util-genai/src/opentelemetry/util/genai/types.py

I used compileOnly here because the openai instrumentation already depends on Jackson; I only wanted to show my intent. However, implementation may lead to dependency conflicts. Maybe we should define these message objects in the 'java-tooling' module, which has a shaded Jackson dependency.

Cirilla-zmh and others added 2 commits September 8, 2025 22:16

init genai messages

09d342b

./gradlew spotlessApply

9690b0a

Cirilla-zmh closed this Oct 22, 2025

anuraaga reviewed Oct 22, 2025
