diff --git a/content/en/blog/2024/otel-generative-ai/aspire-dashboard-content-capture.png b/content/en/blog/2024/otel-generative-ai/aspire-dashboard-content-capture.png new file mode 100644 index 000000000000..c850316c780f Binary files /dev/null and b/content/en/blog/2024/otel-generative-ai/aspire-dashboard-content-capture.png differ diff --git a/content/en/blog/2024/otel-generative-ai/aspire-dashboard-trace.png b/content/en/blog/2024/otel-generative-ai/aspire-dashboard-trace.png new file mode 100644 index 000000000000..f54909f61b86 Binary files /dev/null and b/content/en/blog/2024/otel-generative-ai/aspire-dashboard-trace.png differ diff --git a/content/en/blog/2024/otel-generative-ai/index.md b/content/en/blog/2024/otel-generative-ai/index.md new file mode 100644 index 000000000000..74b30fb61a03 --- /dev/null +++ b/content/en/blog/2024/otel-generative-ai/index.md @@ -0,0 +1,223 @@ +--- +title: OpenTelemetry for Generative AI +linkTitle: OTel for GenAI +date: 2024-12-05 +author: >- + [Drew Robbins](https://github.com/drewby) (Microsoft), [Liudmila + Molkova](https://github.com/lmolkova) (Microsoft) +issue: https://github.com/open-telemetry/opentelemetry.io/issues/5581 +sig: SIG GenAI Observability +cSpell:ignore: genai liudmila molkova +--- + +As organizations increasingly adopt Large Language Models (LLMs) and other +generative AI technologies, ensuring reliable performance, efficiency, and +safety is essential to meet user expectations, optimize resource costs, and +safeguard against unintended outputs. Effective observability for AI operations, +behaviors, and outcomes can help meet these goals. OpenTelemetry is being +enhanced to support these needs specifically for generative AI. + +Two primary assets are in development to make this possible: **Semantic +Conventions** and **Instrumentation Libraries**. The first instrumentation +library targets the +[OpenAI Python API library](https://pypi.org/project/openai/). + +[**Semantic Conventions**](/docs/concepts/semantic-conventions/) establish +standardized guidelines for how telemetry data is structured and collected +across platforms, defining inputs, outputs, and operational details. For +generative AI, these conventions streamline monitoring, troubleshooting, and +optimizing AI models by standardizing attributes such as model parameters, +response metadata, and token usage. This consistency supports better +observability across tools, environments, and APIs, helping organizations track +performance, cost, and safety with ease. + +The +[**Instrumentation Library**](/docs/specs/otel/overview/#instrumentation-libraries) +is being developed within the +[OpenTelemetry Python Contrib](https://github.com/open-telemetry/opentelemetry-python-contrib) +under +[instrumentation-genai](https://github.com/open-telemetry/opentelemetry-python-contrib/tree/main/instrumentation-genai) +project to automate telemetry collection for generative AI applications. The +first release is a Python library for instrumenting OpenAI client calls. This +library captures spans and events, gathering essential data like model inputs, +response metadata, and token usage in a structured format. + +## Key Signals for Generative AI + +The [Semantic Conventions for Generative AI](/docs/specs/semconv/gen-ai/) focus +on capturing insights into AI model behavior through three primary signals: +[Traces](/docs/concepts/signals/traces/), +[Metrics](/docs/concepts/signals/metrics/), and +[Events](/docs/specs/otel/logs/event-api/). + +Together, these signals provide a comprehensive monitoring framework, enabling +better cost management, performance tuning, and request tracing. + +### Traces: Tracing Model Interactions + +Traces track each model interaction's lifecycle, covering input parameters (for +example, temperature, top_p) and response details like token count or errors. +They provide visibility into each request, aiding in identifying bottlenecks and +analyzing the impact of settings on model output. + +### Metrics: Monitoring Usage and Performance + +Metrics aggregate high-level indicators like request volume, latency, and token +counts, essential for managing costs and performance. This data is particularly +critical for API-dependent AI applications with rate limits and cost +considerations. + +### Events: Capturing Detailed Interactions + +Events log detailed moments during model execution, such as user prompts and +model responses, providing a granular view of model interactions. These insights +are invaluable for debugging and optimizing AI applications where unexpected +behaviors may arise. + +{{% alert title="Note" color="info" %}} Note that we decided to use +[events emitted](/docs/specs/otel/logs/api/#emit-an-event) with the +[Logs API](/docs/specs/otel/logs/api/) specification in the Semantic Conventions +for Generative AI. Events allows for us to define specific +[semantic conventions](/docs/specs/semconv/general/events/) for the user prompts +and model responses that we capture. This addition to the API is in development +and considered unstable.{{% /alert %}} + +### Extending Observability with Vendor-Specific Attributes + +The Semantic Conventions also define vendor-specific attributes for platforms +like OpenAI and Azure Inference API, ensuring telemetry captures both general +and provider-specific details. This added flexibility supports multi-platform +monitoring and in-depth insights. + +## Building the Python Instrumentation Library for OpenAI + +This Python-based library for OpenTelemetry captures key telemetry signals for +OpenAI models, providing developers with an out-of-the-box observability +solution tailored to AI workloads. The library, +[hosted within the OpenTelemetry Python Contrib repository](https://github.com/open-telemetry/opentelemetry-python-contrib/tree/opentelemetry-instrumentation-openai-v2%3D%3D2.0b0/instrumentation-genai/opentelemetry-instrumentation-openai-v2), +automatically collects telemetry from OpenAI model interactions, including +request and response metadata and token usage. + +As generative AI applications grow, additional instrumentation libraries for +other languages will follow, extending OpenTelemetry support across more tools +and environments. The current library's focus on OpenAI highlights its +popularity and demand within AI development, making it a valuable initial +implementation. + +### Example Usage + +Here's an example of using the OpenTelemetry Python library to monitor a +generative AI application with the OpenAI client. + +Install the OpenTelemetry dependencies: + +```shell +pip install opentelemetry-distro +opentelemetry-bootstrap -a install +``` + +Set the following environment variables, updating the endpoint and protocol as +appropriate: + +```shell +OPENAI_API_KEY= + +OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318 +OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf +OTEL_SERVICE_NAME=python-opentelemetry-openai +OTEL_LOGS_EXPORTER=otlp_proto_http +OTEL_PYTHON_LOGGING_AUTO_INSTRUMENTATION_ENABLED=true +# Set to false or remove to disable log events +OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT=true +``` + +Then include the following code in your Python application: + +```python +import os +from openai import OpenAI + +client = OpenAI() +chat_completion = client.chat.completions.create( + model=os.getenv("CHAT_MODEL", "gpt-4o-mini"), + messages=[ + { + "role": "user", + "content": "Write a short poem on OpenTelemetry.", + }, + ], +) +print(chat_completion.choices[0].message.content) +``` + +And then run the example using `opentelemetry-instrument`: + +```shell +opentelemetry-instrument python main.py +``` + +If you do not have a service running to collect telemetry, you can export to the +console using the following: + +```shell +opentelemetry-instrument --traces_exporter console --metrics_exporter console python main.py +``` + +There is a complete example +[available here](https://github.com/open-telemetry/opentelemetry-python-contrib/tree/main/instrumentation-genai/opentelemetry-instrumentation-openai-v2/example). + +With this simple instrumentation, one can begin capture traces from their +generative AI application. Here is an example from the +[Aspire Dashboard](https://learn.microsoft.com/dotnet/aspire/fundamentals/dashboard/standalone?tabs=bash) +for local debugging. + +To start Jaeger, run the following `docker` command and open your web browser +the `localhost:18888`: + +```shell +docker run --rm -it -d -p 18888:18888 -p 4317:18889 -p 4318:18890 --name aspire-dashboard mcr.microsoft.com/dotnet/aspire-dashboard:9.0 +``` + +![Chat trace in Aspire Dashboard](aspire-dashboard-trace.png) + +Here is a similar trace captured in +[Jaeger](https://www.jaegertracing.io/docs/1.63/getting-started/#all-in-one). + +To start Jaeger, run the following `docker` command and open your web browser +the `localhost:16686`. + +```shell +docker run --rm -it -d -p 16686:16686 -p 4317:4317 -p 4318:4318 --name jaeger jaegertracing/all-in-one:latest +``` + +![Chat trace in Jaeger](jaeger-trace.png) + +It's also easy to capture the content history of the chat for debugging and +improving your application. Simply set the environment variable +`OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT` as follows: + +```shell +export OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT=True +``` + +This will turn on content capture which collects OpenTelemetry events containing +the payload: + +![Content Capture Aspire Dashboard](aspire-dashboard-content-capture.png) + +## Join Us in Shaping the Future of Generative AI Observability + +Community collaboration is key to OpenTelemetry's success. We invite developers, +AI practitioners, and organizations to contribute, share feedback, or +participate in discussions. Explore the OpenTelemetry Python Contrib project, +contribute code, or help shape observability for AI as it continues to evolve. + +We now have contributors from [Amazon](https://aws.amazon.com/), +[Elastic](https://www.elastic.co/), [Google](https://www.google.com/), +[IBM](https://www.ibm.com), [Langtrace](https://www.langtrace.ai/), +[Microsoft](https://www.microsoft.com/), [OpenLIT](https://openlit.io/), +[Scorecard](https://www.scorecard.io/), [Traceloop](https://www.traceloop.com/), +and more! + +You are welcome to join the community! More information can be found at the +[Generative AI Observability project page](https://github.com/open-telemetry/community/blob/main/projects/gen-ai.md). diff --git a/content/en/blog/2024/otel-generative-ai/jaeger-trace.png b/content/en/blog/2024/otel-generative-ai/jaeger-trace.png new file mode 100644 index 000000000000..ee56ab3ce9ae Binary files /dev/null and b/content/en/blog/2024/otel-generative-ai/jaeger-trace.png differ diff --git a/static/refcache.json b/static/refcache.json index 7a3e5a73648a..19b73292e68a 100644 --- a/static/refcache.json +++ b/static/refcache.json @@ -9391,6 +9391,10 @@ "StatusCode": 200, "LastSeen": "2024-10-09T10:20:06.931205+02:00" }, + "https://learn.microsoft.com/dotnet/aspire/fundamentals/dashboard/standalone": { + "StatusCode": 206, + "LastSeen": "2024-12-05T10:36:06.35615+01:00" + }, "https://learn.microsoft.com/dotnet/framework/migration-guide/how-to-determine-which-versions-are-installed#query-the-registry-using-code": { "StatusCode": 200, "LastSeen": "2024-12-04T08:46:58.531297473Z" @@ -10211,6 +10215,10 @@ "StatusCode": 206, "LastSeen": "2024-01-18T19:08:05.648675-05:00" }, + "https://openlit.io/": { + "StatusCode": 200, + "LastSeen": "2024-12-05T10:36:14.926178+01:00" + }, "https://openmetrics.io/": { "StatusCode": 206, "LastSeen": "2024-01-18T19:07:18.197228-05:00" @@ -12007,6 +12015,10 @@ "StatusCode": 206, "LastSeen": "2024-08-09T11:02:26.926617-04:00" }, + "https://pypi.org/project/openai/": { + "StatusCode": 206, + "LastSeen": "2024-12-05T10:36:04.457935+01:00" + }, "https://pypi.org/project/opentelemetry-api/": { "StatusCode": 206, "LastSeen": "2024-01-30T06:01:19.327156-05:00" @@ -13559,6 +13571,10 @@ "StatusCode": 206, "LastSeen": "2024-08-09T10:46:30.160571-04:00" }, + "https://www.google.com/": { + "StatusCode": 200, + "LastSeen": "2024-12-05T10:36:10.643508+01:00" + }, "https://www.graalvm.org/latest/reference-manual/native-image/": { "StatusCode": 206, "LastSeen": "2024-09-30T11:46:04.441837921+02:00" @@ -13607,6 +13623,10 @@ "StatusCode": 200, "LastSeen": "2024-01-30T16:15:04.543149-05:00" }, + "https://www.ibm.com": { + "StatusCode": 206, + "LastSeen": "2024-12-05T10:36:11.479738+01:00" + }, "https://www.ibm.com/docs/api/v1/content/SSYKE2_8.0.0/openj9/api/jdk8/jre/management/extension/com/ibm/lang/management/OperatingSystemMXBean.html": { "StatusCode": 206, "LastSeen": "2024-08-09T10:46:28.705852-04:00" @@ -13719,6 +13739,10 @@ "StatusCode": 206, "LastSeen": "2024-08-09T09:42:46.824519+02:00" }, + "https://www.jaegertracing.io/docs/1.63/getting-started/#all-in-one": { + "StatusCode": 206, + "LastSeen": "2024-12-05T10:36:07.90645+01:00" + }, "https://www.jaegertracing.io/docs/latest/apis/": { "StatusCode": 206, "LastSeen": "2024-01-18T19:37:16.697232-05:00" @@ -14011,6 +14035,10 @@ "StatusCode": 206, "LastSeen": "2024-04-19T07:13:43.941227206Z" }, + "https://www.langtrace.ai/": { + "StatusCode": 200, + "LastSeen": "2024-12-05T10:36:13.494149+01:00" + }, "https://www.linuxfoundation.org/legal/privacy-policy": { "StatusCode": 200, "LastSeen": "2024-01-30T16:04:05.250977-05:00" @@ -14619,6 +14647,10 @@ "StatusCode": 206, "LastSeen": "2024-01-30T15:25:04.905602-05:00" }, + "https://www.scorecard.io/": { + "StatusCode": 200, + "LastSeen": "2024-12-05T10:36:16.128367+01:00" + }, "https://www.selenium.dev/documentation/grid/advanced_features/observability/": { "StatusCode": 206, "LastSeen": "2024-01-30T16:05:03.991313-05:00" @@ -14707,6 +14739,10 @@ "StatusCode": 206, "LastSeen": "2024-01-30T05:18:08.486678-05:00" }, + "https://www.traceloop.com/": { + "StatusCode": 200, + "LastSeen": "2024-12-05T10:36:17.176601+01:00" + }, "https://www.typescriptlang.org/download": { "StatusCode": 206, "LastSeen": "2024-01-18T19:10:44.997912-05:00"