@@ -22,32 +22,31 @@ pip install elastic-opentelemetry-instrumentation-openai

This instrumentation supports *zero-code* / *autoinstrumentation*:

```
opentelemetry-instrument python use_openai.py
```

You can see telemetry from this package if an OpenTelemetry collector is running, for example
one set up as documented in the root [examples](../../examples/) folder.

Set up a virtual environment with this package, the dependencies it requires,
and `dotenv` (a portable way to load environment variables):

```
python3 -m venv .venv
source .venv/bin/activate
pip install -r test-requirements.txt
pip install "python-dotenv[cli]"
```

Run a script with `opentelemetry-instrument` to apply the instrumentation. [ollama.env](ollama.env)
includes variables that point to Ollama instead of OpenAI, which lets you run
the examples without a cloud account:

```
dotenv -f ollama.env run -- \
opentelemetry-instrument python examples/chat.py
```

You can record more information about prompts as log events by enabling content capture:

```
OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT=true dotenv -f ollama.env run -- \
opentelemetry-instrument python examples/chat.py
```

### Instrumentation-specific environment variable configuration
@@ -110,20 +109,22 @@ response without querying the LLM.

### Azure OpenAI Environment Variables

The `AzureOpenAI` client extends `OpenAI` with parameters specific to the Azure OpenAI Service.

* `AZURE_OPENAI_ENDPOINT` - "Azure OpenAI Endpoint" in https://oai.azure.com/resource/overview
  * It should look like `https://<your-resource-name>.openai.azure.com/`
* `AZURE_OPENAI_API_KEY` - "API key 1 (or 2)" in https://oai.azure.com/resource/overview
  * It should be a hex string like `abc01...`
* `OPENAI_API_VERSION` - "Inference version" from https://learn.microsoft.com/en-us/azure/ai-services/openai/api-version-deprecation
  * It should look like `2024-10-01-preview`
* `TEST_CHAT_MODEL` - the "Name" of a deployment in https://oai.azure.com/resource/deployments for a
  model that supports tool calling, such as "gpt-4o-mini".
* `TEST_EMBEDDINGS_MODEL` - the "Name" of a deployment in https://oai.azure.com/resource/deployments for a
  model that supports embeddings, such as "text-embedding-3-small".

Note: Azure resolves the model parameter of a chat completion or embeddings request to the
deployment with an identical name. As deployment names are arbitrary, they may have no
correlation with a real model like `gpt-4o`.
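
With those variables set, a client can be constructed without hard-coding credentials. Below is
a minimal sketch, assuming the `openai` Python SDK (whose `AzureOpenAI` client reads
`AZURE_OPENAI_ENDPOINT`, `AZURE_OPENAI_API_KEY`, and `OPENAI_API_VERSION` from the environment)
and the `TEST_CHAT_MODEL` variable described above:

```python
import os

import openai

# AzureOpenAI picks up AZURE_OPENAI_ENDPOINT, AZURE_OPENAI_API_KEY and
# OPENAI_API_VERSION from the environment when they are not passed explicitly.
client = openai.AzureOpenAI()

messages = [
    {
        "role": "user",
        "content": "Answer in up to 3 words: Which ocean contains Bouvet Island?",
    }
]

# For Azure, the "model" argument is really the deployment name (TEST_CHAT_MODEL).
chat_completion = client.chat.completions.create(
    model=os.environ["TEST_CHAT_MODEL"],
    messages=messages,
)
print(chat_completion.choices[0].message.content)
```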

## License

@@ -0,0 +1,23 @@
import os

import openai

CHAT_MODEL = os.environ.get("TEST_CHAT_MODEL", "gpt-4o-mini")


def main():
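    # openai.Client() reads configuration such as OPENAI_API_KEY and
    # OPENAI_BASE_URL from the environment.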
client = openai.Client()

messages = [
{
"role": "user",
"content": "Answer in up to 3 words: Which ocean contains Bouvet Island?",
}
]

chat_completion = client.chat.completions.create(model=CHAT_MODEL, messages=messages)
print(chat_completion.choices[0].message.content)


if __name__ == "__main__":
main()
@@ -0,0 +1,49 @@
import os

import numpy as np
import openai

EMBEDDINGS_MODEL = os.environ.get("TEST_EMBEDDINGS_MODEL", "text-embedding-3-small")


def main():
client = openai.Client()

    products = [
        "Search: Ingest your data, and explore Elastic's machine learning and retrieval augmented generation (RAG) capabilities.",
        "Observability: Unify your logs, metrics, traces, and profiling at scale in a single platform.",
        "Security: Protect, investigate, and respond to cyber threats with AI-driven security analytics.",
        "Elasticsearch: Distributed, RESTful search and analytics.",
        "Kibana: Visualize your data. Navigate the Stack.",
        "Beats: Collect, parse, and ship in a lightweight fashion.",
        "Connectors: Connect popular databases, file systems, collaboration tools, and more.",
        "Logstash: Ingest, transform, enrich, and output.",
    ]

# Generate embeddings for each product. Keep them in an array instead of a vector DB.
product_embeddings = []
for product in products:
product_embeddings.append(create_embedding(client, product))

query_embedding = create_embedding(client, "What can help me connect to a database?")

# Calculate cosine similarity between the query and document embeddings
similarities = []
for product_embedding in product_embeddings:
similarity = np.dot(query_embedding, product_embedding) / (
np.linalg.norm(query_embedding) * np.linalg.norm(product_embedding)
)
similarities.append(similarity)

# Get the index of the most similar document
most_similar_index = np.argmax(similarities)

print(products[most_similar_index])


def create_embedding(client, text):
return client.embeddings.create(input=[text], model=EMBEDDINGS_MODEL, encoding_format="float").data[0].embedding


if __name__ == "__main__":
main()
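
Assuming the script above is saved as `examples/embeddings.py` next to the chat example (its
path is not shown in this diff), it can be run the same way:

```
dotenv -f ollama.env run -- \
opentelemetry-instrument python examples/embeddings.py
```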
@@ -0,0 +1,10 @@
# Env to run the integration tests against a local Ollama.
OPENAI_BASE_URL=http://127.0.0.1:11434/v1
OPENAI_API_KEY=notused

# These models may be substituted in the future with newer variants that are
# inexpensive to run.
TEST_CHAT_MODEL=qwen2.5:0.5b
TEST_EMBEDDINGS_MODEL=all-minilm:33m

OTEL_SERVICE_NAME=elastic-opentelemetry-instrumentation-openai
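
# Note (assumes a local Ollama install): pull these models first, e.g.
#   ollama pull qwen2.5:0.5b
#   ollama pull all-minilm:33m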
@@ -0,0 +1,191 @@
import os
from dataclasses import dataclass
from typing import Optional

import openai
from opentelemetry.instrumentation.openai import OpenAIInstrumentor
from opentelemetry.metrics import Histogram
from vcr.unittest import VCRMixin

OPENAI_API_KEY = "test_openai_api_key"
OPENAI_ORG_ID = "test_openai_org_key"
OPENAI_PROJECT_ID = "test_openai_project_id"

# Use the same model for tools as for chat completion
LOCAL_MODEL = "qwen2.5:0.5b"


@dataclass
class OpenAIEnvironment:
# TODO: add system
operation_name: str = "chat"
model: str = "gpt-4o-mini"
response_model: str = "gpt-4o-mini-2024-07-18"
server_address: str = "api.openai.com"
server_port: int = 443


class OpenaiMixin(VCRMixin):
def _get_vcr_kwargs(self, **kwargs):
"""
This scrubs sensitive data and gunzips bodies when in recording mode.
Without this, you would leak cookies and auth tokens in the cassettes.
Also, depending on the request, some responses would be binary encoded
while others plain json. This ensures all bodies are human-readable.
"""
return {
"decode_compressed_response": True,
"filter_headers": [
("authorization", "Bearer " + OPENAI_API_KEY),
("openai-organization", OPENAI_ORG_ID),
("openai-project", OPENAI_PROJECT_ID),
("cookie", None),
],
"before_record_response": self.scrub_response_headers,
}

@staticmethod
def scrub_response_headers(response):
"""
This scrubs sensitive response headers. Note they are case-sensitive!
"""
response["headers"]["openai-organization"] = OPENAI_ORG_ID
response["headers"]["Set-Cookie"] = "test_set_cookie"
return response

@classmethod
def setup_client(cls):
        # Control the arguments, falling back to test defaults when env vars are unset
return openai.Client(
api_key=os.getenv("OPENAI_API_KEY", OPENAI_API_KEY),
organization=os.getenv("OPENAI_ORG_ID", OPENAI_ORG_ID),
project=os.getenv("OPENAI_PROJECT_ID", OPENAI_PROJECT_ID),
max_retries=1,
)

@classmethod
def setup_environment(cls):
return OpenAIEnvironment()

@classmethod
def setUpClass(cls):
cls.client = cls.setup_client()
cls.openai_env = cls.setup_environment()

def setUp(self):
super().setUp()
OpenAIInstrumentor().instrument()

def tearDown(self):
super().tearDown()
OpenAIInstrumentor().uninstrument()

def assertOperationDurationMetric(self, metric: Histogram):
self.assertEqual(metric.name, "gen_ai.client.operation.duration")
self.assert_metric_expected(
metric,
[
self.create_histogram_data_point(
count=1,
sum_data_point=0.006543334107846022,
max_data_point=0.006543334107846022,
min_data_point=0.006543334107846022,
attributes={
"gen_ai.operation.name": self.openai_env.operation_name,
"gen_ai.request.model": self.openai_env.model,
"gen_ai.response.model": self.openai_env.response_model,
"gen_ai.system": "openai",
"server.address": self.openai_env.server_address,
"server.port": self.openai_env.server_port,
},
),
],
est_value_delta=0.2,
)

    def assertErrorOperationDurationMetric(self, metric: Histogram, attributes: dict, data_point: Optional[float] = None):
self.assertEqual(metric.name, "gen_ai.client.operation.duration")
default_attributes = {
"gen_ai.operation.name": self.openai_env.operation_name,
"gen_ai.request.model": self.openai_env.model,
"gen_ai.system": "openai",
"error.type": "APIConnectionError",
"server.address": "localhost",
"server.port": 9999,
}
if data_point is None:
data_point = 0.8643839359283447
self.assert_metric_expected(
metric,
[
self.create_histogram_data_point(
count=1,
sum_data_point=data_point,
max_data_point=data_point,
min_data_point=data_point,
attributes={**default_attributes, **attributes},
),
],
est_value_delta=0.5,
)

def assertTokenUsageInputMetric(self, metric: Histogram, input_data_point=4):
self.assertEqual(metric.name, "gen_ai.client.token.usage")
self.assert_metric_expected(
metric,
[
self.create_histogram_data_point(
count=1,
sum_data_point=input_data_point,
max_data_point=input_data_point,
min_data_point=input_data_point,
attributes={
"gen_ai.operation.name": self.openai_env.operation_name,
"gen_ai.request.model": self.openai_env.model,
"gen_ai.response.model": self.openai_env.response_model,
"gen_ai.system": "openai",
"server.address": self.openai_env.server_address,
"server.port": self.openai_env.server_port,
"gen_ai.token.type": "input",
},
),
],
)

def assertTokenUsageMetric(self, metric: Histogram, input_data_point=24, output_data_point=4):
self.assertEqual(metric.name, "gen_ai.client.token.usage")
self.assert_metric_expected(
metric,
[
self.create_histogram_data_point(
count=1,
sum_data_point=input_data_point,
max_data_point=input_data_point,
min_data_point=input_data_point,
attributes={
"gen_ai.operation.name": self.openai_env.operation_name,
"gen_ai.request.model": self.openai_env.model,
"gen_ai.response.model": self.openai_env.response_model,
"gen_ai.system": "openai",
"server.address": self.openai_env.server_address,
"server.port": self.openai_env.server_port,
"gen_ai.token.type": "input",
},
),
self.create_histogram_data_point(
count=1,
sum_data_point=output_data_point,
max_data_point=output_data_point,
min_data_point=output_data_point,
attributes={
"gen_ai.operation.name": self.openai_env.operation_name,
"gen_ai.request.model": self.openai_env.model,
"gen_ai.response.model": self.openai_env.response_model,
"gen_ai.system": "openai",
"server.address": self.openai_env.server_address,
"server.port": self.openai_env.server_port,
"gen_ai.token.type": "output",
},
),
],
)