
Commit 005b1c3

Authored by Marko Hietala

inference instrumentor to inference library (Azure#37846)

* inference instrumentor to inference library
* updating samples readme
* updating changelog
* fixing tests
* pyright related fixes
* fixing pyright comments that another tool broke
* pylint issue
* tracing into own module
* fixing verifytypes errors

Co-authored-by: Marko Hietala <[email protected]>
1 parent 897ce60 commit 005b1c3

17 files changed (+2208, -17 lines)

.vscode/cspell.json (3 additions, 0 deletions)

```diff
@@ -406,6 +406,9 @@
     "uamqp",
     "uksouth",
     "ukwest",
+    "uninstrument",
+    "uninstrumented",
+    "uninstrumenting",
     "unpad",
     "unpadder",
     "unpartial",
```

sdk/ai/azure-ai-inference/CHANGELOG.md (2 additions, 0 deletions)

```diff
@@ -4,6 +4,8 @@
 
 ### Features Added
 
+* Support for tracing. Please find more information in the package [README.md](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/ai/azure-ai-inference/README.md).
+
 ### Breaking Changes
 
 ### Bugs Fixed
```

sdk/ai/azure-ai-inference/README.md (89 additions, 0 deletions)

````diff
@@ -57,6 +57,14 @@ To update an existing installation of the package, use:
 pip install --upgrade azure-ai-inference
 ```
 
+If you want to install the Azure AI Inference package with support for OpenTelemetry-based tracing, use the following command:
+
+```bash
+pip install azure-ai-inference[trace]
+```
+
+
 ## Key concepts
 
 ### Create and authenticate a client directly, using API key or GitHub token
````
````diff
@@ -530,6 +538,87 @@ For more information, see [Configure logging in the Azure libraries for Python](
 
 To report issues with the client library, or request additional features, please open a GitHub issue [here](https://github.com/Azure/azure-sdk-for-python/issues)
 
+## Tracing
+
+The Azure AI Inference tracing module provides tracing for the Azure AI Inference client library for Python. Refer to the Installation section above for installation instructions.
+
+### Setup
+
+The environment variable AZURE_TRACING_GEN_AI_CONTENT_RECORDING_ENABLED controls whether message contents are recorded in traces. By default, message contents are not recorded. When message content recording is disabled, the function names, parameter names, and parameter values of function call tools are also not recorded. Set the environment variable to "true" (case insensitive) to record message contents; any other value leaves recording disabled.
+
+You also need to configure the tracing implementation in your code, either by setting the environment variable `AZURE_SDK_TRACING_IMPLEMENTATION` to `opentelemetry` or with the following snippet:
+
+<!-- SNIPPET:sample_chat_completions_with_tracing.trace_setting -->
+
+```python
+from azure.core.settings import settings
+settings.tracing_implementation = "opentelemetry"
+```
+
+<!-- END SNIPPET -->
+
+Please refer to the [azure-core-tracing documentation](https://learn.microsoft.com/python/api/overview/azure/core-tracing-opentelemetry-readme) for more information.
+
+### Exporting Traces with OpenTelemetry
+
+Azure AI Inference is instrumented with OpenTelemetry. To enable tracing, you need to configure OpenTelemetry to export traces to your observability backend.
+Refer to [Azure SDK tracing in Python](https://learn.microsoft.com/python/api/overview/azure/core-tracing-opentelemetry-readme?view=azure-python-preview) for more details.
+
+Refer to the [Azure Monitor OpenTelemetry documentation](https://learn.microsoft.com/azure/azure-monitor/app/opentelemetry-enable?tabs=python) for details on how to send Azure AI Inference traces to Azure Monitor and create an Azure Monitor resource.
+
+### Instrumentation
+
+Use the AIInferenceInstrumentor to instrument the Azure AI Inference API for LLM tracing; this causes LLM traces to be emitted by the Azure AI Inference API.
+
+<!-- SNIPPET:sample_chat_completions_with_tracing.instrument_inferencing -->
+
+```python
+from azure.ai.inference.tracing import AIInferenceInstrumentor
+# Instrument AI Inference API
+AIInferenceInstrumentor().instrument()
+```
+
+<!-- END SNIPPET -->
+
+It is also possible to uninstrument the Azure AI Inference API with the uninstrument call. After this call, traces are no longer emitted by the Azure AI Inference API until instrument is called again.
+
+<!-- SNIPPET:sample_chat_completions_with_tracing.uninstrument_inferencing -->
+
+```python
+AIInferenceInstrumentor().uninstrument()
+```
+
+<!-- END SNIPPET -->
+
+### Tracing Your Own Functions
+
+The @tracer.start_as_current_span decorator can be used to trace your own functions. It records the function parameters and their values, and you can add further attributes to the span in the function implementation, as demonstrated below. Note that you have to set up the tracer in your code before using the decorator. More information is available [here](https://opentelemetry.io/docs/languages/python/).
+
+<!-- SNIPPET:sample_chat_completions_with_tracing.trace_function -->
+
+```python
+from opentelemetry import trace
+from opentelemetry.trace import get_tracer
+tracer = get_tracer(__name__)
+
+# The tracer.start_as_current_span decorator will trace the function call and enable adding additional attributes
+# to the span in the function implementation. Note that this will trace the function parameters and their values.
+@tracer.start_as_current_span("get_temperature")  # type: ignore
+def get_temperature(city: str) -> str:
+
+    # Adding attributes to the current span
+    span = trace.get_current_span()
+    span.set_attribute("requested_city", city)
+
+    if city == "Seattle":
+        return "75"
+    elif city == "New York City":
+        return "80"
+    else:
+        return "Unavailable"
+```
+
+<!-- END SNIPPET -->
+
 ## Next steps
 
 * Have a look at the [Samples](https://github.com/Azure/azure-sdk-for-python/tree/main/sdk/ai/azure-ai-inference/samples) folder, containing fully runnable Python code for doing inference using synchronous and asynchronous clients.
````
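The content-recording rule documented in the README addition above (only the literal string "true", case-insensitively, enables recording) can be sketched with a small stdlib helper; the function name here is hypothetical, not part of the SDK:

```python
import os

def content_recording_enabled() -> bool:
    # Documented rule: only the literal string "true" (case-insensitive)
    # enables message content recording; any other value, or an unset
    # variable, leaves it disabled.
    value = os.environ.get("AZURE_TRACING_GEN_AI_CONTENT_RECORDING_ENABLED", "")
    return value.lower() == "true"

os.environ["AZURE_TRACING_GEN_AI_CONTENT_RECORDING_ENABLED"] = "True"
print(content_recording_enabled())  # True

os.environ["AZURE_TRACING_GEN_AI_CONTENT_RECORDING_ENABLED"] = "yes"
print(content_recording_enabled())  # False
```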

sdk/ai/azure-ai-inference/assets.json (1 addition, 1 deletion)

```diff
@@ -2,5 +2,5 @@
   "AssetsRepo": "Azure/azure-sdk-assets",
   "AssetsRepoPrefixPath": "python",
   "TagPrefix": "python/ai/azure-ai-inference",
-  "Tag": "python/ai/azure-ai-inference_498e85cbfd"
+  "Tag": "python/ai/azure-ai-inference_19a0adafc6"
 }
```

sdk/ai/azure-ai-inference/azure/ai/inference/_patch.py (3 additions, 3 deletions)

```diff
@@ -102,8 +102,8 @@ def load_client(
             "The AI model information is missing a value for `model type`. Cannot create an appropriate client."
         )
 
-    # TODO: Remove "completions" and "embedding" once Mistral Large and Cohere fix their model type
-    if model_info.model_type in (_models.ModelType.CHAT, "completion"):
+    # TODO: Remove "completions", "chat-completions" and "embedding" once Mistral Large and Cohere fix their model type
+    if model_info.model_type in (_models.ModelType.CHAT, "completion", "chat-completion", "chat-completions"):
         chat_completion_client = ChatCompletionsClient(endpoint, credential, **kwargs)
         chat_completion_client._model_info = (  # pylint: disable=protected-access,attribute-defined-outside-init
             model_info
@@ -454,7 +454,7 @@ def complete(
         :raises ~azure.core.exceptions.HttpResponseError:
         """
 
-    @distributed_trace
+    # pylint:disable=client-method-missing-tracing-decorator
     def complete(
         self,
         body: Union[JSON, IO[bytes]] = _Unset,
```
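The model-type fallback added to load_client above reduces to a membership test that accepts the enum member plus the temporary string aliases some providers report today. A minimal sketch, using a hypothetical stand-in enum (the real ModelType and its values may differ):

```python
from enum import Enum

# Hypothetical stand-in for azure.ai.inference.models.ModelType; the real
# enum and its member values may differ.
class ModelType(str, Enum):
    CHAT = "chat_completion"
    EMBEDDINGS = "embeddings"

def needs_chat_client(model_type) -> bool:
    # Same pattern as the patched check: accept the canonical enum member
    # plus the provider-reported string aliases slated for removal.
    return model_type in (ModelType.CHAT, "completion", "chat-completion", "chat-completions")

print(needs_chat_client(ModelType.CHAT))      # True
print(needs_chat_client("chat-completions"))  # True
print(needs_chat_client("embeddings"))        # False
```

Because ModelType mixes in str, the same membership test works whether the caller passes an enum member or a plain string.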

sdk/ai/azure-ai-inference/azure/ai/inference/aio/_patch.py (2 additions, 2 deletions)

```diff
@@ -87,7 +87,7 @@ async def load_client(
         )
 
     # TODO: Remove "completions" and "embedding" once Mistral Large and Cohere fix their model type
-    if model_info.model_type in (_models.ModelType.CHAT, "completion"):
+    if model_info.model_type in (_models.ModelType.CHAT, "completion", "chat-completion", "chat-completions"):
         chat_completion_client = ChatCompletionsClient(endpoint, credential, **kwargs)
         chat_completion_client._model_info = (  # pylint: disable=protected-access,attribute-defined-outside-init
             model_info
@@ -437,7 +437,7 @@ async def complete(
         :raises ~azure.core.exceptions.HttpResponseError:
         """
 
-    @distributed_trace_async
+    # pylint:disable=client-method-missing-tracing-decorator-async
     async def complete(
         self,
         body: Union[JSON, IO[bytes]] = _Unset,
```

sdk/ai/azure-ai-inference/azure/ai/inference/models/_patch.py (5 additions, 5 deletions)

```diff
@@ -14,7 +14,7 @@
 import re
 import sys
 
-from typing import List, AsyncIterator, Iterator, Optional, Union
+from typing import Any, List, AsyncIterator, Iterator, Optional, Union
 from azure.core.rest import HttpResponse, AsyncHttpResponse
 from ._models import ImageUrl as ImageUrlGenerated
 from ._models import ChatCompletions as ChatCompletionsGenerated
@@ -200,7 +200,7 @@ def __init__(self, response: HttpResponse):
         self._response = response
         self._bytes_iterator: Iterator[bytes] = response.iter_bytes()
 
-    def __iter__(self):
+    def __iter__(self) -> Any:
         return self
 
     def __next__(self) -> "_models.StreamingChatCompletionsUpdate":
@@ -220,7 +220,7 @@ def _read_next_block(self) -> bool:
             return True
         return self._deserialize_and_add_to_queue(element)
 
-    def __exit__(self, exc_type, exc_val, exc_tb) -> None:
+    def __exit__(self, exc_type: Any, exc_val: Any, exc_tb: Any) -> None:  # type: ignore
         self.close()
 
     def close(self) -> None:
@@ -239,7 +239,7 @@ def __init__(self, response: AsyncHttpResponse):
         self._response = response
         self._bytes_iterator: AsyncIterator[bytes] = response.iter_bytes()
 
-    def __aiter__(self):
+    def __aiter__(self) -> Any:
         return self
 
     async def __anext__(self) -> "_models.StreamingChatCompletionsUpdate":
@@ -259,7 +259,7 @@ async def _read_next_block_async(self) -> bool:
             return True
         return self._deserialize_and_add_to_queue(element)
 
-    def __exit__(self, exc_type, exc_val, exc_tb) -> None:
+    def __exit__(self, exc_type: Any, exc_val: Any, exc_tb: Any) -> None:  # type: ignore
         asyncio.run(self.aclose())
 
     async def aclose(self) -> None:
```
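The typed __exit__ signatures added above follow the standard context-manager protocol: release the underlying response whether or not the with block raised. A self-contained sketch of the same synchronous streaming shape, with illustrative names rather than the SDK's:

```python
from typing import Any, Iterator, List

# Minimal sketch of the sync streaming pattern in models/_patch.py: iterate
# over raw byte chunks, yield decoded updates, and release the underlying
# source on exit. Class and attribute names here are illustrative.
class StreamingUpdates:
    def __init__(self, chunks: List[bytes]) -> None:
        self._bytes_iterator: Iterator[bytes] = iter(chunks)
        self.closed = False

    def __iter__(self) -> Any:
        return self

    def __next__(self) -> str:
        # Raises StopIteration when the chunk source is exhausted.
        return next(self._bytes_iterator).decode("utf-8")

    def __enter__(self) -> "StreamingUpdates":
        return self

    def __exit__(self, exc_type: Any, exc_val: Any, exc_tb: Any) -> None:
        self.close()

    def close(self) -> None:
        self.closed = True

with StreamingUpdates([b"hello", b"world"]) as stream:
    print(list(stream))  # ['hello', 'world']
print(stream.closed)  # True
```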

0 commit comments