Commit 4fb9a8a

rishasurana, josiahvinson, and Copilot authored
[HealthDataAiService.Deid] 2025-07-15-preview version (#42371)
* Adding new API version support
* Add samples and update READMEs
* Update CHANGELOG.md
* Update sdk/healthdataaiservices/azure-health-deidentification/CHANGELOG.md

  Co-authored-by: Copilot <[email protected]>
* Updating test recordings
* Update README and CHANGELOG
* Update CHANGELOG
* release date
* Fix formatting

---------

Co-authored-by: Josiah Vinson <[email protected]>
Co-authored-by: Copilot <[email protected]>
1 parent 2e3b19a commit 4fb9a8a

33 files changed

+3796 -80 lines changed

sdk/healthdataaiservices/azure-health-deidentification/CHANGELOG.md

Lines changed: 8 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,12 @@
11
# Release History
22

3+
## 1.1.0b1 (2025-08-05)
4+
5+
### Features Added
6+
7+
- Added `SURROGATE_ONLY` operation type in `DeidentificationOperationType`, which returns output text where user-defined PHI entities are replaced with realistic replacement values.
8+
- Added `input_locale` parameter to `DeidentificationCustomizationOptions` to allow for specifying the locale of the input text for `TAG` and `REDACT` operations.
9+
310
## 1.0.0 (2025-05-19)
411

512
### Features Added
@@ -33,4 +40,4 @@
3340

3441
### Features Added
3542

36-
- Azure Health Deidentification client library
43+
- Azure Health Deidentification client library

sdk/healthdataaiservices/azure-health-deidentification/README.md

Lines changed: 51 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -4,7 +4,7 @@ This package contains a client library for the de-identification service in Azur
44
enables users to tag, redact, or surrogate health data containing Protected Health Information (PHI).
55
For more on service functionality and important usage considerations, see [the de-identification service overview][product_documentation].
66

7-
This library support API versions `2024-11-15` and earlier.
7+
This library supports API versions `2025-07-15-preview` and earlier.
88

99
Use the client library for the de-identification service to:
1010
- Discover PHI in unstructured text
@@ -72,10 +72,11 @@ client = DeidentificationClient(endpoint, credential)
7272
## Key concepts
7373

7474
### De-identification operations:
75-
Given an input text, the de-identification service can perform three main operations:
75+
Given an input text, the de-identification service can perform four main operations:
7676
- `Tag` returns the category and location within the text of detected PHI entities.
77-
- `Redact` returns output text where detected PHI entities are replaced with placeholder text. For example `John` replaced with `[name]`.
77+
- `Redact` returns output text where detected PHI entities are replaced with placeholder text. For example, `John` would be replaced with `[name]`.
7878
- `Surrogate` returns output text where detected PHI entities are replaced with realistic replacement values. For example, `My name is John Smith` could become `My name is Tom Jones`.
79+
- `SurrogateOnly` returns output text where user-defined PHI entities are replaced with realistic replacement values.
7980

8081
### String Encoding
8182
When using the `Tag` operation, the service will return the locations of PHI entities in the input text. These locations will be represented as offsets and lengths, each of which is a [StringIndex][string_index] containing
@@ -197,6 +198,7 @@ The following sections provide code samples covering some of the most common cli
197198
- [Discover PHI in unstructured text](#discover-phi-in-unstructured-text)
198199
- [Replace PHI in unstructured text with placeholder values](#replace-phi-in-unstructured-text-with-placeholder-values)
199200
- [Replace PHI in unstructured text with realistic surrogate values](#replace-phi-in-unstructured-text-with-realistic-surrogate-values)
201+
- [Replace only specific PHI entities with surrogate values](#replace-only-specific-phi-entities-with-surrogate-values)
200202

201203
See the [samples][samples] for code files illustrating common patterns, including creating and managing jobs to de-identify documents in Azure Storage.
202204

@@ -292,6 +294,52 @@ print(f'Surrogated Text: "{result.output_text}"') # Surrogated output: Hello,
292294

293295
<!-- END SNIPPET -->
294296

297+
### Replace only specific PHI entities with surrogate values
298+
The `SURROGATE_ONLY` operation returns output text where user-defined PHI entities are replaced with realistic replacement values.
299+
<!-- SNIPPET: deidentify_text_surrogate_only.surrogate_only -->
300+
301+
```python
302+
from azure.health.deidentification import DeidentificationClient
303+
from azure.health.deidentification.models import (
304+
DeidentificationContent,
305+
DeidentificationCustomizationOptions,
306+
DeidentificationOperationType,
307+
DeidentificationResult,
308+
PhiCategory,
309+
SimplePhiEntity,
310+
TaggedPhiEntities,
311+
TextEncodingType,
312+
)
313+
from azure.identity import DefaultAzureCredential
314+
import os
315+
316+
endpoint = os.environ["HEALTHDATAAISERVICES_DEID_SERVICE_ENDPOINT"]
317+
credential = DefaultAzureCredential()
318+
client = DeidentificationClient(endpoint, credential)
319+
320+
# Define the entities to be surrogated - targeting "John Smith" at position 18-28
321+
tagged_entities = TaggedPhiEntities(
322+
encoding=TextEncodingType.CODE_POINT,
323+
entities=[SimplePhiEntity(category=PhiCategory.PATIENT, offset=18, length=10)],
324+
)
325+
326+
# Use SurrogateOnly operation with input locale specification
327+
body = DeidentificationContent(
328+
input_text="Hello, my name is John Smith.",
329+
operation_type=DeidentificationOperationType.SURROGATE_ONLY,
330+
tagged_entities=tagged_entities,
331+
customizations=DeidentificationCustomizationOptions(
332+
input_locale="en-US" # Specify input text locale for better PHI detection
333+
),
334+
)
335+
336+
result: DeidentificationResult = client.deidentify_text(body)
337+
print(f'\nOriginal Text: "{body.input_text}"')
338+
print(f'Surrogate Only Text: "{result.output_text}"') # Surrogated output: Hello, my name is <synthetic name>.
339+
```
340+
341+
<!-- END SNIPPET -->
342+
295343
### Troubleshooting
296344
The `DeidentificationClient` raises various `AzureError` [exceptions][azure_error]. For example, if you
297345
provide an invalid service URL, an `ServiceRequestError` would be raised with a message indicating the failure cause.
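
The new README sample hard-codes `offset=18, length=10` for "John Smith" with `TextEncodingType.CODE_POINT`. The following is a minimal sketch, not part of the commit, of how such spans could be derived locally instead of counted by hand. The helper names `code_point_span` and `utf16_span` are hypothetical, and the UTF-16 variant assumes the service counts UTF-16 code units when a UTF-16 encoding is selected.

```python
from typing import Tuple


def code_point_span(text: str, target: str) -> Tuple[int, int]:
    """Return (offset, length) of the first occurrence of target, in Unicode code points."""
    # Python strings are indexed by code point, so str.index already gives the offset.
    offset = text.index(target)
    return offset, len(target)


def utf16_span(text: str, target: str) -> Tuple[int, int]:
    """Return (offset, length) of the first occurrence of target, in UTF-16 code units."""
    start = text.index(target)
    # Each UTF-16 code unit is two bytes in the little-endian encoding (no BOM).
    offset = len(text[:start].encode("utf-16-le")) // 2
    length = len(target.encode("utf-16-le")) // 2
    return offset, length


sample = "Hello, my name is John Smith."
print(code_point_span(sample, "John Smith"))  # (18, 10)
```

For ASCII-only input the two helpers agree. They diverge once the text before the entity contains characters outside the Basic Multilingual Plane, which is why the sample passes an explicit `encoding` alongside the entity list.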
Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,3 @@
1+
{
2+
"apiVersion": "2025-07-15-preview"
3+
}

sdk/healthdataaiservices/azure-health-deidentification/apiview-properties.json

Lines changed: 9 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -11,16 +11,24 @@
1111
"azure.health.deidentification.models.DeidentificationResult": "HealthDataAIServices.DeidServices.DeidentificationResult",
1212
"azure.health.deidentification.models.PhiEntity": "HealthDataAIServices.DeidServices.PhiEntity",
1313
"azure.health.deidentification.models.PhiTaggerResult": "HealthDataAIServices.DeidServices.PhiTaggerResult",
14+
"azure.health.deidentification.models.SimplePhiEntity": "HealthDataAIServices.DeidServices.SimplePhiEntity",
1415
"azure.health.deidentification.models.SourceStorageLocation": "HealthDataAIServices.DeidServices.SourceStorageLocation",
1516
"azure.health.deidentification.models.StringIndex": "HealthDataAIServices.DeidServices.StringIndex",
17+
"azure.health.deidentification.models.TaggedPhiEntities": "HealthDataAIServices.DeidServices.TaggedPhiEntities",
1618
"azure.health.deidentification.models.TargetStorageLocation": "HealthDataAIServices.DeidServices.TargetStorageLocation",
1719
"azure.health.deidentification.models.DeidentificationOperationType": "HealthDataAIServices.DeidServices.DeidentificationOperationType",
1820
"azure.health.deidentification.models.OperationStatus": "Azure.Core.Foundations.OperationState",
1921
"azure.health.deidentification.models.PhiCategory": "HealthDataAIServices.DeidServices.PhiCategory",
22+
"azure.health.deidentification.models.TextEncodingType": "HealthDataAIServices.DeidServices.TextEncodingType",
2023
"azure.health.deidentification.DeidentificationClient.get_job": "HealthDataAIServices.DeidServices.getJob",
24+
"azure.health.deidentification.aio.DeidentificationClient.get_job": "HealthDataAIServices.DeidServices.getJob",
2125
"azure.health.deidentification.DeidentificationClient.begin_deidentify_documents": "HealthDataAIServices.DeidServices.deidentifyDocuments",
26+
"azure.health.deidentification.aio.DeidentificationClient.begin_deidentify_documents": "HealthDataAIServices.DeidServices.deidentifyDocuments",
2227
"azure.health.deidentification.DeidentificationClient.cancel_job": "HealthDataAIServices.DeidServices.cancelJob",
28+
"azure.health.deidentification.aio.DeidentificationClient.cancel_job": "HealthDataAIServices.DeidServices.cancelJob",
2329
"azure.health.deidentification.DeidentificationClient.delete_job": "HealthDataAIServices.DeidServices.deleteJob",
24-
"azure.health.deidentification.DeidentificationClient.deidentify_text": "HealthDataAIServices.DeidServices.deidentifyText"
30+
"azure.health.deidentification.aio.DeidentificationClient.delete_job": "HealthDataAIServices.DeidServices.deleteJob",
31+
"azure.health.deidentification.DeidentificationClient.deidentify_text": "HealthDataAIServices.DeidServices.deidentifyText",
32+
"azure.health.deidentification.aio.DeidentificationClient.deidentify_text": "HealthDataAIServices.DeidServices.deidentifyText"
2533
}
2634
}

sdk/healthdataaiservices/azure-health-deidentification/assets.json

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -2,5 +2,5 @@
22
"AssetsRepo": "Azure/azure-sdk-assets",
33
"AssetsRepoPrefixPath": "python",
44
"TagPrefix": "python/healthdataaiservices/azure-health-deidentification",
5-
"Tag": "python/healthdataaiservices/azure-health-deidentification_a9eda6ed27"
5+
"Tag": "python/healthdataaiservices/azure-health-deidentification_db46426ffa"
66
}

sdk/healthdataaiservices/azure-health-deidentification/azure/health/deidentification/_client.py

Lines changed: 6 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -15,22 +15,23 @@
1515
from azure.core.rest import HttpRequest, HttpResponse
1616

1717
from ._configuration import DeidentificationClientConfiguration
18-
from ._operations import DeidentificationClientOperationsMixin
19-
from ._serialization import Deserializer, Serializer
18+
from ._operations import _DeidentificationClientOperationsMixin
19+
from ._utils.serialization import Deserializer, Serializer
2020

2121
if TYPE_CHECKING:
2222
from azure.core.credentials import TokenCredential
2323

2424

25-
class DeidentificationClient(DeidentificationClientOperationsMixin):
25+
class DeidentificationClient(_DeidentificationClientOperationsMixin):
2626
"""DeidentificationClient.
2727
2828
:param endpoint: Url of your De-identification Service. Required.
2929
:type endpoint: str
3030
:param credential: Credential used to authenticate requests to the service. Required.
3131
:type credential: ~azure.core.credentials.TokenCredential
32-
:keyword api_version: The API version to use for this operation. Default value is "2024-11-15".
33-
Note that overriding this default value may result in unsupported behavior.
32+
:keyword api_version: The API version to use for this operation. Default value is
33+
"2025-07-15-preview". Note that overriding this default value may result in unsupported
34+
behavior.
3435
:paramtype api_version: str
3536
:keyword int polling_interval: Default waiting time between two polls for LRO operations if no
3637
Retry-After header is present.

sdk/healthdataaiservices/azure-health-deidentification/azure/health/deidentification/_configuration.py

Lines changed: 4 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -26,13 +26,14 @@ class DeidentificationClientConfiguration: # pylint: disable=too-many-instance-
2626
:type endpoint: str
2727
:param credential: Credential used to authenticate requests to the service. Required.
2828
:type credential: ~azure.core.credentials.TokenCredential
29-
:keyword api_version: The API version to use for this operation. Default value is "2024-11-15".
30-
Note that overriding this default value may result in unsupported behavior.
29+
:keyword api_version: The API version to use for this operation. Default value is
30+
"2025-07-15-preview". Note that overriding this default value may result in unsupported
31+
behavior.
3132
:paramtype api_version: str
3233
"""
3334

3435
def __init__(self, endpoint: str, credential: "TokenCredential", **kwargs: Any) -> None:
35-
api_version: str = kwargs.pop("api_version", "2024-11-15")
36+
api_version: str = kwargs.pop("api_version", "2025-07-15-preview")
3637

3738
if endpoint is None:
3839
raise ValueError("Parameter 'endpoint' must not be None.")
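
Because the default now comes from `kwargs.pop("api_version", "2025-07-15-preview")`, callers that passed no `api_version` move to the preview version after upgrading. The sketch below illustrates that pattern in isolation, assuming only what the diff shows. `SketchConfiguration` is a hypothetical stand-in, not the generated `DeidentificationClientConfiguration`, and the endpoint URL is a placeholder.

```python
from typing import Any

DEFAULT_API_VERSION = "2025-07-15-preview"  # default introduced by this commit


class SketchConfiguration:
    """Hypothetical stand-in that mirrors the kwargs.pop default shown in the diff."""

    def __init__(self, endpoint: str, **kwargs: Any) -> None:
        # An explicit api_version keyword wins; otherwise the new default applies.
        api_version: str = kwargs.pop("api_version", DEFAULT_API_VERSION)
        if endpoint is None:
            raise ValueError("Parameter 'endpoint' must not be None.")
        self.endpoint = endpoint
        self.api_version = api_version


default_config = SketchConfiguration("https://example.invalid")
pinned_config = SketchConfiguration("https://example.invalid", api_version="2024-11-15")
print(default_config.api_version)  # 2025-07-15-preview
print(pinned_config.api_version)   # 2024-11-15
```

Per the docstring in the diff, overriding the default "may result in unsupported behavior", so pinning the previous version is a choice to validate against the service rather than a guaranteed fallback.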

sdk/healthdataaiservices/azure-health-deidentification/azure/health/deidentification/_model_base.py

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,4 +1,4 @@
1-
# pylint: disable=too-many-lines
1+
# pylint: disable=too-many-lines,line-too-long,useless-suppression
22
# coding=utf-8
33
# --------------------------------------------------------------------------
44
# Copyright (c) Microsoft Corporation. All rights reserved.

sdk/healthdataaiservices/azure-health-deidentification/azure/health/deidentification/_operations/__init__.py

Lines changed: 2 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -12,14 +12,12 @@
1212
if TYPE_CHECKING:
1313
from ._patch import * # pylint: disable=unused-wildcard-import
1414

15-
from ._operations import DeidentificationClientOperationsMixin # type: ignore
15+
from ._operations import _DeidentificationClientOperationsMixin # type: ignore # pylint: disable=unused-import
1616

1717
from ._patch import __all__ as _patch_all
1818
from ._patch import *
1919
from ._patch import patch_sdk as _patch_sdk
2020

21-
__all__ = [
22-
"DeidentificationClientOperationsMixin",
23-
]
21+
__all__ = []
2422
__all__.extend([p for p in _patch_all if p not in __all__]) # pyright: ignore
2523
_patch_sdk()

sdk/healthdataaiservices/azure-health-deidentification/azure/health/deidentification/_operations/_operations.py

Lines changed: 18 additions & 14 deletions
Original file line numberDiff line numberDiff line change
@@ -8,9 +8,10 @@
88
from collections.abc import MutableMapping
99
from io import IOBase
1010
import json
11-
from typing import Any, Callable, Dict, IO, Iterable, Iterator, List, Optional, TypeVar, Union, cast, overload
11+
from typing import Any, Callable, Dict, IO, Iterator, List, Optional, TypeVar, Union, cast, overload
1212
import urllib.parse
1313

14+
from azure.core import PipelineClient
1415
from azure.core.exceptions import (
1516
ClientAuthenticationError,
1617
HttpResponseError,
@@ -30,9 +31,10 @@
3031
from azure.core.utils import case_insensitive_dict
3132

3233
from .. import models as _models
33-
from .._model_base import SdkJSONEncoder, _deserialize
34-
from .._serialization import Serializer
35-
from .._vendor import DeidentificationClientMixinABC
34+
from .._configuration import DeidentificationClientConfiguration
35+
from .._utils.model_base import SdkJSONEncoder, _deserialize
36+
from .._utils.serialization import Serializer
37+
from .._utils.utils import ClientMixinABC
3638

3739
JSON = MutableMapping[str, Any]
3840
T = TypeVar("T")
@@ -46,7 +48,7 @@ def build_deidentification_get_job_request(job_name: str, **kwargs: Any) -> Http
4648
_headers = case_insensitive_dict(kwargs.pop("headers", {}) or {})
4749
_params = case_insensitive_dict(kwargs.pop("params", {}) or {})
4850

49-
api_version: str = kwargs.pop("api_version", _params.pop("api-version", "2024-11-15"))
51+
api_version: str = kwargs.pop("api_version", _params.pop("api-version", "2025-07-15-preview"))
5052
accept = _headers.pop("Accept", "application/json")
5153

5254
# Construct URL
@@ -73,7 +75,7 @@ def build_deidentification_deidentify_documents_request( # pylint: disable=name
7375
_params = case_insensitive_dict(kwargs.pop("params", {}) or {})
7476

7577
content_type: Optional[str] = kwargs.pop("content_type", _headers.pop("Content-Type", None))
76-
api_version: str = kwargs.pop("api_version", _params.pop("api-version", "2024-11-15"))
78+
api_version: str = kwargs.pop("api_version", _params.pop("api-version", "2025-07-15-preview"))
7779
accept = _headers.pop("Accept", "application/json")
7880

7981
# Construct URL
@@ -101,7 +103,7 @@ def build_deidentification_list_jobs_internal_request( # pylint: disable=name-t
101103
_headers = case_insensitive_dict(kwargs.pop("headers", {}) or {})
102104
_params = case_insensitive_dict(kwargs.pop("params", {}) or {})
103105

104-
api_version: str = kwargs.pop("api_version", _params.pop("api-version", "2024-11-15"))
106+
api_version: str = kwargs.pop("api_version", _params.pop("api-version", "2025-07-15-preview"))
105107
accept = _headers.pop("Accept", "application/json")
106108

107109
# Construct URL
@@ -132,7 +134,7 @@ def build_deidentification_list_job_documents_internal_request( # pylint: disab
132134
_headers = case_insensitive_dict(kwargs.pop("headers", {}) or {})
133135
_params = case_insensitive_dict(kwargs.pop("params", {}) or {})
134136

135-
api_version: str = kwargs.pop("api_version", _params.pop("api-version", "2024-11-15"))
137+
api_version: str = kwargs.pop("api_version", _params.pop("api-version", "2025-07-15-preview"))
136138
accept = _headers.pop("Accept", "application/json")
137139

138140
# Construct URL
@@ -164,7 +166,7 @@ def build_deidentification_cancel_job_request( # pylint: disable=name-too-long
164166
_headers = case_insensitive_dict(kwargs.pop("headers", {}) or {})
165167
_params = case_insensitive_dict(kwargs.pop("params", {}) or {})
166168

167-
api_version: str = kwargs.pop("api_version", _params.pop("api-version", "2024-11-15"))
169+
api_version: str = kwargs.pop("api_version", _params.pop("api-version", "2025-07-15-preview"))
168170
accept = _headers.pop("Accept", "application/json")
169171

170172
# Construct URL
@@ -190,7 +192,7 @@ def build_deidentification_delete_job_request( # pylint: disable=name-too-long
190192
_headers = case_insensitive_dict(kwargs.pop("headers", {}) or {})
191193
_params = case_insensitive_dict(kwargs.pop("params", {}) or {})
192194

193-
api_version: str = kwargs.pop("api_version", _params.pop("api-version", "2024-11-15"))
195+
api_version: str = kwargs.pop("api_version", _params.pop("api-version", "2025-07-15-preview"))
194196
accept = _headers.pop("Accept", "application/json")
195197

196198
# Construct URL
@@ -215,7 +217,7 @@ def build_deidentification_deidentify_text_request(**kwargs: Any) -> HttpRequest
215217
_params = case_insensitive_dict(kwargs.pop("params", {}) or {})
216218

217219
content_type: Optional[str] = kwargs.pop("content_type", _headers.pop("Content-Type", None))
218-
api_version: str = kwargs.pop("api_version", _params.pop("api-version", "2024-11-15"))
220+
api_version: str = kwargs.pop("api_version", _params.pop("api-version", "2025-07-15-preview"))
219221
accept = _headers.pop("Accept", "application/json")
220222

221223
# Construct URL
@@ -232,7 +234,9 @@ def build_deidentification_deidentify_text_request(**kwargs: Any) -> HttpRequest
232234
return HttpRequest(method="POST", url=_url, params=_params, headers=_headers, **kwargs)
233235

234236

235-
class DeidentificationClientOperationsMixin(DeidentificationClientMixinABC):
237+
class _DeidentificationClientOperationsMixin(
238+
ClientMixinABC[PipelineClient[HttpRequest, HttpResponse], DeidentificationClientConfiguration]
239+
):
236240

237241
@distributed_trace
238242
def get_job(self, job_name: str, **kwargs: Any) -> _models.DeidentificationJob:
@@ -518,7 +522,7 @@ def get_long_running_output(pipeline_response):
518522
@distributed_trace
519523
def _list_jobs_internal(
520524
self, *, continuation_token_parameter: Optional[str] = None, **kwargs: Any
521-
) -> Iterable["_models.DeidentificationJob"]:
525+
) -> ItemPaged["_models.DeidentificationJob"]:
522526
"""List de-identification jobs.
523527
524528
Resource list operation template.
@@ -610,7 +614,7 @@ def get_next(next_link=None):
610614
@distributed_trace
611615
def _list_job_documents_internal(
612616
self, job_name: str, *, continuation_token_parameter: Optional[str] = None, **kwargs: Any
613-
) -> Iterable["_models.DeidentificationDocumentDetails"]:
617+
) -> ItemPaged["_models.DeidentificationDocumentDetails"]:
614618
"""List processed documents within a job.
615619
616620
Resource list operation template.
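
Each request builder in this file resolves the API version with the same nested `pop` expression. The sketch below reproduces that precedence outside the SDK, as an illustration rather than the generated code. `resolve_api_version` is a hypothetical helper, a plain `dict` stands in for `case_insensitive_dict`, and writing the resolved value back into the query parameters is an assumption about what the builders do after the lines shown in the diff.

```python
from typing import Any, Dict, Tuple

DEFAULT_API_VERSION = "2025-07-15-preview"


def resolve_api_version(**kwargs: Any) -> Tuple[str, Dict[str, Any]]:
    """Resolve the API version in the order: keyword, then query parameter, then default."""
    _params = dict(kwargs.pop("params", {}) or {})
    # The inner pop runs first, so "api-version" is always removed from _params,
    # even when the api_version keyword ends up taking precedence.
    api_version: str = kwargs.pop("api_version", _params.pop("api-version", DEFAULT_API_VERSION))
    # Assumption: the resolved value is written back as the query parameter.
    _params["api-version"] = api_version
    return api_version, _params


print(resolve_api_version())
print(resolve_api_version(params={"api-version": "2024-11-15"}))
print(resolve_api_version(api_version="2024-11-15", params={"api-version": "ignored"}))
```

The practical consequence is that an explicit `api_version` keyword overrides a conflicting `api-version` entry in `params`, and only requests that supply neither pick up the new preview default.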
