Commit 90fc4b2

[csharp sdk] TextAnalytics 20251115preview (#54033)
* initiate first version of sdk
* adjust old tests
* add new live tests for new features
* updated samples
* updated record tag
* removed the separate sln
* updated tag
* revert tsp-location in Conv authoring
* run Export-API.ps1
* updated general readme
* update commit id
* update samples
* updated snippets
* update changelog
1 parent 452b66d commit 90fc4b2

58 files changed: +2713 -277 lines changed


sdk/cognitivelanguage/Azure.AI.Language.Text/Azure.AI.Language.Text.sln

Lines changed: 0 additions & 37 deletions
This file was deleted.

sdk/cognitivelanguage/Azure.AI.Language.Text/CHANGELOG.md

Lines changed: 7 additions & 6 deletions
@@ -1,14 +1,15 @@
 # Release History

-## 1.0.0-beta.4 (Unreleased)
+## 1.0.0-beta.4 (2025-11-20)

-### Features Added
-
-### Breaking Changes
+This version of the client library defaults to the service API version `2025-11-15-preview`.

-### Bugs Fixed
+### Features Added

-### Other Changes
+- Added support for **multiple redaction policies** in a single request.
+- Added **synthetic replacement redaction**, enabling selected PII types (e.g., Person, Email) to be replaced with realistic synthetic values rather than masked.
+- Added **confidence score thresholding**, allowing customers to define minimum confidence levels—globally or per-entity—so that only entities meeting the required confidence are returned.
+- Added **entity validation control** with the `disableEntityValidation` parameter, allowing users to bypass entity validation when needed.

 ## 1.0.0-beta.3 (2025-06-23)
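
As a reading aid for the confidence score thresholding entry in the CHANGELOG diff above, here is a minimal local sketch of how a global minimum with per-entity overrides could filter results. It is not the service's or the SDK's implementation: `DetectedEntity`, `ThresholdSketch`, the category labels, and the inclusive `>=` comparison are all assumptions made for illustration.

```C#
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical detections; the category labels are illustrative strings only.
var entities = new List<DetectedEntity>
{
    new("John Doe", "Person", 0.98),
    new("222-45-6789", "Ssn", 0.85),
    new("maybe-a-name", "Person", 0.20),
};

// Global default of 0.3, with a stricter per-entity override for the "Ssn" label.
var overrides = new Dictionary<string, double> { ["Ssn"] = 0.9 };
List<DetectedEntity> kept = ThresholdSketch.Filter(entities, 0.3, overrides);

// Only "John Doe" passes: 0.85 is below the 0.9 override and 0.20 is below the default.
foreach (DetectedEntity e in kept)
{
    Console.WriteLine($"{e.Category}: {e.Text}");
}

// Hypothetical local type; NOT the SDK's PiiEntity.
record DetectedEntity(string Text, string Category, double Score);

static class ThresholdSketch
{
    // Keep an entity when its score meets the override for its category,
    // or the global default when no override exists (assumed inclusive).
    public static List<DetectedEntity> Filter(
        IEnumerable<DetectedEntity> entities,
        double defaultThreshold,
        IReadOnlyDictionary<string, double> overrides)
    {
        return entities
            .Where(e => e.Score >= (overrides.TryGetValue(e.Category, out double t) ? t : defaultThreshold))
            .ToList();
    }
}
```

In the actual library the thresholds are sent with the request (see `ConfidenceScoreThreshold` in the Sample5 diff) and the filtering happens on the service side.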

sdk/cognitivelanguage/Azure.AI.Language.Text/README.md

Lines changed: 2 additions & 1 deletion
@@ -17,7 +17,7 @@ Text Analytics is part of the Azure Cognitive Service for Language, a cloud-base
 [Source code][source_root] | [Package (NuGet)][package]| [API reference documentation][text_refdocs] | [Product documentation][text_docs] | [Samples][source_samples]

 > [!NOTE]
-> Text Authoring is not supported in version 2.0.0-beta.1. If you use Text Authoring, please continue to use version 1.1.0. You can find the [samples][textauthoring_samples] here.
+> Text Authoring is not supported from version 2.0.0-beta.1. If you use Text Authoring, please use the separate Text Authoring SDK. You can find the [samples][textauthoring_samples] here.

 ## Getting started

@@ -31,6 +31,7 @@ dotnet add package Azure.AI.Language.Text --prerelease

 |SDK version |Supported API version of service
 |-------------|------------------------------------------------------------------------------------------------
+|1.0.0-beta.4 | 2022-05-01, 2023-04-01, 2024-11-01, 2025-05-15-preview, 2025-11-15-preview (default)
 |1.0.0-beta.3 | 2022-05-01, 2023-04-01, 2024-11-01, 2024-11-15-preview, 2025-05-15-preview (default)
 |1.0.0-beta.2 | 2022-05-01, 2023-04-01, 2024-11-01, 2024-11-15-preview (default)
 |1.0.0-beta.1 | 2022-05-01, 2023-04-01, 2023-11-15-preview (default)

sdk/cognitivelanguage/Azure.AI.Language.Text/api/Azure.AI.Language.Text.net8.0.cs

Lines changed: 109 additions & 12 deletions
Large diffs are not rendered by default.

sdk/cognitivelanguage/Azure.AI.Language.Text/api/Azure.AI.Language.Text.netstandard2.0.cs

Lines changed: 109 additions & 12 deletions
Large diffs are not rendered by default.

sdk/cognitivelanguage/Azure.AI.Language.Text/assets.json

Lines changed: 1 addition & 1 deletion
@@ -2,5 +2,5 @@
   "AssetsRepo": "Azure/azure-sdk-assets",
   "AssetsRepoPrefixPath": "net",
   "TagPrefix": "net/cognitivelanguage/Azure.AI.Language.Text",
-  "Tag": "net/cognitivelanguage/Azure.AI.Language.Text_66a454ef6f"
+  "Tag": "net/cognitivelanguage/Azure.AI.Language.Text_48cbf61c49"
 }

sdk/cognitivelanguage/Azure.AI.Language.Text/samples/Sample1_AnalyzeTextAsync_LanguageDetection.md

Lines changed: 3 additions & 3 deletions
@@ -17,7 +17,7 @@ The values of the `endpoint` and `apiKey` variables can be retrieved from enviro

 ## Detect the language of documents

-To detect the language of a document, call `AnalyzeText` on the `TextAnalysisClient`, which returns a `AnalyzeTextLanguageDetectionResult` object with the name of the language, a confidence score, and more.
+To detect the language of a document, call `AnalyzeTextAsync` on the `TextAnalysisClient`, which returns a `AnalyzeTextLanguageDetectionResult` object with the name of the language, a confidence score, and more.

 ```C# Snippet:Sample1_AnalyzeTextAsync_LanguageDetection
 string textA =
@@ -81,11 +81,11 @@ catch (RequestFailedException exception)
 }
 ```

-To detect the language of a document, call `AnalyzeText` on the `TextAnalysisClient`, which returns a `AnalyzeTextLanguageDetectionResult` object with the name of the language, a confidence score, and more.
+To detect the language of a document, call `AnalyzeTextAsync` on the `TextAnalysisClient`, which returns a `AnalyzeTextLanguageDetectionResult` object with the name of the language, a confidence score, and more.

 ## Detect the language of documents with country hints

-If the country where a document originates from is known, you can aid the language detection model if you call `AnalyzeText` on the `TextAnalysisClient` while passing the documents as an `IEnumerable<LanguageInput>` parameter, having set the `CountryHint` property on each `LanguageInput` object accordingly.
+If the country where a document originates from is known, you can aid the language detection model if you call `AnalyzeTextAsync` on the `TextAnalysisClient` while passing the documents as an `IEnumerable<LanguageInput>` parameter, having set the `CountryHint` property on each `LanguageInput` object accordingly.

 ```C# Snippet:Sample1_AnalyzeTextAsync_LanguageDetection_CountryHint
 string textA =

sdk/cognitivelanguage/Azure.AI.Language.Text/samples/Sample3_AnalyzeTextAsync_ExtractKeyPhrases.md

Lines changed: 1 addition & 1 deletion
@@ -17,7 +17,7 @@ The values of the `endpoint` and `apiKey` variables can be retrieved from enviro

 ## Extracting key phrases from multiple documents

-To extract key phrases from multiple documents, call `AnalyzeText` on an `AnalyzeTextInput`. The results are returned as a `AnalyzeTextKeyPhraseResult`.
+To extract key phrases from multiple documents, call `AnalyzeTextAsync` on an `AnalyzeTextInput`. The results are returned as a `AnalyzeTextKeyPhraseResult`.

 ```C# Snippet:Sample3_AnalyzeTextAsync_ExtractKeyPhrases
 string textA =

sdk/cognitivelanguage/Azure.AI.Language.Text/samples/Sample4_AnalyzeTextAsync_RecognizeEntities.md

Lines changed: 3 additions & 3 deletions
@@ -17,7 +17,7 @@ The values of the `endpoint` and `apiKey` variables can be retrieved from enviro

 ## Recognize entities in multiple documents

-To recognize entities in multiple documents, call `AnalyzeText` on the `TextAnalysisClient` by passing the documents as an `AnalyzeTextInput` parameter. This returns a `AnalyzeTextEntitiesResult`.
+To recognize entities in multiple documents, call `AnalyzeTextAsync` on the `TextAnalysisClient` by passing the documents as an `AnalyzeTextInput` parameter. This returns a `AnalyzeTextEntitiesResult`.

 ```C# Snippet:Sample4_AnalyzeTextAsync_RecognizeEntities
 string textA =
@@ -102,12 +102,12 @@ For more information on the changes in the preview API version, see [Preview API

 To create a new `TextAnalysisClient` with the preview API version, you will need the service endpoint and credentials of your Language resource with the `TextAnalysisClientOptions` pointing to the preview API Version.

-To recognize entities in multiple documents, call `AnalyzeText` on the `TextAnalysisClient` by passing the documents and actionContent as an `AnalyzeTextInput` parameter. This returns a `AnalyzeTextEntitiesResult`.
+To recognize entities in multiple documents, call `AnalyzeTextAsync` on the `TextAnalysisClient` by passing the documents and actionContent as an `AnalyzeTextInput` parameter. This returns a `AnalyzeTextEntitiesResult`.

 ```C# Snippet:Sample4_AnalyzeTextAsync_RecognizeEntities_Preview
 Uri endpoint = TestEnvironment.Endpoint;
 AzureKeyCredential credential = new(TestEnvironment.ApiKey);
-TextAnalysisClientOptions options = new TextAnalysisClientOptions(TextAnalysisClientOptions.ServiceVersion.V2024_11_15_Preview);
+TextAnalysisClientOptions options = new TextAnalysisClientOptions(TextAnalysisClientOptions.ServiceVersion.V2025_11_15_Preview);
 var client = new TextAnalysisClient(endpoint, credential, options);

 string textA =

sdk/cognitivelanguage/Azure.AI.Language.Text/samples/Sample5_AnalyzeTextAsync_RecognizePiiEntities.md

Lines changed: 162 additions & 3 deletions
@@ -17,7 +17,7 @@ The values of the `endpoint` and `apiKey` variables can be retrieved from enviro

 ## Recognizing Personally Identifiable Information in multiple documents

-To recognize Personally Identifiable Information in multiple documents, call `AnalyzeText` on an `TextPiiEntitiesRecognitionInput`. The results are returned as a `AnalyzeTextPiiResult`.
+To recognize Personally Identifiable Information in multiple documents, call `AnalyzeTextAsync` on an `TextPiiEntitiesRecognitionInput`. The results are returned as a `AnalyzeTextPiiResult`.

 ```C# Snippet:Sample5_AnalyzeTextAsync_RecognizePii
 string textA =
@@ -84,7 +84,7 @@ foreach (DocumentError analyzeTextDocumentError in piiTaskResult.Results.Errors)

 ## Recognizing Personally Identifiable Information in multiple documents with a redaction policy

-To recognize Personally Identifiable Information in multiple documents, call `AnalyzeText` on an `TextPiiEntitiesRecognitionInput`. The results are returned as a `AnalyzeTextPiiResult`.
+To recognize Personally Identifiable Information in multiple documents, call `AnalyzeTextAsync` on an `TextPiiEntitiesRecognitionInput`. The results are returned as a `AnalyzeTextPiiResult`.

 ```C# Snippet:Sample5_AnalyzeTextAsync_RecognizePii_RedactionPolicy
 string textA =
@@ -114,7 +114,15 @@ AnalyzeTextInput body = new TextPiiEntitiesRecognitionInput()
     {
         ModelVersion = "latest",
         // Available RedactionPolicies: EntityMaskPolicyType, CharacterMaskPolicyType, and NoMaskPolicyType
-        RedactionPolicy = new EntityMaskPolicyType()
+        RedactionPolicies =
+        {
+            new EntityMaskPolicyType
+            {
+                // defaultPolicy: use entity mask for everything unless overridden
+                PolicyName = "defaultPolicy",
+                IsDefault = true,
+            }
+        }
     }
 };

@@ -352,6 +360,157 @@ foreach (DocumentError error in result.Results.Errors)
 }
 ```

+## Recognizing Personally Identifiable Information with multiple redaction policies and with synthetic mask
+
+To recognize Personally Identifiable Information with multiple redaction policies, call `AnalyzeTextAsync` on an `TextPiiEntitiesRecognitionInput`. The results are returned as a `AnalyzeTextPiiResult`.
+
+```C# Snippet:Sample5_AnalyzeTextAsync_RecognizePii_RedactionPolicies
+string documentText = "My name is John Doe. My ssn is 123-45-6789. My email is [email protected]..";
+
+AnalyzeTextInput body = new TextPiiEntitiesRecognitionInput
+{
+    TextInput = new MultiLanguageTextInput
+    {
+        MultiLanguageInputs =
+        {
+            new MultiLanguageInput("A", documentText) { Language = "en" },
+            new MultiLanguageInput("B", documentText) { Language = "en" },
+        }
+    },
+    ActionContent = new PiiActionContent
+    {
+        PiiCategories = { PiiCategory.All },
+
+        RedactionPolicies =
+        {
+            new EntityMaskPolicyType
+            {
+                // defaultPolicy: use entity mask for everything unless overridden
+                PolicyName = "defaultPolicy",
+                IsDefault = true,
+            },
+            new CharacterMaskPolicyType
+            {
+                // customMaskForSSN: leave 4 characters of the SSN unmasked (from the start, since UnmaskFromEnd is false), mask the rest
+                PolicyName = "customMaskForSSN",
+                UnmaskLength = 4,
+                UnmaskFromEnd = false,
+                EntityTypes =
+                {
+                    PiiCategoriesExclude.UsSocialSecurityNumber
+                },
+            },
+            new SyntheticReplacementPolicyType
+            {
+                // syntheticMaskForPerson: generate synthetic values for Person and Email
+                PolicyName = "syntheticMaskForPerson",
+                EntityTypes =
+                {
+                    PiiCategoriesExclude.Person,
+                    PiiCategoriesExclude.Email
+                },
+            }
+        }
+    }
+};
+
+Response<AnalyzeTextResult> response = await client.AnalyzeTextAsync(body);
+AnalyzeTextPiiResult piiTaskResult = (AnalyzeTextPiiResult)response.Value;
+
+foreach (PiiActionResult piiResult in piiTaskResult.Results.Documents)
+{
+    Console.WriteLine($"Result for document with Id = \"{piiResult.Id}\":");
+    Console.WriteLine($" Redacted Text: \"{piiResult.RedactedText}\"");
+    Console.WriteLine($" Recognized {piiResult.Entities.Count} entities:");
+
+    foreach (PiiEntity entity in piiResult.Entities)
+    {
+        Console.WriteLine($" Text: {entity.Text}");
+        Console.WriteLine($" Offset: {entity.Offset}");
+        Console.WriteLine($" Length: {entity.Length}");
+        Console.WriteLine($" Category: {entity.Category}");
+        if (!string.IsNullOrEmpty(entity.Subcategory))
+        {
+            Console.WriteLine($" SubCategory: {entity.Subcategory}");
+        }
+        Console.WriteLine($" Confidence score: {entity.ConfidenceScore}");
+        Console.WriteLine();
+    }
+
+    Console.WriteLine();
+}
+
+foreach (DocumentError analyzeTextDocumentError in piiTaskResult.Results.Errors)
+{
+    Console.WriteLine($" Error on document {analyzeTextDocumentError.Id}!");
+    Console.WriteLine($" Document error code: {analyzeTextDocumentError.Error.Code}");
+    Console.WriteLine($" Message: {analyzeTextDocumentError.Error.Message}");
+    Console.WriteLine();
+}
+```
+
+## Recognizing Personally Identifiable Information with confidence score threshold
+
+To recognize Personally Identifiable Information with confidence score threshold, call `AnalyzeTextAsync` on an `TextPiiEntitiesRecognitionInput`. The results are returned as a `AnalyzeTextPiiResult`.
+
+```C# Snippet:Sample5_AnalyzeTextAsync_RecognizePii_ConfidenceScoreThreshold
+string text =
+    "My name is John Doe. My ssn is 222-45-6789. My email is [email protected]. John Doe is my name.";
+
+// Input documents
+MultiLanguageTextInput textInput = new MultiLanguageTextInput
+{
+    MultiLanguageInputs =
+    {
+        new MultiLanguageInput("1", text) { Language = "en" }
+    }
+};
+
+// Confidence score overrides:
+// default = 0.3
+// SSN & Email overridden to 0.9 (so they get filtered out as entities)
+ConfidenceScoreThreshold confidenceThreshold = new ConfidenceScoreThreshold(0.3f);
+confidenceThreshold.Overrides.Add(
+    new ConfidenceScoreThresholdOverride(
+        value: 0.9f,
+        entity: PiiCategory.UsSocialSecurityNumber.ToString()
+    ));
+confidenceThreshold.Overrides.Add(
+    new ConfidenceScoreThresholdOverride(
+        value: 0.9f,
+        entity: PiiCategory.Email.ToString()
+    ));
+
+PiiActionContent actionContent = new PiiActionContent
+{
+    PiiCategories = { PiiCategory.All },
+    DisableEntityValidation = true,
+    ConfidenceScoreThreshold = confidenceThreshold
+};
+
+AnalyzeTextInput body = new TextPiiEntitiesRecognitionInput
+{
+    TextInput = textInput,
+    ActionContent = actionContent
+};
+
+Response<AnalyzeTextResult> response = await client.AnalyzeTextAsync(body);
+AnalyzeTextPiiResult piiResult = (AnalyzeTextPiiResult)response.Value;
+
+PiiActionResult doc = piiResult.Results.Documents[0];
+
+Console.WriteLine($"Redacted text: \"{doc.RedactedText}\"");
+Console.WriteLine("Recognized entities (after confidence score filtering):");
+
+foreach (PiiEntity entity in doc.Entities)
+{
+    Console.WriteLine($" Text: {entity.Text}");
+    Console.WriteLine($" Category: {entity.Category}");
+    Console.WriteLine($" Confidence score: {entity.ConfidenceScore}");
+    Console.WriteLine();
+}
+```
+
 See the [README] of the Text Analytics client library for more information, including useful links and instructions.

 [DefaultAzureCredential]: https://github.com/Azure/azure-sdk-for-net/blob/main/sdk/identity/Azure.Identity/README.md
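
The `CharacterMaskPolicyType` entry in the Sample5 diff above sets `UnmaskLength` and `UnmaskFromEnd`. The following is a minimal local sketch of what those two settings might mean, assuming `UnmaskFromEnd` selects which end of the entity stays visible, as the property name suggests. `MaskSketch.Mask` is a hypothetical helper, and the `*` mask character and the masking of punctuation are also assumptions; the real redaction is performed by the service and may differ.

```C#
using System;

// The same SSN-like string masked from each direction, keeping 4 characters visible.
Console.WriteLine(MaskSketch.Mask("123-45-6789", 4, unmaskFromEnd: true));  // prints "*******6789"
Console.WriteLine(MaskSketch.Mask("123-45-6789", 4, unmaskFromEnd: false)); // prints "123-*******"

// Hypothetical helper for illustration; NOT part of Azure.AI.Language.Text.
static class MaskSketch
{
    public static string Mask(string text, int unmaskLength, bool unmaskFromEnd, char maskChar = '*')
    {
        // Never keep more characters than the text contains.
        int keep = Math.Clamp(unmaskLength, 0, text.Length);
        string masked = new string(maskChar, text.Length - keep);

        // Keep the visible characters at the end or at the start, mask everything else.
        return unmaskFromEnd
            ? masked + text.Substring(text.Length - keep)
            : text.Substring(0, keep) + masked;
    }
}
```

Under this reading, keeping the last four digits of an SSN visible would correspond to `UnmaskFromEnd = true`, while `false` keeps the leading characters.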
