Merged
22 changes: 12 additions & 10 deletions sdk/textanalytics/azure-ai-textanalytics/README.md
@@ -1,5 +1,5 @@
  # Azure Text Analytics client library for Java
- Text Analytics is a cloud-based service provides advanced natural language processing over raw text,
+ Text Analytics is a cloud-based service that provides advanced natural language processing over raw text,
  and includes six main functions:

  - Sentiment Analysis
@@ -187,7 +187,7 @@ The following sections provide several code snippets covering some of the most c
  * [Detect Language](#detect-language "Detect language")
  * [Extract Key Phrases](#extract-key-phrases "Extract key phrases")
  * [Recognize Entities](#recognize-entities "Recognize entities")
- * [Recognize Personally Identifiable Information Entities](#recognize-pii-entities "Recognize personally identifiable information entities")
+ * [Recognize Personally Identifiable Information Entities](#recognize-personally-identifiable-information-entities "Recognize Personally Identifiable Information entities")
  * [Recognize Linked Entities](#recognize-linked-entities "Recognize linked entities")

  ### Text Analytics Client
@@ -264,20 +264,21 @@ textAnalyticsClient.recognizeEntities(document).forEach(entity ->
  For samples on using the production recommended option `RecognizeEntitiesBatch` see [here][recognize_entities_sample].
  Please refer to the service documentation for a conceptual discussion of [named entity recognition][named_entity_recognition].

- ### Recognize personally identifiable information entities
- Run a predictive model to identify a collection of personally identifiable information entities in the passed-in
- document or batch of documents and categorize those entities into categories such as person, location, or
- organization. For more information on available categories, see [Text Analytics Named Entity Categories][named_entities_categories].
+ ### Recognize Personally Identifiable Information entities
+ Run a predictive model to identify a collection of Personally Identifiable Information(PII) entities in the passed-in
+ document. It recognizes and categorizes PII entities in its input text, such as
+ Social Security Numbers, bank account information, credit card numbers, and more. This endpoint is only available for
+ v3.1-preview.1 and up.

  <!-- embedme ./src/samples/java/com/azure/ai/textanalytics/ReadmeSamples.java#L158-L161 -->
  ```java
  String document = "My SSN is 555-55-5555";
- textAnalyticsClient.recognizePiiEntities(document).forEach(piiEntity ->
- System.out.printf("Recognized Personally Identifiable Information entity: %s, category: %s, subCategory: %s, score: %f.%n",
- piiEntity.getText(), piiEntity.getCategory(), piiEntity.getSubcategory(), piiEntity.getConfidenceScore()));
+ textAnalyticsClient.recognizePiiEntities(document).forEach(entity -> System.out.printf(
+ "Recognized Personally Identifiable Information entity: %s, entity category: %s, entity subcategory: %s, offset: %s, length: %s, confidence score: %f.%n",
+ entity.getText(), entity.getCategory(), entity.getSubcategory(), entity.getOffset(), entity.getLength(), entity.getConfidenceScore()));
  ```
  For samples on using the production recommended option `RecognizePiiEntitiesBatch` see [here][recognize_pii_entities_sample].
- Please refer to the service documentation for a conceptual discussion of [PII entity recognition][named_entity_recognition].
+ Please refer to the service documentation for [supported PII entity types][pii_entity_recognition].

  ### Recognize linked entities
  Run a predictive model to identify a collection of entities found in the passed-in document or batch of documents,
@@ -374,6 +375,7 @@ This project has adopted the [Microsoft Open Source Code of Conduct][coc]. For m
  [named_entity_recognition]: https://docs.microsoft.com/azure/cognitive-services/text-analytics/how-tos/text-analytics-how-to-entity-linking
  [named_entity_recognition_types]: https://docs.microsoft.com/azure/cognitive-services/text-analytics/named-entity-types?tabs=personal
  [named_entities_categories]: https://docs.microsoft.com/azure/cognitive-services/Text-Analytics/named-entity-types
+ [pii_entity_recognition]: https://docs.microsoft.com/en-us/azure/cognitive-services/text-analytics/named-entity-types?tabs=personal
  [package]: https://mvnrepository.com/artifact/com.azure/azure-ai-textanalytics
  [performance_tuning]: https://github.com/Azure/azure-sdk-for-java/wiki/Performance-Tuning
  [product_documentation]: https://docs.microsoft.com/azure/cognitive-services/text-analytics/overview
@@ -16,7 +16,6 @@
  import com.azure.ai.textanalytics.models.TextDocumentInput;
  import com.azure.ai.textanalytics.models.WarningCode;
  import com.azure.ai.textanalytics.util.RecognizePiiEntitiesResultCollection;
- import com.azure.core.exception.HttpResponseException;
  import com.azure.core.http.rest.Response;
  import com.azure.core.http.rest.SimpleResponse;
  import com.azure.core.util.Context;
@@ -31,7 +30,6 @@
  import java.util.stream.Collectors;

  import static com.azure.ai.textanalytics.TextAnalyticsAsyncClient.COGNITIVE_TRACING_NAMESPACE_VALUE;
- import static com.azure.ai.textanalytics.implementation.Utility.getEmptyErrorIdHttpResponse;
  import static com.azure.ai.textanalytics.implementation.Utility.inputDocumentsValidation;
  import static com.azure.ai.textanalytics.implementation.Utility.mapToHttpResponseExceptionIfExist;
  import static com.azure.ai.textanalytics.implementation.Utility.toBatchStatistics;
@@ -44,15 +42,15 @@
  import static com.azure.core.util.tracing.Tracer.AZ_TRACING_NAMESPACE_KEY;

  /**
- * Helper class for managing recognize personally identifiable information entity endpoint.
+ * Helper class for managing recognize Personally Identifiable Information entity endpoint.
  */
  class RecognizePiiEntityAsyncClient {
  private final ClientLogger logger = new ClientLogger(RecognizePiiEntityAsyncClient.class);
  private final TextAnalyticsClientImpl service;

  /**
  * Create a {@link RecognizePiiEntityAsyncClient} that sends requests to the Text Analytics services's
- * recognize personally identifiable information entity endpoint.
+ * recognize Personally Identifiable Information entity endpoint.
  *
  * @param service The proxy service used to perform REST calls.
  */
@@ -94,7 +92,7 @@ Mono<PiiEntityCollection> recognizePiiEntities(String document, String language)
  /**
  * Helper function for calling service with max overloaded parameters.
  *
- * @param documents The list of documents to recognize personally identifiable information entities for.
+ * @param documents The list of documents to recognize Personally Identifiable Information entities for.
  * @param options The {@link TextAnalyticsRequestOptions} request options.
  *
  * @return A mono {@link Response} that contains {@link RecognizePiiEntitiesResultCollection}.
@@ -112,7 +110,7 @@ Mono<Response<RecognizePiiEntitiesResultCollection>> recognizePiiEntitiesBatch(
  /**
  * Helper function for calling service with max overloaded parameters with {@link Context} is given.
  *
- * @param documents The list of documents to recognize personally identifiable information entities for.
+ * @param documents The list of documents to recognize Personally Identifiable Information entities for.
  * @param options The {@link TextAnalyticsRequestOptions} request options.
  * @param context Additional context that is passed through the Http pipeline during the service call.
  *
@@ -141,40 +139,32 @@ private Response<RecognizePiiEntitiesResultCollection> toRecognizePiiEntitiesRes
  final EntitiesResult entitiesResult = response.getValue();
  // List of documents results
  final List<RecognizePiiEntitiesResult> recognizeEntitiesResults = new ArrayList<>();
- entitiesResult.getDocuments().forEach(documentEntities ->
+ entitiesResult.getDocuments().forEach(documentEntities -> {
+ // Pii entities list
+ final List<PiiEntity> piiEntities = documentEntities.getEntities().stream().map(entity ->
+ new PiiEntity(entity.getText(), EntityCategory.fromString(entity.getCategory()),
+ entity.getSubcategory(), entity.getOffset(), entity.getLength(),
+ entity.getConfidenceScore()))
+ .collect(Collectors.toList());
+ // Warnings
+ final List<TextAnalyticsWarning> warnings = documentEntities.getWarnings().stream()
+ .map(warning -> {
+ final WarningCodeValue warningCodeValue = warning.getCode();
+ return new TextAnalyticsWarning(
+ WarningCode.fromString(warningCodeValue == null ? null : warningCodeValue.toString()),
+ warning.getMessage());
+ }).collect(Collectors.toList());
+
  recognizeEntitiesResults.add(new RecognizePiiEntitiesResult(
  documentEntities.getId(),
  documentEntities.getStatistics() == null ? null
  : toTextDocumentStatistics(documentEntities.getStatistics()),
  null,
- new PiiEntityCollection(
- new IterableStream<>(documentEntities.getEntities().stream().map(entity ->
- new PiiEntity(entity.getText(), EntityCategory.fromString(entity.getCategory()),
- entity.getSubcategory(), entity.getOffset(), entity.getLength(),
- entity.getConfidenceScore()))
- .collect(Collectors.toList())),
- new IterableStream<>(documentEntities.getWarnings().stream()
- .map(warning -> {
- final WarningCodeValue warningCodeValue = warning.getCode();
- return new TextAnalyticsWarning(
- WarningCode.fromString(warningCodeValue == null ? null : warningCodeValue.toString()),
- warning.getMessage());
- }).collect(Collectors.toList())))
- )));
+ new PiiEntityCollection(new IterableStream<>(piiEntities), new IterableStream<>(warnings))
+ ));
+ });
  // Document errors
  entitiesResult.getErrors().forEach(documentError -> {
- /*
- * TODO: Remove this after service update to throw exception.
- * Currently, service sets max limit of document size to 5, if the input documents size > 5, it will
- * have an id = "", empty id. In the future, they will remove this and throw HttpResponseException.
- */
- if (documentError.getId().isEmpty()) {
- throw logger.logExceptionAsError(
- new HttpResponseException(documentError.getError().getInnererror().getMessage(),
- getEmptyErrorIdHttpResponse(new SimpleResponse<>(response, response.getValue())),
- documentError.getError().getInnererror().getCode()));
- }
-
  recognizeEntitiesResults.add(
  new RecognizePiiEntitiesResult(documentError.getId(), null,
  toTextAnalyticsError(documentError.getError()), null));
@@ -189,7 +179,7 @@ private Response<RecognizePiiEntitiesResultCollection> toRecognizePiiEntitiesRes
  * Call the service with REST response, convert to a {@link Mono} of {@link Response} that contains
  * {@link RecognizePiiEntitiesResultCollection} from a {@link SimpleResponse} of {@link EntitiesResult}.
  *
- * @param documents The list of documents to recognize personally identifiable information entities for.
+ * @param documents The list of documents to recognize Personally Identifiable Information entities for.
  * @param options The {@link TextAnalyticsRequestOptions} request options.
  * @param context Additional context that is passed through the Http pipeline during the service call.
  *
@@ -201,14 +191,14 @@ private Mono<Response<RecognizePiiEntitiesResultCollection>> getRecognizePiiEnti
  new MultiLanguageBatchInput().setDocuments(toMultiLanguageInput(documents)),
  options == null ? null : options.getModelVersion(),
  options == null ? null : options.isIncludeStatistics(),
- options == null ? null : options.getDomain(),
+ null,
  context.addData(AZ_TRACING_NAMESPACE_KEY, COGNITIVE_TRACING_NAMESPACE_VALUE))
  .doOnSubscribe(ignoredValue -> logger.info("A batch of documents - {}", documents.toString()))
  .doOnSuccess(response ->
- logger.info("Recognized personally identifiable information entities for a batch of documents- {}",
+ logger.info("Recognized Personally Identifiable Information entities for a batch of documents- {}",
  response.getValue()))
  .doOnError(error ->
- logger.warning("Failed to recognize personally identifiable information entities - {}", error))
+ logger.warning("Failed to recognize Personally Identifiable Information entities - {}", error))
  .map(this::toRecognizePiiEntitiesResultCollectionResponse)
  .onErrorMap(throwable -> mapToHttpResponseExceptionIfExist(throwable));
  }
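
Reviewer note on the `toRecognizePiiEntitiesResultCollectionResponse` change in the Java file of this diff: the rewrite builds the entity list and the warning list into named locals before constructing the result, where the old code nested both stream pipelines inside the constructor call. The sketch below shows the same shape in isolation. It is only an illustration under stated assumptions: `PiiMappingSketch`, `RawWarning`, `RawDocument`, `DocumentResult`, and `convert` are hypothetical stand-ins invented here, not Azure SDK types, and the entity mapping is reduced to trimming strings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

// Illustrative only: every type below is a hypothetical stand-in, not an Azure SDK class.
final class PiiMappingSketch {

    /** Hypothetical service-side warning; the code may be null. */
    static final class RawWarning {
        final Object code;
        final String message;

        RawWarning(Object code, String message) {
            this.code = code;
            this.message = message;
        }
    }

    /** Hypothetical service-side per-document result. */
    static final class RawDocument {
        final String id;
        final List<String> entities;
        final List<RawWarning> warnings;

        RawDocument(String id, List<String> entities, List<RawWarning> warnings) {
            this.id = id;
            this.entities = entities;
            this.warnings = warnings;
        }
    }

    /** Hypothetical public-facing per-document result. */
    static final class DocumentResult {
        final String id;
        final List<String> entities;
        final List<String> warningCodes;

        DocumentResult(String id, List<String> entities, List<String> warningCodes) {
            this.id = id;
            this.entities = entities;
            this.warningCodes = warningCodes;
        }
    }

    static List<DocumentResult> convert(List<RawDocument> documents) {
        final List<DocumentResult> results = new ArrayList<>();
        documents.forEach(document -> {
            // Entities list: built first and given a name.
            final List<String> entities = document.entities.stream()
                .map(String::trim)
                .collect(Collectors.toList());
            // Warnings: a null code is passed through as null instead of calling toString() on it.
            final List<String> warningCodes = document.warnings.stream()
                .map(warning -> warning.code == null ? null : warning.code.toString())
                .collect(Collectors.toList());

            // The constructor call now only references the two named lists.
            results.add(new DocumentResult(document.id, entities, warningCodes));
        });
        return results;
    }

    public static void main(String[] args) {
        RawDocument document = new RawDocument("0",
            Arrays.asList(" 555-55-5555 "),
            Arrays.asList(new RawWarning(null, "hypothetical warning"),
                new RawWarning("SampleCode", "hypothetical warning")));
        for (DocumentResult result : convert(Collections.singletonList(document))) {
            System.out.printf("id: %s, entities: %s, warning codes: %s%n",
                result.id, result.entities, result.warningCodes);
        }
    }
}
```

Behavior is unchanged by this kind of refactor; the gain is that each intermediate list can be inspected or tested on its own, and the lambda needs a block body because it now holds several statements.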