Commit 5572d65

Merge pull request #34 from Azure-Samples/chienyuanchang/classify_samples
Add sample for classifier
2 parents fb251ac + 22f29e4 commit 5572d65

10 files changed, +792 -11 lines changed

README.md

Lines changed: 20 additions & 5 deletions
Original file line number | Diff line number | Diff line change
@@ -71,7 +71,7 @@ Once you click the link above, please follow the steps below to set up the Codes
7171
## Configure Azure AI service resource
7272
### (Option 1) Use `azd` commands to automatically create temporary resources to run the sample
7373
1. Make sure you have permission to grant roles under subscription
74-
1. Login Azure
74+
2. Login Azure
7575
```shell
7676
azd auth login
7777
```
@@ -80,12 +80,11 @@ Once you click the link above, please follow the steps below to set up the Codes
8080
azd auth login --use-device-code
8181
```
8282
83-
1. Setting up environment, following prompts to choose location
83+
3. Setting up environment, following prompts to choose location
8484
```shell
8585
azd up
8686
```
8787
88-
8988
### (Option 2) Manually create resources and set environment variables
9089
1. Create [Azure AI Services resource](docs/create_azure_ai_service.md)
9190
2. Go to `Access Control (IAM)` in resource, grant yourself role `Cognitive Services User`
@@ -97,6 +96,21 @@ Once you click the link above, please follow the steps below to set up the Codes
9796
azd auth login
9897
```
9998

99+
### (Option 3) Use Endpoint and Key (No `azd` Required)
100+
> ⚠️ Note: Using a subscription key works, but using a token provider with Azure Active Directory (AAD) is much safer and is highly recommended for production environments.
101+
1. Create [Azure AI Services resource](docs/create_azure_ai_service.md)
102+
2. Copy `notebooks/.env.sample` to `notebooks/.env`
103+
```bash
104+
cp notebooks/.env.sample notebooks/.env
105+
```
106+
3. Update `.env` with your credentials
107+
- Edit notebooks/.env and set the following values:
108+
```
109+
AZURE_AI_ENDPOINT=https://<your-resource-name>.services.ai.azure.com/
110+
AZURE_AI_API_KEY=<your-azure-ai-api-key>
111+
```
112+
- Replace <your-resource-name> and <your-azure-ai-api-key> with your actual values. You can find them in your AI Services resource under `Resource Management`/`Keys and Endpoint`
113+
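The Option 3 steps above can be sketched in Python as follows. This is only an illustration, not code from the repository: `parse_env_file` and `check_credentials` are hypothetical helpers written with the standard library, and the notebooks may load `notebooks/.env` differently (for example with a dotenv package). The sketch reads `KEY=VALUE` lines and reports settings that are missing or still contain the `<your-...>` placeholders.

```python
from pathlib import Path


def parse_env_file(path):
    """Parse simple KEY=VALUE lines (hypothetical helper, not part of the repo)."""
    values = {}
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and anything that is not an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def check_credentials(values):
    """Return the required settings that are missing or still placeholders."""
    required = ("AZURE_AI_ENDPOINT", "AZURE_AI_API_KEY")
    problems = []
    for name in required:
        value = values.get(name, "")
        # A leftover "<your-...>" placeholder contains an angle bracket.
        if not value or "<" in value:
            problems.append(name)
    return problems
```

Calling `check_credentials(parse_env_file("notebooks/.env"))` before running a notebook should return an empty list once both values have been filled in.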
100114
## Open a Jupyter notebook and follow the step-by-step guidance
101115

102116
Navigate to the `notebooks` directory and select the sample notebook you are interested in. Since the Dev Container (in Codespaces or in your local environment) is pre-configured with the necessary environment, you can directly execute each step in the notebook.
@@ -119,11 +133,12 @@ Azure AI Content Understanding is a new Generative AI-based [Azure AI service](h
119133
| File | Description |
120134
| --- | --- |
121135
| [content_extraction.ipynb](notebooks/content_extraction.ipynb) | In this sample we will show content understanding API can help you get semantic information from your file. For example OCR with table in document, audio transcription, and face analysis in video. |
122-
| [field_extraction.ipynb](notebooks/field_extraction.ipynb) | In this sample we will show how to create an analyzer to extract fields in your file. For example invoice amount in the document, how many people in an image, names mentioned in an audio, or summary of a video. You can customize the fields by creating your own analyzer template. |
136+
| [field_extraction.ipynb](notebooks/field_extraction.ipynb) | In this sample we will show how to create an analyzer to extract fields in your file. For example invoice amount in the document, how many people in an image, names mentioned in an audio, or summary of a video. You can customize the fields by creating your own analyzer template. |
137+
| [classifier.ipynb](notebooks/classifier.ipynb) | This sample will demo how to (1) create a classifier to categorize documents, (2) create a custom analyzer to extract specific fields, and (3) combine classifier and analyzers to classify, optionally split, and analyze documents in a flexible processing pipeline. |
123138
| [conversational_field_extraction.ipynb](notebooks/conversational_field_extraction.ipynb) | This sample shows you how to evaluate conversational audio data that has previously been transcribed with Content Understanding or Azure AI Speech in an efficient way to optimize processing quality. This also allows you to re-analyze data in a cost-efficient way. This sample is based on the [field_extraction.ipynb](notebooks/field_extraction.ipynb) sample. |
124139
| [analyzer_training.ipynb](notebooks/analyzer_training.ipynb) | If you want to further boost the performance for field extraction, we can do training when you provide a few labeled samples to the API. Note: This feature is currently available for the document scenario only. |
125140
| [management.ipynb](notebooks/management.ipynb) | This sample will demo how to create a minimal analyzer, list all the analyzers in your resource, and delete the analyzer you don't need. |
126-
| [build_person_directory.ipynb](notebooks/build_person_directory.ipynb) | This sample will demo how to enroll people’s faces from images and build a Person Directory. | |
141+
| [build_person_directory.ipynb](notebooks/build_person_directory.ipynb) | This sample will demo how to enroll people’s faces from images and build a Person Directory. |
127142
128143
## More Samples using Azure Content Understanding
129144
[Azure Search with Content Understanding](https://github.com/Azure-Samples/azure-ai-search-with-content-understanding-python)

data/mixed_financial_docs.pdf

131 Bytes
Binary file not shown.

notebooks/analyzer_training.ipynb

Lines changed: 11 additions & 1 deletion
@@ -60,7 +60,14 @@
6060
"metadata": {},
6161
"source": [
6262
"## Create Azure content understanding client\n",
63-
">The [AzureContentUnderstandingClient](../python/content_understanding_client.py) is utility Class which contain the functions to interact with the Content Understanding server. Before Content Understanding SDK release, we can regard it as a lightweight SDK. Fill the constant **AZURE_AI_ENDPOINT**, **AZURE_AI_API_VERSION**, **AZURE_AI_API_KEY** with the information from your Azure AI Service."
63+
"> The [AzureContentUnderstandingClient](../python/content_understanding_client.py) is utility Class which contain the functions to interact with the Content Understanding server. Before Content Understanding SDK release, we can regard it as a lightweight SDK. Fill the constant **AZURE_AI_ENDPOINT**, **AZURE_AI_API_VERSION**, **AZURE_AI_API_KEY** with the information from your Azure AI Service.\n",
64+
"\n",
65+
"> ⚠️ Important:\n",
66+
"You must update the code below to match your Azure authentication method.\n",
67+
"Look for the `# IMPORTANT` comments and modify those sections accordingly.\n",
68+
"If you skip this step, the sample may not run correctly.\n",
69+
"\n",
70+
"> ⚠️ Note: Using a subscription key works, but using a token provider with Azure Active Directory (AAD) is much safer and is highly recommended for production environments."
6471
]
6572
},
6673
{
@@ -91,7 +98,10 @@
9198
"client = AzureContentUnderstandingClient(\n",
9299
" endpoint=os.getenv(\"AZURE_AI_ENDPOINT\"),\n",
93100
" api_version=os.getenv(\"AZURE_AI_API_VERSION\", \"2025-05-01-preview\"),\n",
101+
" # IMPORTANT: Comment out token_provider if using subscription key\n",
94102
" token_provider=token_provider,\n",
103+
" # IMPORTANT: Uncomment this if using subscription key\n",
104+
" # subscription_key=os.getenv(\"AZURE_AI_API_KEY\"),\n",
95105
" x_ms_useragent=\"azure-ai-content-understanding-python/analyzer_training\", # This header is used for sample usage telemetry, please comment out this line if you want to opt out.\n",
96106
")"
97107
]
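Instead of commenting and uncommenting the `# IMPORTANT` lines by hand, the choice between `token_provider` and `subscription_key` could be made at runtime. The following is a hedged sketch, not part of this commit: `build_auth_kwargs` and `default_token_provider` are hypothetical names, the only constructor keywords assumed are the `token_provider` and `subscription_key` arguments shown in the diff, and the token scope `https://cognitiveservices.azure.com/.default` is an assumption about the resource being called.

```python
import os


def default_token_provider():
    """Build an Entra ID (AAD) bearer-token provider; requires azure-identity."""
    # Imported lazily so the key-based path works without azure-identity installed.
    from azure.identity import DefaultAzureCredential, get_bearer_token_provider

    return get_bearer_token_provider(
        DefaultAzureCredential(),
        "https://cognitiveservices.azure.com/.default",  # assumed scope
    )


def build_auth_kwargs(env=None, token_provider_factory=default_token_provider):
    """Return exactly one auth keyword argument for the client constructor."""
    env = os.environ if env is None else env
    key = env.get("AZURE_AI_API_KEY")
    if key:
        # A subscription key is set, so use key-based authentication.
        return {"subscription_key": key}
    # Otherwise fall back to token-based authentication.
    return {"token_provider": token_provider_factory()}
```

With this helper, the client call in the cell above could pass `**build_auth_kwargs()` in place of the two auth arguments, so that setting `AZURE_AI_API_KEY` in `.env` selects key-based authentication and leaving it unset selects the token provider.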

notebooks/build_person_directory.ipynb

Lines changed: 11 additions & 1 deletion
@@ -28,7 +28,14 @@
2828
"metadata": {},
2929
"source": [
3030
"## Create Azure content understanding face client\n",
31-
">The [AzureContentUnderstandingFaceClient](../python/content_understanding_face_client.py) is a utility class for interacting with the Content Understanding Face service. Before the official SDK is released, this acts as a lightweight SDK. Set the constants **AZURE_AI_ENDPOINT**, **AZURE_AI_API_VERSION**, and **AZURE_AI_API_KEY** with your Azure AI Service information."
31+
"> The [AzureContentUnderstandingFaceClient](../python/content_understanding_face_client.py) is a utility class for interacting with the Content Understanding Face service. Before the official SDK is released, this acts as a lightweight SDK. Set the constants **AZURE_AI_ENDPOINT**, **AZURE_AI_API_VERSION**, and **AZURE_AI_API_KEY** with your Azure AI Service information.\n",
32+
"\n",
33+
"> ⚠️ Important:\n",
34+
"You must update the code below to match your Azure authentication method.\n",
35+
"Look for the `# IMPORTANT` comments and modify those sections accordingly.\n",
36+
"If you skip this step, the sample may not run correctly.\n",
37+
"\n",
38+
"> ⚠️ Note: Using a subscription key works, but using a token provider with Azure Active Directory (AAD) is much safer and is highly recommended for production environments."
3239
]
3340
},
3441
{
@@ -59,7 +66,10 @@
5966
"client = AzureContentUnderstandingFaceClient(\n",
6067
" endpoint=os.getenv(\"AZURE_AI_ENDPOINT\"),\n",
6168
" api_version=os.getenv(\"AZURE_AI_API_VERSION\", \"2025-05-01-preview\"),\n",
69+
" # IMPORTANT: Comment out token_provider if using subscription key\n",
6270
" token_provider=token_provider,\n",
71+
" # IMPORTANT: Uncomment this if using subscription key\n",
72+
" # subscription_key=os.getenv(\"AZURE_AI_API_KEY\"),\n",
6373
" x_ms_useragent=\"azure-ai-content-understanding-python/build_person_directory\", # This header is used for sample usage telemetry, please comment out this line if you want to opt out.\n",
6474
")"
6575
]
