From 78284f5805d06054d3c77d21a4d06e977224c230 Mon Sep 17 00:00:00 2001 From: Chien Yuan Chang Date: Tue, 29 Jul 2025 11:19:23 -0700 Subject: [PATCH] docs: review notebooks/conversational_field_extraction.ipynb --- .../conversational_field_extraction.ipynb | 40 ++++++++++--------- 1 file changed, 21 insertions(+), 19 deletions(-) diff --git a/notebooks/conversational_field_extraction.ipynb b/notebooks/conversational_field_extraction.ipynb index a0880e4..a426b0a 100644 --- a/notebooks/conversational_field_extraction.ipynb +++ b/notebooks/conversational_field_extraction.ipynb @@ -4,14 +4,14 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "# Extract Custom Fields from Your Pretranscribed File" + "# Extract Custom Fields from Your Pre-transcribed File" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "This notebook demonstrates how to use analyzers to extract custom fields from your transcription input files." + "This notebook demonstrates how to use analyzers to extract custom fields from your pre-transcribed input files." ] }, { @@ -19,7 +19,7 @@ "metadata": {}, "source": [ "## Prerequisites\n", - "1. Ensure Azure AI service is configured following [steps](../README.md#configure-azure-ai-service-resource)\n", + "1. Ensure your Azure AI service is configured by following the [configuration steps](../README.md#configure-azure-ai-service-resource).\n", "2. Install the required packages to run the sample." ] }, @@ -45,7 +45,7 @@ "source": [ "Below is a collection of analyzer templates designed to extract fields from various input file types.\n", "\n", - "These templates are highly customizable, allowing you to modify them to suit your specific needs. For additional verified templates from Microsoft, please visit [here](../analyzer_templates/README.md)." + "These templates are highly customizable, allowing you to adapt them to your specific requirements. 
For additional verified templates provided by Microsoft, please visit [here](../analyzer_templates/README.md)." ] }, { @@ -65,7 +65,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Specify the analyzer template you want to use and provide a name for the analyzer to be created based on the template." + "Specify the analyzer template to use and assign a unique name for the analyzer that will be created from the template." ] }, { @@ -88,14 +88,16 @@ "source": [ "## Create Azure AI Content Understanding Client\n", "\n", - "> The [AzureContentUnderstandingClient](../python/content_understanding_client.py) is a utility class containing functions to interact with the Content Understanding API. Before the official release of the Content Understanding SDK, it can be regarded as a lightweight SDK. Fill the constant **AZURE_AI_ENDPOINT**, **AZURE_AI_API_VERSION**, **AZURE_AI_API_KEY** with the information from your Azure AI Service.\n", + "> The [AzureContentUnderstandingClient](../python/content_understanding_client.py) is a utility class providing functions to interact with the Content Understanding API. Before the official release of the Content Understanding SDK, this class can be considered a lightweight SDK.\n", + "\n", + "> Fill in the constants **AZURE_AI_ENDPOINT**, **AZURE_AI_API_VERSION**, and **AZURE_AI_API_KEY** with your Azure AI Service credentials.\n", "\n", "> ⚠️ Important:\n", - "You must update the code below to match your Azure authentication method.\n", + "Make sure to update the code below to match your chosen Azure authentication method.\n", "Look for the `# IMPORTANT` comments and modify those sections accordingly.\n", - "If you skip this step, the sample may not run correctly.\n", + "Skipping this step may prevent the sample from running correctly.\n", "\n", - "> ⚠️ Note: Using a subscription key works, but using a token provider with Azure Active Directory (AAD) is much safer and is highly recommended for production environments." 
+ "> ⚠️ Note: While subscription key authentication works, it is strongly recommended to use a token provider with Azure Active Directory (AAD) for improved security in production environments." ] }, { @@ -115,13 +117,13 @@ "load_dotenv(find_dotenv())\n", "logging.basicConfig(level=logging.INFO)\n", "\n", - "# For authentication, you can use either token-based auth or subscription key, and only one of them is required\n", + "# For authentication, you may use either token-based auth or a subscription key; only one is required.\n", "AZURE_AI_ENDPOINT = os.getenv(\"AZURE_AI_ENDPOINT\")\n", - "# IMPORTANT: Replace with your actual subscription key or set up in \".env\" file if not using token auth\n", + "# IMPORTANT: Replace with your actual subscription key or configure it in the \".env\" file if not using token authentication.\n", "AZURE_AI_API_KEY = os.getenv(\"AZURE_AI_API_KEY\")\n", "AZURE_AI_API_VERSION = os.getenv(\"AZURE_AI_API_VERSION\", \"2025-05-01-preview\")\n", "\n", - "# Add the parent directory to the path to use shared modules\n", + "# Add the parent directory to the system path to access shared modules\n", "parent_dir = Path(Path.cwd()).parent\n", "sys.path.append(str(parent_dir))\n", "from python.content_understanding_client import AzureContentUnderstandingClient\n", @@ -134,9 +136,9 @@ " api_version=AZURE_AI_API_VERSION,\n", " # IMPORTANT: Comment out token_provider if using subscription key\n", " token_provider=token_provider,\n", - " # IMPORTANT: Uncomment this if using subscription key\n", + " # IMPORTANT: Uncomment the following line if using subscription key\n", " # subscription_key=AZURE_AI_API_KEY,\n", - " # x_ms_useragent=\"azure-ai-content-understanding-python/field_extraction\", # This header is used for sample usage telemetry, please comment out this line if you want to opt out.\n", + " # x_ms_useragent=\"azure-ai-content-understanding-python/field_extraction\", # This header is used for sample usage telemetry. 
Comment out if you want to opt out.\n", ")" ] }, @@ -170,7 +172,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "After the analyzer is successfully created, we can use it to analyze our input files." + "Once the analyzer is successfully created, you can use it to analyze your input files." ] }, { @@ -181,14 +183,14 @@ "source": [ "from python.extension.transcripts_processor import TranscriptsProcessor\n", "\n", - "test_file_path=analyzer_sample_file_path\n", + "test_file_path = analyzer_sample_file_path\n", "\n", "transcripts_processor = TranscriptsProcessor()\n", "webvtt_output, webvtt_output_file_path = transcripts_processor.convert_file(test_file_path)\n", "\n", "if \"WEBVTT\" not in webvtt_output:\n", " print(\"Error: The output is not in WebVTT format.\")\n", - "else: \n", + "else:\n", " response = client.begin_analyze(CUSTOM_ANALYZER_ID, file_location=webvtt_output_file_path)\n", " print(\"Response:\", response)\n", " result_json = client.poll_result(response)\n", @@ -201,7 +203,7 @@ "metadata": {}, "source": [ "## Clean Up\n", - "Optionally, delete the sample analyzer from your resource. In typical usage scenarios, you would analyze multiple files using the same analyzer." + "Optionally, delete the sample analyzer from your Azure resource. In typical usage scenarios, you would analyze multiple files using the same analyzer." ] }, { @@ -235,4 +237,4 @@ }, "nbformat": 4, "nbformat_minor": 2 -} +} \ No newline at end of file
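
Review note on the analysis cell in this patch: the guard `"WEBVTT" not in webvtt_output` accepts any text that contains the substring anywhere, whereas a WebVTT file begins with the `WEBVTT` signature (optionally preceded by a byte order mark). A stricter check could look like the sketch below. The helper name `looks_like_webvtt` is hypothetical and is not part of this repository or of `TranscriptsProcessor`; it assumes the converter returns the transcript as a `str`.

```python
def looks_like_webvtt(text: str) -> bool:
    """Return True if `text` starts with a WebVTT signature line.

    Hypothetical helper for illustration only; not part of the sample repo.
    """
    # Drop an optional byte order mark before checking the signature.
    if text.startswith("\ufeff"):
        text = text[1:]
    if not text.startswith("WEBVTT"):
        return False
    # The signature must be followed by end of input, a space, a tab,
    # or a line terminator; "WEBVTTX" is therefore rejected.
    rest = text[len("WEBVTT"):]
    return rest == "" or rest[0] in " \t\r\n"


if __name__ == "__main__":
    sample = "WEBVTT\n\n00:00.000 --> 00:02.000\nHello, thanks for calling.\n"
    print(looks_like_webvtt(sample))
    print(looks_like_webvtt("This transcript mentions WEBVTT in passing."))
```

In the notebook cell, the condition could then read `if not looks_like_webvtt(webvtt_output):`, leaving the rest of the branch unchanged.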