Merge pull request #5090 from bojunehsu/paulhsu/CU-faceSample

JamesJBarnett · web-flow · commit b887725888f0 · 2025-05-21T16:19:19.000-07:00
Content Understanding - Update face sample url
diff --git a/articles/ai-services/content-understanding/document/elements.md b/articles/ai-services/content-understanding/document/elements.md
@@ -51,14 +51,14 @@ Content Understanding generates richly formatted markdown that preserves the ori
 
 #### Words
 
-A `word` is a content element composed of a sequence of characters. Content Understanding uses word boundaries defined by [Unicode Standard Annex #29](https://www.unicode.org/reports/tr29/#Word_Boundaries). For Latin languages, words may be split from punctuation even without intervening spaces. In some language, such as Chinese, supplemental word dictionaries are used to enable word breaking at semantic boundaries. For more information, *see* [Boundary Analysis](https://unicode-org.github.io/icu/userguide/boundaryanalysis/).
+A `word` is a content element composed of a sequence of characters. Content Understanding uses word boundaries defined by [Unicode Standard Annex #29](https://www.unicode.org/reports/tr29/#Word_Boundaries). For Latin languages, words might be split from punctuation even without intervening spaces. In some languages, such as Chinese, supplemental word dictionaries are used to enable word breaking at semantic boundaries. For more information, *see* [Boundary Analysis](https://unicode-org.github.io/icu/userguide/boundaryanalysis/).
 
 
 :::image type="content" source="../media/document/word-boundaries.png" alt-text="Screenshot of detected words.":::
 
 #### Selection marks
 
-A `selection mark` is a content element that represents a visual glyph indicating the state of a selection. They may be represented as check boxes, check marks, radio buttons, etc. The state of a selection mark can be selected or unselected, with different visual representation to indicate the state. They're encoded as words in the document analysis result using `☒` (selected) and `☐` (unselected).
+A `selection mark` is a content element that represents a visual glyph indicating the state of a selection. Selection marks might appear in the document as check boxes, check marks, radio buttons, etc. The state of a selection mark can be selected or unselected, with different visual representations to indicate the state. They're encoded as words in the document analysis result using `☒` (selected) and `☐` (unselected).
 
 Content Understanding detects check marks inside table cell as selection marks in the selected state. However, it doesn't detect empty table cells as selection marks in the unselected state.
 
@@ -85,7 +85,7 @@ A `barcode` is a content element that describes both linear (ex. UPC, EAN) and 2
 
 #### Formulas
 
-A `formula` is a content element representing mathematical expressions in the document. It may be an `inline` formula embedded with other text, or an `display` formula that takes up an entire line. Multiline formulas are represented as multiple `display` formula elements grouped into `paragraphs` to preserve mathematical relationships.
+A `formula` is a content element representing mathematical expressions in the document. It might be an `inline` formula embedded with other text, or a `display` formula that takes up an entire line. Multiline formulas are represented as multiple `display` formula elements grouped into `paragraphs` to preserve mathematical relationships.
 
 #### Images
 
@@ -97,22 +97,22 @@ Document layout elements are visual and structural components, such as pages, ta
 
 #### Pages
 
-A `page` is a grouping of content that typically corresponds to one side of a sheet of paper. A rendered page is characterized via `width` and `height` in the specified `unit`. In general, images use pixel while PDFs use inch. The `angle` property describes the overall text angle in degrees for pages that may be rotated.
+A `page` is a grouping of content that typically corresponds to one side of a sheet of paper. A rendered page is characterized via `width` and `height` in the specified `unit`. In general, images use pixels while PDFs use inches. The `angle` property describes the overall text angle in degrees for pages that might be rotated.
 
 > [!NOTE]
 > For spreadsheets like Excel, each sheet is mapped to a page. For presentations, like PowerPoint, each slide is mapped to a page. For file formats like HTML or Word documents, which lack a native page concept without rendering, the entire main content is treated as a single page.
 
 #### Paragraphs
 
-A `paragraph` is an ordered sequence of lines that form a logical unit. Typically, the lines share common alignment and spacing between lines. Paragraphs are often delimited via indentation, added spacing, or bullets/numbering. Some paragraphs may have special functional `role` in the document. Currently supported roles include page header, page footer, page number, title, section heading, footnote, and formula block.
+A `paragraph` is an ordered sequence of lines that form a logical unit. Typically, the lines share common alignment and spacing between lines. Paragraphs are often delimited via indentation, added spacing, or bullets/numbering. Some paragraphs have a special functional `role` in the document. Currently supported roles include page header, page footer, page number, title, section heading, footnote, and formula block.
 
 #### Lines
 
 A `line` is an ordered sequence of consecutive content elements, often separated by visual spaces. Content elements in the same horizontal plane (row) but separated by more than a single visual space are most often split into multiple lines. While this feature sometimes splits semantically contiguous content into separate lines, it enables the representation of textual content split into multiple columns or cells. Lines in vertical writing are detected in the vertical direction.
 
 #### Tables
 
-A `table` organizes content into a group of cells in a grid layout. The rows and columns may be visually separated by grid lines, color banding, or greater spacing. The position of a table cell is specified via its row and column indices. A cell can span across multiple rows and columns.
+A `table` organizes content into a group of cells in a grid layout. The rows and columns might be visually separated by grid lines, color banding, or greater spacing. The position of a table cell is specified via its row and column indices. A cell can span across multiple rows and columns.
 
 Based on its position and styling, a cell can be classified as general content, row header, column header, stub head, or description:
 
@@ -128,7 +128,7 @@ Based on its position and styling, a cell can be classified as general content,
 
 A table caption specifies content that explains the table. A table can further have a set of footnotes. Unlike a description cell, a caption typically lies outside the grid layout. Table footnotes annotate content inside the table, often marked with footnote symbols. They're often found below the table grid.
 
-A table may span across consecutive pages of a document. In this situation, table continuations in subsequent pages generally maintain the same column count, width, and styling. They often repeat the column headers. Other than page headers, footers, and page numbers, there's generally no intervening content between the initial table and its continuations.
+A table might span across consecutive pages of a document. In this situation, table continuations in subsequent pages generally maintain the same column count, width, and styling. They often repeat the column headers. Other than page headers, footers, and page numbers, there's generally no intervening content between the initial table and its continuations.
 
 > [!NOTE]
 > The span for tables covers only the core content and exclude associated caption and footnotes.
@@ -137,7 +137,7 @@ A table may span across consecutive pages of a document. In this situation, tabl
 
 #### Sections
 
-A `section` is a logical grouping of related content elements that form a hierarchical structure within the document. It often starts with a section heading as the first paragraph. A section may contain subsections, creating a nested document structure that preserves semantic relationships.
+A `section` is a logical grouping of related content elements that form a hierarchical structure within the document. It often starts with a section heading as the first paragraph. A section might contain subsections, creating a nested document structure that preserves semantic relationships.
 
 ### Element properties
 
@@ -149,26 +149,16 @@ The `span` property specifies the logical position of the element in the documen
 
 #### Source
 
-The `source` property describes the visual position of the element in the file using an encoded string. For documents, the source string may be in one of the following formats:
+The `source` property describes the visual position of the element in the file using an encoded string. For documents, the source string can be in one of the following formats:
 * Bounding polygon: `D({pageNumber},{x1},{y1},{x2},{y2},{x3},{y3},{x4},{y4})`
 * Axis-aligned bounding box: `D({pageNumber},{left},{top},{width},{height})`
 
-Page numbers are `1-indexed`. The bounding polygon describes a sequence of points, clockwise from the left relative to the natural orientation of the element. For quadrilaterals, the points represent the top-left, top-right, bottom-right, and bottom-left corners. Each point represents the **x**, **y** coordinate in the length unit specified by the `unit` property. In general, the unit of measure for images is pixels while PDFs use inches.
+Page numbers are 1-indexed. The bounding polygon describes a sequence of points, clockwise from the left relative to the natural orientation of the element. For quadrilaterals, the points represent the top-left, top-right, bottom-right, and bottom-left corners. Each point represents the **x**, **y** coordinate in the length unit specified by the `unit` property. In general, the unit of measure for images is pixels while PDFs use inches.
 
 :::image type="content" source="../media/document/bounding-regions.png" alt-text="Screenshot of detected bounding regions.":::
 
 > [!NOTE]
-> Currently, Content Understanding only returns `4-point` quadrilaterals as bounding polygons. Future versions may return different number of points to describe more complex shapes, such as curved lines or nonrectangular images. Currently, source is only returned for elements from rendered files (pdf/image).
-
-## Supported content and layout elements
-
-Different file formats support different subsets of content and layout elements. The following table lists the currently supported elements for each file type.
-
-|Document type|Supported format|
-|-----|-----|
-|**Portable Document Format**|`.pdf`|
-|**Image**|`.jpeg/.jpg`, `.png`, `.bmp`, `.tiff`, `.heif`|
-|**Microsoft Office**|`.docx`, `.pptx`, `.xls`|
+> Currently, Content Understanding only returns 4-point quadrilaterals as bounding polygons. Future versions might return a different number of points to describe more complex shapes, such as curved lines or nonrectangular images. Currently, source is only returned for elements from rendered files (PDF/image).
 
 ## Next steps
 
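Reviewer note on the `source` encoding documented in elements.md above: the two string formats can be decoded with a small parser. The following is a minimal sketch, not part of this PR or of any SDK. It assumes a single region per string, in exactly the two formats listed; `DocumentSource` and `parse_document_source` are hypothetical names chosen for illustration.

```python
import re
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class DocumentSource:
    """Hypothetical container for one decoded `source` region."""
    page_number: int                    # 1-indexed, as stated in the article
    points: List[Tuple[float, float]]   # corners, clockwise from top-left


# Matches "D(...)" and captures the comma-separated payload.
_SOURCE_RE = re.compile(r"^D\(([^)]*)\)$")


def parse_document_source(source: str) -> DocumentSource:
    """Decode a bounding polygon or an axis-aligned bounding box string."""
    match = _SOURCE_RE.match(source.strip())
    if match is None:
        raise ValueError(f"Not a document source string: {source!r}")

    parts = [p.strip() for p in match.group(1).split(",")]
    page_number = int(parts[0])
    values = [float(v) for v in parts[1:]]

    if len(values) == 4:
        # Axis-aligned bounding box: left, top, width, height.
        left, top, width, height = values
        points = [
            (left, top),                    # top-left
            (left + width, top),            # top-right
            (left + width, top + height),   # bottom-right
            (left, top + height),           # bottom-left
        ]
    elif len(values) >= 6 and len(values) % 2 == 0:
        # Bounding polygon: x1, y1, x2, y2, ... (currently always 4 points).
        points = list(zip(values[0::2], values[1::2]))
    else:
        raise ValueError(f"Unexpected coordinate count in {source!r}")

    return DocumentSource(page_number=page_number, points=points)


polygon = parse_document_source("D(1,0.5,1.0,2.5,1.0,2.5,1.5,0.5,1.5)")
box = parse_document_source("D(2,10,20,30,40)")
```

Normalizing the bounding box into four corner points lets downstream code treat both formats the same way. The coordinates stay in the page's `unit` (pixels for images, inches for PDFs), so any conversion is left to the caller.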
diff --git a/articles/ai-services/content-understanding/face/overview.md b/articles/ai-services/content-understanding/face/overview.md
@@ -92,4 +92,4 @@ Azure AI Content Understanding adheres to Microsoft's strict policies on custome
 ## Next steps
 
 * Learn how to build a [**person directory**](../tutorial/build-person-directory.md).
-* Review code sample: [**person directory**](https://github.com/Azure-Samples/azure-ai-content-understanding-python/blob/zhizho/face/notebooks/build_person_directory.ipynb).
+* Review code sample: [**person directory**](https://github.com/Azure-Samples/azure-ai-content-understanding-python/blob/main/notebooks/build_person_directory.ipynb).
diff --git a/articles/ai-services/content-understanding/language-region-support.md b/articles/ai-services/content-understanding/language-region-support.md
@@ -27,6 +27,10 @@ To use Azure AI Content Understanding, create your Azure AI Service resource in
 
 † Australia East doesn't support data zone as a processing location.
 
+> [!NOTE]
+>
+> [Pro mode](concepts/standard-pro-modes.md) currently supports only data zone and global as processing locations.
+
 ## Language support
 
 Azure AI Content Understanding enables you to process data in multiple languages simultaneously. Our language support capabilities enable users to communicate with your applications in natural ways and empower global outreach.
diff --git a/articles/ai-services/content-understanding/video/overview.md b/articles/ai-services/content-understanding/video/overview.md
@@ -54,7 +54,7 @@ With the prebuilt video analyzer (prebuilt-videoAnalyzer), you can upload a vide
 
 * For example, creating the base `prebuilt-videoAnalyzer` as follows:
 
-  ```jsonc
+  ```json
   {
     "config": {},
     "BaseAnalyzerId": "prebuilt-videoAnalyzer",
@@ -154,8 +154,7 @@ Shape the output to match your business vocabulary. Use a `fieldSchema` object w
 
 **Example:**
 
-```jsonc
-
+```json
 "fieldSchema": {
   "description": "Extract brand presence and sentiment per scene",
   "fields": {
@@ -207,7 +206,7 @@ Content Understanding offers three ways to slice a video, letting you get the ou
   **Example:**
     * Break a news broadcast up into stories.
 
-    ```jsonc
+    ```json
     {
       "segmentationMode": "custom",
       "segmentationDefinition": "news broadcasts divided by individual stories"