Skip to content

Conversation

pamelafox
Copy link
Collaborator

Purpose

A developer ingested markdown files that used :::image inline images, and Prompty attempted to extract them and treat them as actual images. This PR escapes them, by turning the colons into HTML entities, which seems to avoid the Prompty auto extraction.
Also added a test based off their file.

Does this introduce a breaking change?

When developers merge from main and run the server, azd up, or azd deploy, will this produce an error?
If you're not sure, try it out on an old environment.

[ ] Yes
[X] No

Does this require changes to learn.microsoft.com docs?

This repository is referenced by this tutorial
which includes deployment, settings and usage instructions. If text or screenshot need to change in the tutorial,
check the box below and notify the tutorial author. A Microsoft employee can do this for you if you're an external contributor.

[ ] Yes
[X] No

Type of change

[X] Bugfix
[ ] Feature
[ ] Code style update (formatting, local variables)
[ ] Refactoring (no functional changes, no api changes)
[ ] Documentation content changes
[ ] Other... Please describe:

Code quality checklist

See CONTRIBUTING.md for more details.

  • The current tests all pass (python -m pytest).
  • I added tests that prove my fix is effective or that my feature works
  • I ran python -m pytest --cov to verify 100% coverage of added lines
  • I ran python -m mypy to check for type errors
  • I either used the pre-commit hooks or ran ruff and black manually on my code.

Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR fixes an issue where DocFX-style ::: image markdown directives were being incorrectly processed by Prompty's image extraction functionality. The fix sanitizes these directives by escaping triple colons to HTML entities.

  • Sanitizes DocFX image directives by converting triple colons to HTML entities (:::)
  • Refactors text cleaning logic to handle both newline normalization and directive escaping
  • Adds comprehensive test coverage for the sanitization functionality

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File Description
app/backend/approaches/approach.py Implements sanitization logic to escape triple colons in source content
tests/test_chatapproach.py Adds test to verify DocFX image directive sanitization works correctly

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

@pamelafox pamelafox requested a review from Copilot October 7, 2025 01:40
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.


Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

Comment on lines +376 to +378
s = s.replace("\n", " ").replace("\r", " ") # normalize newlines to spaces
s = s.replace(":::", ":::") # escape DocFX/markdown triple colons
return s
Copy link
Preview

Copilot AI Oct 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] The function modifies and reassigns the same variable 's' multiple times. Consider using a more functional approach by chaining the operations or using intermediate variables for better readability.

Suggested change
s = s.replace("\n", " ").replace("\r", " ") # normalize newlines to spaces
s = s.replace(":::", ":::") # escape DocFX/markdown triple colons
return s
normalized = s.replace("\n", " ").replace("\r", " ") # normalize newlines to spaces
escaped = normalized.replace(":::", ":::") # escape DocFX/markdown triple colons
return escaped

Copilot uses AI. Check for mistakes.

Comment on lines +264 to +276
image_directive = (
"activator-introduction.md#page=1: Intro text before image. "
':::image type="content" source="./media/activator-introduction/activator.png" '
'alt-text="Diagram that shows the architecture of Fabric Activator."::: More text after image.'
)

async def build_sources():
return await chat_approach.get_sources_content(
[
Document(
id="doc1",
content=image_directive.split(": ", 1)[1],
sourcepage="activator-introduction.md#page=1",
Copy link
Preview

Copilot AI Oct 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] The test string contains a magic value that mixes citation format with content. Consider extracting the citation prefix and content into separate variables to make the test structure clearer and more maintainable.

Suggested change
image_directive = (
"activator-introduction.md#page=1: Intro text before image. "
':::image type="content" source="./media/activator-introduction/activator.png" '
'alt-text="Diagram that shows the architecture of Fabric Activator."::: More text after image.'
)
async def build_sources():
return await chat_approach.get_sources_content(
[
Document(
id="doc1",
content=image_directive.split(": ", 1)[1],
sourcepage="activator-introduction.md#page=1",
citation_prefix = "activator-introduction.md#page=1"
image_content = (
"Intro text before image. "
':::image type="content" source="./media/activator-introduction/activator.png" '
'alt-text="Diagram that shows the architecture of Fabric Activator."::: More text after image.'
)
image_directive = f"{citation_prefix}: {image_content}"
async def build_sources():
return await chat_approach.get_sources_content(
[
Document(
id="doc1",
content=image_content,
sourcepage=citation_prefix,

Copilot uses AI. Check for mistakes.

@pamelafox pamelafox merged commit efe3c77 into Azure-Samples:main Oct 7, 2025
21 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants