Attempted conversion of markdown shows HTML-style comments and links get broken #96

@wnm3

I'm attaching a markdown file that I converted by reading it line by line and calling the reshaper's reshape method on each line to switch from ltr to rtl. Lines like the ones below get broken:

<!-- <map "id="FPMap2" "> -->
<!-- </map "id="FPMap2" "> -->

![logos.jpg](https://www.rewity.com/forum/rewity/images/logos.jpg)

become:

<!-- <map "id="FPMap2" "> --
<!-- </map "id="FPMap2" "> --

![logos.jpg](https://www.rewity.com/forum/rewity/images/logos.jpg

It also didn't reverse the order of the table cells (fenced using pipe characters). The comparison below shows the result using arabic-reshaper on the left and, on the right, the result of simply surrounding the ltr markdown content with an opening and closing HTML tag pair:
image
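
If the cell order has to be flipped as a separate step (an assumption; nothing in this report confirms whether arabic-reshaper is meant to do it), a rough sketch could look like the following. `reverse_table_row` is a hypothetical helper, not part of any library, and it assumes simple pipe-delimited rows with no escaped pipes or pipes inside code spans.

```python
def reverse_table_row(row: str) -> str:
    """Reverse the cell order of one pipe-delimited markdown table row.

    Hypothetical helper for illustration only. Lines that do not start
    and end with a pipe are returned unchanged.
    """
    stripped = row.strip()
    is_table_row = (
        len(stripped) >= 2
        and stripped.startswith("|")
        and stripped.endswith("|")
    )
    if not is_table_row:
        return row
    # Drop the outer pipes, split into cells, and rebuild in reverse order.
    cells = stripped[1:-1].split("|")
    return "|" + "|".join(reversed(cells)) + "|"


print(reverse_table_row("| a | b | c |"))
print(reverse_table_row("|---|:--|"))
print(reverse_table_row("not a table row"))
```

The separator row is reversed the same way as the other rows, so any column alignment markers stay with their columns.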

younes.md

Code used to create the output:

import arabic_reshaper
from RAG_Data_Pipeline.utility.DPUtils import DPUtils

# Load the source markdown as a sequence of lines.
lines = DPUtils.loadTextFile(
    "./tests/resources/ragdatapipeline/markdowngenerator/arabic/younes.md"
)
rtl_lines = ""
for line in lines:
    # Drop the last character of the line (intended to remove the newline),
    # then reshape the remaining text.
    rtl_line = arabic_reshaper.reshape(line[: len(line) - 1])
    rtl_lines += rtl_line + "\n"
# Write the reshaped lines to a new markdown file.
DPUtils.saveTextFile(
    "./tests/resources/ragdatapipeline/markdowngenerator/arabic/rtl_younes.md",
    rtl_lines,
)
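
One thing worth checking: each broken line above is missing exactly its final character (`>` or `)`), which is what the slice `line[: len(line) - 1]` would produce if the loaded lines carry no trailing newline. Whether `DPUtils.loadTextFile` keeps or strips newlines is not shown here, so this is only a possible explanation, not a confirmed cause. A minimal sketch of the difference, using plain Python and hypothetical helper names:

```python
def strip_by_slice(line: str) -> str:
    # Same slice as in the loop above: always drops the final character,
    # whether or not that character is a newline.
    return line[: len(line) - 1]


def strip_newline_only(line: str) -> str:
    # Removes trailing "\n" characters only; every other character is kept.
    return line.rstrip("\n")


with_newline = '<!-- <map "id="FPMap2" "> -->\n'
without_newline = '<!-- <map "id="FPMap2" "> -->'

print(strip_by_slice(with_newline))         # newline removed, comment intact
print(strip_by_slice(without_newline))      # final ">" removed
print(strip_newline_only(without_newline))  # unchanged
```

If the loader does strip newlines, `rstrip("\n")` would be the safer choice, since it leaves lines without a trailing newline untouched.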
