Proposal to Implement Markdown-to-HTML Conversion Tool for LangChain #27132
Minjun1Kim
announced in
Ideas
Feature request
We propose adding a Markdown-to-HTML conversion tool to LangChain. This tool will allow users to convert Markdown documents into HTML, which is useful for anyone working with web content, reports, and online documentation. The tool will support the following Markdown features:
• Headings, lists, code blocks, links, images, and tables
I will implement this Markdown-to-HTML conversion tool in the langchain_community.document_loaders module, as it aligns with existing loaders that handle document types such as Markdown and HTML. Specifically, I plan to create a new class, MarkdownToHtmlLoader, following the structure of UnstructuredMarkdownLoader. The class will be located within the langchain_community.document_loaders package. The logic for the conversion will utilize popular Markdown-to-HTML conversion libraries, ensuring it supports various Markdown features like tables, headings, and lists.
The tool will be designed to seamlessly integrate with LangChain's existing ecosystem, allowing users to load Markdown files, convert them to HTML, and use them in their applications with minimal effort. I will also ensure that the new loader follows LangChain’s established practices for loading and processing documents.
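To make the proposed shape more concrete, here is a minimal sketch of how such a loader might look. It is not the final implementation and rests on several assumptions. The Document dataclass is a stand-in for LangChain's own Document type so that the snippet runs without LangChain installed. The load/lazy_load method names mirror the existing loader convention. The converter parameter and the tiny_markdown_to_html helper are hypothetical: the helper only handles headings and paragraphs and exists purely as a placeholder for a real Markdown-to-HTML library.

```python
import html
import re
from dataclasses import dataclass, field
from pathlib import Path
from typing import Callable, Iterator, List


@dataclass
class Document:
    """Stand-in for LangChain's Document: text content plus metadata."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def tiny_markdown_to_html(text: str) -> str:
    """Hypothetical placeholder converter: ATX headings and paragraphs only.

    A real loader would delegate to a full Markdown library instead.
    """
    blocks = []
    # Blocks are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", text.strip()):
        if not block:
            continue
        heading = re.fullmatch(r"(#{1,6})[ \t]+(.*)", block)
        if heading:
            level = len(heading.group(1))
            title = html.escape(heading.group(2).strip())
            blocks.append(f"<h{level}>{title}</h{level}>")
        else:
            # Collapse internal line breaks and escape raw HTML characters.
            blocks.append(f"<p>{html.escape(' '.join(block.split()))}</p>")
    return "\n".join(blocks)


class MarkdownToHtmlLoader:
    """Sketch of the proposed loader: read a Markdown file, emit HTML."""

    def __init__(
        self,
        file_path: str,
        converter: Callable[[str], str] = tiny_markdown_to_html,
    ) -> None:
        self.file_path = Path(file_path)
        self.converter = converter  # assumption: any str -> str callable

    def lazy_load(self) -> Iterator[Document]:
        text = self.file_path.read_text(encoding="utf-8")
        yield Document(
            page_content=self.converter(text),
            metadata={"source": str(self.file_path)},
        )

    def load(self) -> List[Document]:
        return list(self.lazy_load())
```

Keeping the converter as a plain callable is one possible design choice: it would let the loader stay independent of any single library, so a full converter that covers tables, code blocks, links, and images could be passed in without changing the class. In the actual contribution, the class would presumably subclass LangChain's base loader and return LangChain's own Document objects.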
Motivation
After reviewing the LangChain codebase, I noticed there isn’t a dedicated tool to convert Markdown to HTML. While there is an existing UnstructuredMarkdownLoader for loading Markdown and a ToMarkdownLoader for converting HTML to Markdown, the reverse process (Markdown to HTML) is missing. Having this conversion tool would be extremely beneficial for users who frequently write or store notes and documentation in Markdown.
I personally take class notes, meeting notes, and write documentation directly in raw Markdown, without relying on third-party software like Notion. I often find myself needing to publish these notes on websites or other online platforms that require HTML formatting. Manually converting Markdown to HTML or using external tools adds unnecessary steps to this process, which many users like myself would prefer to avoid.
By offering a built-in Markdown-to-HTML conversion tool, LangChain could streamline this workflow, making it easier for users to generate and publish Markdown content directly to web platforms. This feature would be particularly beneficial for users who manage Markdown files for blogs, documentation sites, reports, or other online content generation, ensuring a smoother transition from Markdown to web-ready HTML.
Proposal (If applicable)
We are a group of students from the University of Toronto, and we’re excited to contribute to LangChain. Our plan is to submit a proposal in the coming weeks, then develop the feature and open a pull request by mid-November. We would be glad to discuss the specifics of how this tool could be integrated into LangChain’s existing document loaders.
Looking forward to your feedback!