feat(pdf): improve pdf markdown rendering #426

Merged
mudler merged 1 commit into main from feat/pdf on Feb 22, 2026

mudler (Owner) commented on Feb 22, 2026

This PR introduces a parser from markdown to the pdf elements so the PDF action can finally generate readable PDFs.

This pull request introduces Markdown support for PDF generation in the GenPDFAction, allowing structured content such as headings, lists, code blocks, and tables to be rendered in the output PDF. The action now parses and renders Markdown content, with comprehensive tests added to verify correct rendering and handling of special characters.

Markdown rendering enhancements

  • Added Markdown parsing using gomarkdown/markdown libraries in genpdf.go, enabling structured content rendering in PDFs.
  • Updated the PDF generation logic to parse content as Markdown and render it appropriately, falling back to plain text if parsing fails.
  • Improved the action definition to clarify that Markdown content (headings, bold, lists, code blocks, etc.) is supported and rendered in the PDF.

Test coverage improvements

  • Added tests in genpdf_test.go to verify PDF generation with Markdown content, special characters, and tables, ensuring correct rendering and file output.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Copilot AI review requested due to automatic review settings February 22, 2026 21:42
@mudler mudler merged commit 61a89aa into main Feb 22, 2026
3 checks passed
@mudler mudler deleted the feat/pdf branch February 22, 2026 21:42
Copilot AI (Contributor) left a comment

Pull request overview

Adds Markdown-to-PDF rendering support to GenPDFAction so generated PDFs can represent structured content (headings, lists, code blocks, tables) instead of plain text.

Changes:

  • Parse content as Markdown using gomarkdown/markdown and render the AST into gofpdf.
  • Introduce a dedicated Markdown renderer (genpdf_markdown.go) with table and inline formatting support.
  • Add tests intended to cover Markdown content, special characters, and tables for PDF generation.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| services/actions/genpdf.go | Parses content as Markdown and routes rendering through the new Markdown-to-PDF renderer; updates action description. |
| services/actions/genpdf_markdown.go | New Markdown AST walker/renderer for gofpdf (headings, lists, code blocks, tables, inline styles). |
| services/actions/genpdf_test.go | Adds new test cases for Markdown content, special characters, and tables. |

Comment on lines +380 to +407
left, _, _, _ := pdf.GetMargins()
words := strings.Fields(s)
for i, word := range words {
	wordW := pdf.GetStringWidth(word)
	spaceW := 0.0
	if i > 0 {
		spaceW = pdf.GetStringWidth(" ")
	}
	x := pdf.GetX()
	// If this word (and preceding space) would overflow, start a new line first.
	if i > 0 {
		if x+spaceW+wordW > maxW && x > left {
			pdf.Ln(lineHt)
			x = pdf.GetX()
		} else {
			pdf.CellFormat(spaceW, lineHt, " ", "", 0, "", false, 0, "")
			x = pdf.GetX()
		}
	} else if wordW > 0 && x+wordW > maxW && x > left {
		pdf.Ln(lineHt)
		x = pdf.GetX()
	}
	// Single word longer than line width: use MultiCell so it wraps.
	if wordW > maxW-left {
		pdf.MultiCell(0, lineHt, word, "", "", false)
	} else {
		if x+wordW > maxW && x > left {
			pdf.Ln(lineHt)

Copilot AI Feb 22, 2026

cellWrap treats maxW as if it were an absolute X coordinate (it compares pdf.GetX()+... > maxW), but maxW is computed as a width (pageW - left - right). This makes wrapping occur too early (effectively subtracting the left margin twice) and can also mis-handle the long-word branch (wordW > maxW-left). Consider passing an absolute right boundary (e.g., maxX := left + maxW) or computing available width from the current X position when wrapping.
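
The reviewer's suggestion can be illustrated without the PDF library. The following is a minimal sketch, not the PR's `cellWrap`: `wrapLines` and `byteWidth` are hypothetical names, and `byteWidth` stands in for a real font metric such as `pdf.GetStringWidth`. The key point is that the usable width `maxW` is converted once into an absolute right boundary (`maxX := left + maxW`), and the cursor position is compared against that boundary.

```go
package main

import (
	"fmt"
	"strings"
)

// byteWidth is a stand-in for a real font metric: every byte counts as one
// unit of width. Hypothetical, for illustration only.
func byteWidth(s string) float64 { return float64(len(s)) }

// wrapLines greedily packs words into lines. left is the left margin and
// maxW is the usable width (page width minus both margins). The cursor x is
// an absolute coordinate, so it is compared against left+maxW, not maxW.
func wrapLines(text string, left, maxW float64, measure func(string) float64) []string {
	maxX := left + maxW // absolute right boundary
	spaceW := measure(" ")

	var lines []string
	var cur []string
	x := left
	for _, word := range strings.Fields(text) {
		advance := measure(word)
		if len(cur) > 0 {
			advance += spaceW // a space precedes every word but the first
		}
		// Break only when the line already holds a word; a word wider than
		// the whole line is placed alone rather than looping forever.
		if x+advance > maxX && len(cur) > 0 {
			lines = append(lines, strings.Join(cur, " "))
			cur = nil
			x = left
			advance = measure(word)
		}
		cur = append(cur, word)
		x += advance
	}
	if len(cur) > 0 {
		lines = append(lines, strings.Join(cur, " "))
	}
	return lines
}

func main() {
	// Left margin 10, usable width 11: the right boundary is X = 21.
	lines := wrapLines("aaa bbb ccc ddd", 10, 11, byteWidth)
	fmt.Printf("%q\n", lines) // ["aaa bbb ccc" "ddd"]
}
```

Comparing `x` against `maxW` directly (11 in this example) would break after every word, because the cursor already starts at X = 10; that is the early wrapping described in the comment.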

	pdf.Ln(pdfBlockMargin)
case *ast.HorizontalRule:
	pdf.Ln(pdfBlockMargin)
	pdf.Line(pdf.GetX(), pdf.GetY(), pdf.GetX()+190, pdf.GetY())

Copilot AI Feb 22, 2026

The horizontal rule is drawn with a hard-coded width (+190), which ignores the configured page size and margins and can render past the right margin (or be too short if margins/page size differ). Consider computing the end X using the page width and right margin (or using pdf.GetPageSize() / current margins) instead of a constant.

Suggested change
- pdf.Line(pdf.GetX(), pdf.GetY(), pdf.GetX()+190, pdf.GetY())
+ x := pdf.GetX()
+ _, _, right, _ := pdf.GetMargins()
+ pageWidth, _ := pdf.GetPageSize()
+ endX := pageWidth - right
+ pdf.Line(x, pdf.GetY(), endX, pdf.GetY())

Comment on lines +62 to +66
pdf.SetFont("Arial", "", 12)
pdf.CellFormat(8, pdfLineHeight, bullet, "", 0, "", false, 0, "")
for inner := ast.GetFirstChild(item); inner != nil; inner = ast.GetNextNode(inner) {
	renderBlock(pdf, tr, inner)
}

Copilot AI Feb 22, 2026

List items render the bullet with a fixed-width CellFormat, then rely on cellWrap/pdf.Ln() for wrapping. Since Ln() resets X to the left margin, wrapped lines of a long list item will start at the page left margin rather than aligning under the text after the bullet (and can visually collide with the bullet area). Consider introducing an explicit indent (e.g., increase left margin / set X after Ln) while rendering the list item's content so wrapped lines remain aligned.
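
One way to get the hanging indent the comment describes is to raise the left margin while the item's content is rendered and restore it afterwards. The sketch below is an illustration under assumptions, not the PR's code: `marginPDF`, `withHangingIndent`, and `fakePDF` are hypothetical names, and the interface methods are assumed to mirror gofpdf's `GetMargins`, `SetLeftMargin`, and `SetX`. A small fake stands in for the PDF object so the example runs on its own.

```go
package main

import "fmt"

// marginPDF is a hypothetical subset of the PDF writer's API. The method
// names are assumed to match gofpdf's GetMargins, SetLeftMargin and SetX.
type marginPDF interface {
	GetMargins() (left, top, right, bottom float64)
	SetLeftMargin(margin float64)
	SetX(x float64)
}

// withHangingIndent shifts the left margin right by indent while render
// runs, so any line break inside render returns to the indented position.
// The original margin is restored when render finishes.
func withHangingIndent(pdf marginPDF, indent float64, render func()) {
	left, _, _, _ := pdf.GetMargins()
	pdf.SetLeftMargin(left + indent)
	pdf.SetX(left + indent)
	defer pdf.SetLeftMargin(left)
	render()
}

// fakePDF is a test double that tracks only the left margin and cursor X.
type fakePDF struct{ left, x float64 }

func (f *fakePDF) GetMargins() (float64, float64, float64, float64) { return f.left, 0, 0, 0 }
func (f *fakePDF) SetLeftMargin(margin float64)                     { f.left = margin }
func (f *fakePDF) SetX(x float64)                                   { f.x = x }

// Ln mimics a line break: X returns to the current left margin.
func (f *fakePDF) Ln() { f.x = f.left }

func main() {
	pdf := &fakePDF{left: 10, x: 10}
	withHangingIndent(pdf, 8, func() {
		pdf.Ln() // a wrapped line inside the list item
		fmt.Println("wrapped line starts at", pdf.x)
	})
	fmt.Println("margin restored to", pdf.left)
}
```

In the renderer, the bullet cell would be drawn before the call and the loop over the item's children would go inside `render`, with `indent` equal to the bullet cell width (8 in the quoted code).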

Comment on lines +165 to +179
It("generates PDF with markdown content and renders structure", func() {
	content := "# Section\n\n**Bold** and *italic* and `code`.\n\n- Item one\n- Item two"
	result, err := action.Run(ctx, sharedState, types.ActionParams{
		"content": content,
	})

	Expect(err).ToNot(HaveOccurred())
	Expect(result.Result).To(ContainSubstring("PDF generated and saved to:"))
	paths := result.Metadata[actions.MetadataPDFs].([]string)
	Expect(paths).To(HaveLen(1))
	Expect(paths[0]).To(BeAnExistingFile())
	info, err := os.Stat(paths[0])
	Expect(err).ToNot(HaveOccurred())
	Expect(info.Size()).To(BeNumerically(">", 0))
})

Copilot AI Feb 22, 2026

These new tests only assert that a non-empty PDF file is produced; they don't verify that Markdown is actually rendered (e.g., headings/lists are not left as raw #, **, backticks) or that special characters round-trip correctly. This means the tests can pass even if rendering regresses. Consider extracting text from the generated PDF (there is already a github.com/dslipak/pdf dependency in go.mod) and asserting expected output/content for each case.
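
A stronger assertion could check the extracted text for Markdown syntax that should have been consumed by the renderer. The sketch below covers only that check and assumes the text has already been extracted from the generated PDF (for example with the `github.com/dslipak/pdf` dependency the comment mentions; that step is not shown). `leftoverMarkdown` and `markerChecks` are hypothetical names, and the patterns are a deliberately small set: list bullets are left out because the rendered bullet glyph is not known from the excerpt.

```go
package main

import (
	"fmt"
	"regexp"
)

// markerCheck pairs a human-readable name with a pattern for raw Markdown
// syntax that should not appear in rendered output.
type markerCheck struct {
	name string
	re   *regexp.Regexp
}

// markerChecks is an illustrative, non-exhaustive list.
var markerChecks = []markerCheck{
	{"heading", regexp.MustCompile(`(?m)^ {0,3}#{1,6} `)},
	{"bold", regexp.MustCompile(`\*\*[^*\n]+\*\*`)},
	{"inline code", regexp.MustCompile("`[^`\\n]+`")},
}

// leftoverMarkdown returns the names of all raw Markdown constructs still
// present in text, in the order they are listed in markerChecks.
func leftoverMarkdown(text string) []string {
	var found []string
	for _, c := range markerChecks {
		if c.re.MatchString(text) {
			found = append(found, c.name)
		}
	}
	return found
}

func main() {
	// Text as it might look if the renderer left Markdown untouched.
	fmt.Println(leftoverMarkdown("# Section\n**Bold** and `code`"))
	// Text as it might look after correct rendering.
	fmt.Println(leftoverMarkdown("Section\nBold and italic and code."))
}
```

In the Ginkgo suite this could become `Expect(leftoverMarkdown(text)).To(BeEmpty())`, alongside `ContainSubstring` assertions for the expected words such as "Section" and "Item one".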
