pandoc 3.8 #11116
jgm
announced in
Announcements
Replies: 1 comment
-
Unofficial Linux/RISC-V 64-bit ( |
Add a new input and output format xml, exactly representing a Pandoc AST and isomorphic to the existing native and json formats (massifrg). XML schemas for validation can be found in tools/pandoc-xml.*. The format is documented in doc/xml.md. Pandoc now defaults to this reader and writer when the .xml extension is used.
Two new exported modules are added [API change]: Text.Pandoc.Readers.XML, exporting readXML, and Text.Pandoc.Writers.XML, exporting writeXML. A new unexported module Text.Pandoc.XMLFormat is also added.
Add a new command line option --syntax-highlighting; this takes the values none, default, idiomatic, a style name, or a path to a theme file. It replaces the --no-highlight, --highlight-style, and --listings options, which will still work but with a deprecation warning. (Albert Krewinkel)
Create directory of output file if it doesn’t exist (Output directory existence #11040).
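The directory-creation behavior in the last entry can be pictured with a small Python sketch. This is not pandoc's Haskell implementation; the helper name write_output is hypothetical and only illustrates the idea of creating the output file's missing parent directories before writing.

```python
from pathlib import Path


def write_output(path: str, data: bytes) -> None:
    """Write data to path, creating missing parent directories first.

    Hypothetical helper for illustration only; not part of pandoc.
    """
    target = Path(path)
    # parents=True creates intermediate directories as needed;
    # exist_ok=True makes the call a no-op when they already exist.
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(data)
```

Calling write_output("build/html/out.html", b"...") would succeed even if build/html does not exist yet, which mirrors what the entry describes for pandoc's -o option.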
Update --version copyright dates (update copyright note #10961), and use a hardcoded string “pandoc” for the program name in --version, per GNU guidelines.
Add smart_quotes and special_strings extensions (Albert Krewinkel). Currently these only affect org. Org mode makes a distinction between smart parsing of quotes and smart parsing of special strings like "...". The finer-grained control over these features is necessary to faithfully reproduce Emacs Org mode behavior. Special strings are enabled by default, while smart quotes are disabled.
Remove the old compact_definition_lists extension. This was needed to preserve backwards compatibility after pandoc 1.12 was released, but at this point we can get rid of it.
Make -t chunkedhtml -o - output to stdout (as documented), rather than creating a directory called - (When specifying -o - for chunkedhtml - is treated as a directory instead of piping the zip file to stdout #11068).
RST reader: Support multiple header rows (RST: Simple Tables: multiple header rows not supported #10338, TuongNM).
LaTeX reader:
\minisec as unlisted level 6 headings (Loss of the heading of \minisec{heading} in the Latex -> HTML conversion #10635, Albert Krewinkel).
\ifmmode (ifmmode #10915).
align or equation. Previously we “downshifted” these, parsing an align environment as a Math element with aligned, and an equation environment as a regular display math element. With this shift, we put these in Math inlines but retain the original environments. texmath and MathJax both handle these environments well.
Typst reader:
smallcaps to be applied to block-level content like headings. This produces a type mismatch in pandoc, so before processing the output of typst-hs, we transform it, pulling the block-level elements outside of the inline-level elements.
Org reader:
;.
HTML reader:
pre element (Eating up first newline in HTML <pre> tag #11064).
DocBook reader:
POD reader:
Man reader:
Markdown reader:
@foo [test]{.bar}. See Pandoc's citeproc gets confused by the presence of footnotes #9080 (comment).
four_space_rule extension is not enabled, figure out the indentation needed for child blocks dynamically, by looking at the first nonspace content after the : marker. Previously the four-space rule was always obeyed.
ODT reader:
table-header-rows (Tuong Nguyen Manh).
Docx reader:
stringToInteger (docx+styles reader treats "Heading 1 Lorem ipsum" style as "Heading 1" #9184). It previously converted things like 11ccc to an integer; now it requires that the whole string be parsable as an integer.
LaTeX writer:
latex-placement attribute is present on a figure, it will be used as the optional positioning hint in LaTeX (e.g. ht). With implicit figures, latex-placement will be added to the figure (and removed from the image) if it is present on the image.
\cancel, \bcancel, or \xcancel.
title-meta (Conversion to PDF/A #10501). This is needed to prevent PDFs from interpreting this as a sequence of titles.
pdf-trailer-id if SOURCE_DATE_EPOCH envvar is set (Reproducible Markdown to PDF conversion #6539, Albert Krewinkel). The SOURCE_DATE_EPOCH environment variable is used to trigger reproducible PDF compilation, i.e., PDFs that are identical down to the byte level for repeated runs.
\url (Unusual characters in links can lead to "missing character", even if the character has a glyph in the main font #8802). We only use it when the URL is all ASCII, since the \url macro causes problems when used with some non-ASCII characters.
align) are found in Math elements, we emit them “raw” instead of putting them in $..$.
Typst writer:
XID_Continue in identifiers (Tuong Nguyen Manh).
nocite (nocite support for Typst #10680, Albert Krewinkel). The nocite metadata field can now be used to supply additional citations that don’t appear in the text, just as with citeproc and LaTeX’s bibtex and natbib.
lang attribute in Divs (Support language-settings on divs in Typst #10965).
numbering variable to section-numbering (Albert Krewinkel). This is the name expected by the default template.
abstract-title for Typst format #9724).
Org writer:
Markdown writer:
sourceCode class be removed using -t gfm-raw_html? #10926). Omit the wrapper sourceCode divs added by pandoc around code blocks. More intelligently identify which class to use for the one class allowed in GFM code blocks. If there is a class of the form language-X, use X; otherwise use the first class other than sourceCode.
DocBook writer:
startingnumber instead of override for start numbers on ordered lists (Docbook reader ignores orderedlist startingnumber attribute #10912).
ANSI writer:
--wrap=none work properly (--wrap=none does not work with ansi output format #10898).
Djot writer:
Docx writer:
HTML writer:
<div> instead of applying to the existing block element #11014). Some of the readers (e.g. djot) add “wrapper” divs to hold attributes for elements that have no slot for attributes in the pandoc AST. The HTML reader now “unwraps” these wrappers so that the attributes go on the intended elements.
Asciidoc writer:
HTML styles template: prefix default styles with informative CSS comment (Albert Krewinkel, Add a comment in the default HTML style template with some instruction on how to disable it and a link to the documentation #8819).
Org template: add #+options lines if necessary (Albert Krewinkel). The default template now adds #+options lines if non-default settings are used for the smart_quotes and special_strings extensions.
LaTeX template:
linkcolor= in hypersetup (toccolor needed when Documentclass is declared #11098).
Typst template:
thanks, abstract-title, linestretch, mathfont, codefont, linkcolor, filecolor, citecolor.
reference.docx:
styles.xml to minorEastAsia (TomBen).
styles.xml for East Asia to Simplified Chinese (TomBen).
Text.Pandoc.PDF:
makePDF: automatically embed resources from media bag in HTML before trying to convert it with weasyprint, etc. (--embed-resources is not respected in PDF conversion with WeasyPrint #11099). This will give better results when converting from formats like docx.
utf8ToText for LaTeX log messages.
pdflatex-dev and lualatex-dev as PDF engines (Allow lualatex-dev as PDF engine #10991, Albert Krewinkel). These are the development versions of the LaTeX binaries; installable, e.g., with tlmgr install latex-base-dev.
makePDF (Albert Krewinkel).
Text.Pandoc.Readers:
ods, odp, odf, xls, xlsx, zip extensions.
Text.Pandoc.App:
pandoc-cli/src/pandoc.hs had similar code for generating version information. To avoid duplication, we now export versionInfo from Text.Pandoc.App [API change]. This function has three parameters that can be filled in when it is called by pandoc-cli.
Text.Pandoc.Parsing:
tableWith and tableWith' now return a list of lists of Blocks, rather than a list of Blocks, for the header rows, allowing for multiple header rows [API change] (RST: Simple Tables: multiple header rows not supported #10338, TuongNM).
Text.Pandoc.Citeproc:
--citeproc to put the bibliography in a Div with id refs even when --file-scope is used (file-scope messes with placement of references div #refs #11072). When --file-scope is used, a prefix will be added based on the filename, so the Div will end up having an identifier like myfile.md__refs. Previously, this prevented the bibliography from being added to the marked Div. Now pandoc will add the bibliography to any Div with the id refs or any id ending in __refs.
Text.Pandoc.Citeproc.BibTeX: Protect case in periodical titles (Journal title casing ignore protecting `{..}` #11048). Thus, for example, {npj} Quantum Information should translate as [npj]{.nocase} Quantum Information.
Text.Pandoc.ImageSize:
pandoc.image.size errors out on JPG #11049).
pandoc.image.size fails on AVIF images #10979).
Text.Pandoc.Writers.Shared:
lookupMeta... functions (lookupMetaBlocks allows loss of data without error reporting #10634, Albert Krewinkel).
Text.Pandoc.Options:
defaultWebTeXURL WebTeX URL [API change] (Add default WebTeX URL to Options.hs [API change] #11029, Sean Soon). This fixes the webtex option when used without a parameter in a defaults file.
HighlightMethod and patterns [API Change] (Albert Krewinkel).
The writerListings and writerHighlightStyle fields of the WriterOptions type are replaced with writerHighlightMethod [API change] (Albert Krewinkel, Rework syntax highlighting options #10525).
Text.Pandoc.Extensions:
Ext_compact_definition_lists constructor for Extension [API change].
Ext_smart_quotes and Ext_special_strings constructors [API change].
Text.Pandoc.SelfContained:
pandoc.mediabag.fetch fails to read paths containing hash symbol (#) #11021.
Text.Pandoc.Highlighting:
defaultStyle [API Change] (Albert Krewinkel). This makes it possible to be more explicit about using a default style, and provides a single point of truth for its value. The variable is an alias for pygments.
Text.Pandoc.Class:
downloadOrRead: do not drop fragment/hash for local file paths (pandoc.mediabag.fetch fails to read paths containing hash symbol (#) #11021). With the previous behavior it was impossible to have an image file containing # or ?.
runSilently [API Change] (Albert Krewinkel). The function runs an action in the PandocMonad, but returns all log messages reported by that action instead of adding them to the main log.
getRequestHeaders, setRequestHeaders, getSourceURL, getTrace [API change].
stManager field. This allows us to cache the HTTP client manager and reuse it for many requests, instead of creating it again (an expensive operation) for each request. This fixes a memory leak and performance issue in files with a large number of remote images (conver md to docx,out of memory #10997).
Lua subsystem (Albert Krewinkel):
pandoc.structure.unique_identifier.
pandoc.text.superscript and subscript.
PANDOC_STATE is no longer a userdata object, but a table that behaves like the old object. Log messages in PANDOC_STATE.log are now in temporal order.
pandoc.path.exists.
normalize function to Pandoc objects (Integrate Text.Pandoc.Builder with lua filter infrastructure #10356). This function performs a normalization of Pandoc documents. E.g., multiple successive spaces are collapsed, and tables are normalized such that all rows and columns contain the same number of cells.
pandoc.system. Functions that expect UTF-8-encoded filenames should make it easier to write platform-independent scripts, as the encoding of the actual filename depends on the system. In addition, there is a new generalized method to run commands, and functions to retrieve XDG directory names. The new functions are command, copy, read_file, remove, rename, times, write_file, xdg.
pandoc.system.list_directory (Make return value of pandoc.system.list_directory a List #11032).
MANUAL.txt:
xml as input/output format.
doc/lua-filters:
pandoc.Cite (Albert Krewinkel).
walk methods #10995). Use the Pandoc:walk method instead.
doc/extras.md: Fix link to pandoc-mode (Erik Post).
doc/lua-filters.md: Add example on using pandoc.Table constructor (Add example in documentation on using pandoc.Table constructor. #10956, Sean Soon).
Update default.csl from new chicago-author-date.csl, which is now for the 18th edition.
Use latest releases of citeproc, typst-hs, texmath, doclayout, skylighting-core, skylighting.
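Three rules stated in the changelog above are precise enough to sketch in code: the Docx reader's stricter stringToInteger, the Markdown writer's choice of a single class for GFM code blocks, and the Citeproc rule for recognizing the bibliography Div. The Python below is only an illustration of those rules as worded in the entries. It is not pandoc's Haskell code, all three function names are hypothetical, and restricting integers to unsigned ASCII digits is an assumption made for the sketch.

```python
from typing import List, Optional


def whole_string_to_int(text: str) -> Optional[int]:
    """Return an int only if the whole string is an integer, else None.

    Assumption for this sketch: only unsigned ASCII digits count.
    """
    if text.isascii() and text.isdigit():
        return int(text)
    return None


def gfm_code_class(classes: List[str]) -> Optional[str]:
    """Pick the one class to emit on a GFM code block.

    Prefer X from a class of the form language-X; otherwise use the
    first class other than sourceCode.
    """
    prefix = "language-"
    for cls in classes:
        if cls.startswith(prefix) and len(cls) > len(prefix):
            return cls[len(prefix):]
    for cls in classes:
        if cls != "sourceCode":
            return cls
    return None


def is_refs_div_id(identifier: str) -> bool:
    """True for the id refs or any id ending in __refs."""
    return identifier == "refs" or identifier.endswith("__refs")
```

Under these definitions, whole_string_to_int("11ccc") gives None rather than an integer, gfm_code_class(["sourceCode", "language-python"]) gives "python", and is_refs_div_id("myfile.md__refs") is true, matching the --file-scope case described in the Text.Pandoc.Citeproc entry.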
This discussion was created from the release pandoc 3.8.