A syllabus framework for PDF and HTML

The goal

The goal is to generate a printable PDF and an HTML syllabus from the same set of files. The PDF has to look nice and it has to have a generated bibliography; the HTML has to be clean and simple in order to support good CSS styling in whatever website setup I am working with.

I figured out how to do this using xelatex, biblatex and biber, pandoc, and tex4ht. Nothing about this process is particularly complicated, but a surprising number of minor fiddly steps are needed to reach the goal using these tools. TeX is both amazingly capable and amazingly finicky. So I'm making my setup available to save others who use LaTeX and markdown the labor of reinventing all those steps as they reach for the same goal.

Requirements

Look, this is just my jury-rigged personal system. I have no idea if anyone else can make it work. However, it is in principle cross-platform, and I would be glad if other techie humanists were inspired to make it go. I can at least tell you that it requires a modern TeX distribution (TeX Live 2011 or later should do it), which will include xelatex, biblatex, biber, and tex4ht, as well as all the LaTeX packages invoked in the source files here. I use a font you may well not have, so you'll have to tinker with fonts. It also requires pandoc. Finally, it scripts pandoc using Haskell, which means (I think it is safe to say) that this is currently the only publicly available humanities syllabus web framework with a ghc dependency. I installed ghc on my Mac using Homebrew. Please get in touch with me if you'd like a pre-built OS X binary instead. At some point I may rewrite the Haskell in something friendlier. Or you can. It doesn't do anything fancy.

The approach

Files, lots of files.

Each section of the syllabus is in a markdown source file: overview.md, reqs.md, grading.md, and so on. Put the title of the section at the top of its file as a level 1 header (# header1). The marvelous pandoc is used to transform markdown into LaTeX. I use biblatex \cite commands freely in the sources; pandoc leaves these unchanged in the LaTeX. (Pandoc-markdown has its own citation syntax, which relies on citeproc-hs and CSL, but these are not adequate for humanities citation.)

The ordering and layout of the PDF syllabus is controlled by syllabus.tex. The preamble to this file does the layout heavy lifting. The body of the document includes all the .tex files generated by pandoc with \input.

The HTML file syllabus.html is generated by a slightly Byzantine sequence of steps relying on the power of tex4ht. tex4ht rather than pandoc is needed to handle biblatex commands. Unfortunately tex4ht's completeness means its output is not clean or simple, so some additional processing is needed. tex4ht is applied to the file syllabus-4ht.tex, which is a lot like syllabus.tex without all the extra layout code in the preamble (and without the xelatex dependency). It includes the syllabus section .tex files. (Well, it includes slightly processed versions of those files, which are generated automatically.) However, you could edit syllabus-4ht.tex any way you wanted. The additional processing is done partly by tex4ht itself (controlled by syllabus-4ht.cfg) and partly by a Haskell script using the Pandoc Haskell library (html_clean). This script also decrements header levels (h2 becomes h1, etc.).
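
To make the header-decrementing step concrete, here is a minimal sketch in Python. It is not the html_clean script itself, which works on the document through the Pandoc Haskell library; this stand-in uses a regular expression on the raw HTML instead, and the function name decrement_headers is made up for illustration. It assumes the headings are ordinary h1 to h6 tags and leaves h1 alone, since there is nothing above it.

```python
import re

# Matches the start of an opening or closing heading tag, e.g. <h2 class="x"> or </h3>.
# The \b keeps tags such as <hr> or <html> from matching.
_HEADING_TAG = re.compile(r"<(/?)h([1-6])\b", re.IGNORECASE)


def decrement_headers(html: str) -> str:
    """Promote every heading one level (h2 -> h1, h3 -> h2, ...); h1 stays h1."""

    def shift(match: re.Match) -> str:
        slash, level = match.group(1), int(match.group(2))
        # Tag names are written back in lowercase.
        return f"<{slash}h{max(level - 1, 1)}"

    return _HEADING_TAG.sub(shift, html)


fragment = '<h2 class="sectionHead">Overview</h2><p>Text</p><h3>Readings</h3>'
print(decrement_headers(fragment))
```

A regex is good enough for the flat, predictable markup of a syllabus; working on a parsed document, as the Haskell script does, is the more robust choice if the HTML gets complicated.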

syllabus.html is just the contents of an HTML body, not a well-formed document. This is by design, since in my current configuration I am (sigh) pasting the HTML into a WordPress edit box. If you have a more respectable upload strategy, you will want to add yet another postprocessing step that wraps syllabus.html in the necessary <html>, <head>, <body> (etc.) tags.
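
If you do need a complete document, the extra wrapping step could look something like the following Python sketch. Nothing like it ships with this framework: the function name wrap_fragment, the UTF-8 charset, and the default lang value are all assumptions you should adjust to your own site.

```python
import html


def wrap_fragment(body: str, title: str, lang: str = "en") -> str:
    """Wrap an HTML body fragment in a minimal HTML5 document."""
    return (
        "<!DOCTYPE html>\n"
        f'<html lang="{html.escape(lang)}">\n'
        "<head>\n"
        # Assumption: the fragment is saved as UTF-8.
        '<meta charset="utf-8">\n'
        # The title is plain text, so escape &, <, > before embedding it.
        f"<title>{html.escape(title)}</title>\n"
        "</head>\n"
        "<body>\n"
        # The fragment is already HTML and is inserted unchanged.
        f"{body.rstrip()}\n"
        "</body>\n"
        "</html>\n"
    )


print(wrap_fragment("<h1>Overview</h1>\n<p>Welcome.</p>\n", "Syllabus"))
```

In practice you would read syllabus.html, pass its contents as body, and write the result to a new file, leaving syllabus.html itself as the pasteable fragment.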

The "interface"

The tediousness of the "build" process cries out for a Makefile to automate it. I have supplied one.

make syllabus.pdf makes the PDF using pandoc, xelatex, and biber.

make syllabus.html makes the HTML. It generates preprocessed syllabus-section TeX files under the directory 4ht/. It runs pdflatex on syllabus-4ht.tex to generate some needed auxiliary files (you also get a uselessly plain syllabus-4ht.pdf, which you can ignore), then runs tex4ht on syllabus-4ht.tex, which makes an intermediate file syllabus-4ht.html, and finally postprocesses that into syllabus.html.

make clean cleans up the truly astonishing number of intermediate files this process generates.

make publish is left for you to fill in as needed. Mine just copies syllabus.html to the OS X clipboard.

make html_clean compiles the Haskell Pandoc script html_clean.hs that I use to make some needed extra tweaks.
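
For anyone who would rather not read a Makefile, the PDF build can be sketched in Python as below. This is a guess at the usual pandoc, xelatex, and biber sequence, not a transcription of the Makefile: the exact options and the number of xelatex passes are assumptions, and the names pdf_build_commands and build_pdf are invented for this example.

```python
import subprocess


def pdf_build_commands(sections, stem="syllabus"):
    """Return the command lines for one full PDF build, in order."""
    commands = [
        # pandoc passes raw LaTeX such as \cite{...} through to the .tex output
        ["pandoc", "-f", "markdown", "-t", "latex", "-o", f"{name}.tex", f"{name}.md"]
        for name in sections
    ]
    commands += [
        ["xelatex", stem],  # first pass: records which citations are needed
        ["biber", stem],    # resolves them against the bibliography database
        ["xelatex", stem],  # second pass: typesets citations and bibliography
        ["xelatex", stem],  # third pass: settles remaining cross-references
    ]
    return commands


def build_pdf(sections, stem="syllabus"):
    """Run the build, stopping at the first command that fails."""
    for command in pdf_build_commands(sections, stem):
        subprocess.run(command, check=True)


for command in pdf_build_commands(["overview", "reqs", "grading"]):
    print(" ".join(command))
```

Calling build_pdf(["overview", "reqs", "grading"]) would actually run the tools, which of course requires them to be installed and the source files to be present. Unlike make, this sketch rebuilds everything every time.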

An alternate "publish" strategy

In the past I have also used Jekyll with a very similar set of files. The strategy for making the PDF with LaTeX remains the same, and the strategy for making the website is to run jekyll. There is one subtlety: your syllabus-section .md files have to have simple YAML headers for Jekyll, and you should strip these away with a perl script or similar before producing the LaTeX files used to make syllabus.pdf.
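
The header-stripping step is small enough to sketch here. The version below is in Python rather than perl and is only an illustration: the function name strip_front_matter is made up, and it assumes the YAML header, if present, starts on the very first line and is delimited by lines containing only three hyphens.

```python
def strip_front_matter(text: str) -> str:
    """Return markdown text without a leading YAML front-matter block."""
    lines = text.splitlines(keepends=True)
    if not lines or lines[0].strip() != "---":
        return text  # no front matter at the top of the file
    for i, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            # Keep everything after the closing delimiter.
            return "".join(lines[i + 1:])
    return text  # opening delimiter never closed: leave the file untouched


source = "---\ntitle: Overview\nlayout: default\n---\n# Overview\n\nCourse description.\n"
print(strip_front_matter(source), end="")
```

The stripped text is what you would then hand to pandoc to produce the section's .tex file.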

Jekyll has the advantage of allowing you to make a site instead of just a page, with separate pages for each section, and a very simple, fairly flexible framework for customizing the styling with included snippets of code and Liquid templating. Jekyll's blogging feature has a natural application as course announcements. I haven't supplied a Jekyll setup here because I moved to WordPress sites for my courses before I hacked up the biblatex-processing steps. To get biblatex citations working with Jekyll, any syllabus section making use of biblatex would need to be run through tex4ht first (you would need a stub document file like syllabus-4ht.tex for each such source). Then take the resulting HTML file, clean it up as I do here, stick a YAML header on it, and feed that to Jekyll (which is perfectly happy to operate on HTML as well as markdown).
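
Sticking a YAML header on a cleaned-up HTML file might look like this Python sketch. It is untested against a real Jekyll site: add_front_matter is a hypothetical name, and the layout name "default" is a placeholder for whatever layout your site defines.

```python
import json


def add_front_matter(html_body: str, title: str, layout: str = "default") -> str:
    """Prefix an HTML fragment with a minimal Jekyll YAML front-matter block."""
    header = (
        "---\n"
        f"layout: {layout}\n"
        # json.dumps gives a double-quoted string, which is also valid YAML,
        # so titles containing colons or quotes do not break the header.
        f"title: {json.dumps(title)}\n"
        "---\n"
    )
    return header + html_body


print(add_front_matter("<h1>Overview</h1>\n", "Author, Reader, Field: Overview"), end="")
```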

However, a Jekyll site is a static site, with no user accounts or in-browser editing, so it can never be a multi-user site, which is why I switched to WordPress. Unfortunately, since I only have browser access to my WordPress sites myself right now, I can't upload my shiny HTML programmatically. That's why the publish target in the Makefile just copies syllabus.html to the OS X clipboard. You may be able to do better. If you have a whole website, of course, as you would with Jekyll, copy and paste is out of the question. When I was Jekyll'ing I had ssh access to my web server and used rsync.

A complete example

In example/ I've provided the sources I used to produce a Spring 2013 Rutgers University syllabus of mine for English 596, Author, Reader, Field. See example/README.md.

Getting just the syllabus template and not the rest of agoldst/tex

This is in my omnibus TeX-stuff repo, because because. If you want only this directory as a repo, well then, you just have to remember the magical git commands for stripping away all of a repo except for a subdirectory. I don't, but google holds the answers; see under "Making a Subdirectory the New Root" in Pro Git, chapter 6, section 4.

git clone https://github.com/agoldst/tex.git
cd tex
git filter-branch --subdirectory-filter syllabus HEAD