David Chiang edited this page Apr 15, 2019 · 6 revisions

Much of the structure of the XML format is specified by the RELAX NG schema (data/xml/schema.{rnc,rng}) and can be validated automatically. This document describes the structure less formally and also describes aspects of the format that aren't specified by the schema.

Structure

The root element is <volume id="X99">, where X is replaced by the one-letter code for the venue and 99 is replaced by the last two digits of the year.

The <volume> element has child elements <paper id="9999">, where 9999 is replaced by the four-digit paper identifier. For some venues (LREC), there is also an href attribute for the external URL of the paper.

Each <paper> element has several child elements:

  • <title>: The title (see below for more details)
  • <author>: The authors (see below for more details)
  • <editor>: The editors (see below for more details)
  • and others.
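As a rough illustration of the structure described above, the following sketch parses a small volume with Python's standard library and checks the shape of the identifiers. The sample volume, the function name check_ids, and the assumption that the venue code is an uppercase ASCII letter are all hypothetical; the RELAX NG schema remains the authoritative check.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample volume; ids, titles, and the URL are made up for illustration.
SAMPLE = """<volume id="P18">
  <paper id="1001">
    <title>An Example Title</title>
  </paper>
  <paper id="1002" href="http://example.org/1002.pdf">
    <title>Another Example</title>
  </paper>
</volume>"""

# Assumption: the one-letter venue code is an uppercase ASCII letter.
VOLUME_ID = re.compile(r"[A-Z][0-9]{2}")
PAPER_ID = re.compile(r"[0-9]{4}")

def check_ids(xml_text):
    """Return a list of problems found in the root element and the ids."""
    root = ET.fromstring(xml_text)
    problems = []
    if root.tag != "volume":
        problems.append(f"root element is <{root.tag}>, expected <volume>")
    if not VOLUME_ID.fullmatch(root.get("id", "")):
        problems.append(f"bad volume id: {root.get('id')!r}")
    for paper in root.findall("paper"):
        if not PAPER_ID.fullmatch(paper.get("id", "")):
            problems.append(f"bad paper id: {paper.get('id')!r}")
    return problems

print(check_ids(SAMPLE))
```

A check like this only covers the identifier conventions; it does not replace validation against the schema.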

Text

Text fields (<title>, <author>, etc.) are written in Unicode (UTF-8). The following elements are currently allowed for formatting:

  • <tex-math>: math formulas, coded using TeX (equivalent to TeX $...$). For example: An <tex-math>O(n^3)</tex-math> Algorithm for Parsing Context-Free Grammars.
  • <url>: a URL
  • <i>: italics
  • <b>: boldface
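To make the formatting elements concrete, here is a minimal sketch that renders a text field as a LaTeX string. The mapping in WRAPPERS and the function name to_latex are assumptions made for illustration, not the Anthology's own conversion code, and the sketch does not escape LaTeX special characters in ordinary text.

```python
import xml.etree.ElementTree as ET

# Assumed mapping from formatting elements to LaTeX markup (illustrative only).
WRAPPERS = {
    "tex-math": ("$", "$"),
    "url": (r"\url{", "}"),
    "i": (r"\textit{", "}"),
    "b": (r"\textbf{", "}"),
}

def to_latex(elem):
    """Recursively render the contents of a text field as a LaTeX string."""
    parts = [elem.text or ""]
    for child in elem:
        # Unknown elements are rendered without any wrapper.
        before, after = WRAPPERS.get(child.tag, ("", ""))
        parts.append(before + to_latex(child) + after)
        # Text following the child's closing tag belongs to the parent field.
        parts.append(child.tail or "")
    return "".join(parts)

title = ET.fromstring(
    "<title>An <tex-math>O(n^3)</tex-math> Algorithm for Parsing "
    "Context-Free Grammars</title>"
)
print(to_latex(title))
```

The recursion handles nested formatting (for example, <i> inside <b>) because each child is rendered before its wrapper is applied.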

Title

The title should be written in title case, although the Anthology doesn't currently have strict rules about the details of title-casing. It should follow the guidelines above for text. Additionally, characters whose case should be preserved even when a bibliography style uppercases or lowercases the title should be placed inside a <fixed-case> element (this serves the same purpose as curly braces in BibTeX). For example:

<title>The <fixed-case>ACL</fixed-case> <fixed-case>A</fixed-case>nthology: Current State and Future Directions</title>
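The correspondence with BibTeX braces can be sketched as follows. The function name title_to_bibtex is hypothetical, and this simplified version only handles <fixed-case>; any other formatting elements are flattened to their text.

```python
import xml.etree.ElementTree as ET

def title_to_bibtex(elem):
    """Render a <title> as a BibTeX title string, bracing <fixed-case> spans."""
    parts = [elem.text or ""]
    for child in elem:
        inner = title_to_bibtex(child)
        if child.tag == "fixed-case":
            # Curly braces protect the enclosed characters from case changes.
            inner = "{" + inner + "}"
        parts.append(inner)
        parts.append(child.tail or "")
    return "".join(parts)

title = ET.fromstring(
    "<title>The <fixed-case>ACL</fixed-case> <fixed-case>A</fixed-case>nthology: "
    "Current State and Future Directions</title>"
)
print(title_to_bibtex(title))
# The {ACL} {A}nthology: Current State and Future Directions
```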

Authors and Editors
