A Context-Free Markup Language for Semi-Structured Text

Date Added: Jul 2009
Format: PDF

An ad hoc data format is any non-standard, semi-structured data format for which robust data processing tools are not available. In this paper, the authors present ANNE, a new kind of markup language designed to help users generate documentation and data processing tools for ad hoc text data. Given a new ad hoc data source, an ANNE programmer edits the document to add a number of simple annotations that specify its syntactic structure. Annotations include elements that specify constants, optional data, alternatives, enumerations, sequences, tabular data, and recursive patterns.
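The abstract does not show ANNE's concrete annotation syntax, so the following is only a rough sketch of how the listed annotation kinds could correspond to parsing constructs. It uses small Python parser combinators; every name here (`const`, `opt`, `alt`, `enum`, `seq`, `sep_by`, `lazy`, and the sample log format) is a hypothetical illustration, not part of ANNE.

```python
import re

# Hypothetical sketch: each parser takes (text, pos) and returns
# (value, new_pos) on success or None on failure. Not ANNE syntax.

def const(s):                      # constant: a fixed literal string
    def p(text, pos):
        return (s, pos + len(s)) if text.startswith(s, pos) else None
    return p

def regex(pattern):                # helper for base fields such as numbers
    rx = re.compile(pattern)
    def p(text, pos):
        m = rx.match(text, pos)
        return (m.group(0), m.end()) if m else None
    return p

def opt(inner):                    # optional data: yields None when absent
    def p(text, pos):
        r = inner(text, pos)
        return r if r is not None else (None, pos)
    return p

def alt(*choices):                 # alternatives: first choice that matches
    def p(text, pos):
        for choice in choices:
            r = choice(text, pos)
            if r is not None:
                return r
        return None
    return p

def enum(*words):                  # enumeration: one of a fixed set of literals
    return alt(*(const(w) for w in words))

def seq(*parts):                   # sequence: every part, in order
    def p(text, pos):
        values = []
        for part in parts:
            r = part(text, pos)
            if r is None:
                return None
            value, pos = r
            values.append(value)
        return (values, pos)
    return p

def sep_by(item, sep):             # one or more items split by a separator
    def p(text, pos):
        r = item(text, pos)
        if r is None:
            return None
        value, pos = r
        values = [value]
        while True:
            s = sep(text, pos)
            nxt = item(text, s[1]) if s is not None else None
            if nxt is None:
                return (values, pos)
            value, pos = nxt
            values.append(value)
    return p

def lazy(thunk):                   # recursive pattern: resolve the parser late
    def p(text, pos):
        return thunk()(text, pos)
    return p

# An invented log line: date, severity level, optional error code.
log_line = seq(
    regex(r"\d{4}-\d{2}-\d{2}"), const(" "),
    enum("INFO", "WARN", "ERROR"),
    opt(seq(const(" code="), regex(r"\d+"))),
)

# Tabular data: rows separated by newlines, cells separated by commas.
table = sep_by(sep_by(regex(r"[^,\n]+"), const(",")), const("\n"))

# Recursive pattern: arbitrarily nested parentheses around an "x".
nested = alt(seq(const("("), lazy(lambda: nested), const(")")), const("x"))
```

Applying `log_line("2009-07-14 ERROR code=42", 0)` returns the matched fields together with the position where matching stopped. Note that `enum` tries its literals in order, so when one word is a prefix of another, the longer one should be listed first.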