Date Added: Jun 2013
False positives are a persistent problem in anomaly-based intrusion detection systems. To counter this issue, the authors discuss anomaly detection for the eXtensible Markup Language (XML) from a language-theoretic view. They argue that many XML-based attacks target the syntactic level, i.e., the tree structure or element content, and that syntax validation of XML documents therefore reduces the attack surface. XML offers so-called schemas for validation, but in the real world, schemas are often unavailable, ignored, or too general. In this work-in-progress paper, they describe a grammatical inference approach that learns an automaton from example XML documents in order to detect documents with anomalous syntax.
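
To make the idea concrete, the following is a minimal sketch of learning a structural model from example documents and flagging documents that deviate from it. It is not the authors' algorithm: as a stand-in, it assumes a simple 2-testable (bigram) model of child-element order per element name, and the names `learn`, `is_anomalous`, `START`, and `END` are hypothetical. It ignores attributes, text content, and namespaces.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical boundary markers for the start and end of a child sequence.
START, END = "^", "$"


def child_bigrams(elem):
    """Return adjacent (previous, next) pairs over the child tag sequence."""
    seq = [START] + [child.tag for child in elem] + [END]
    return list(zip(seq, seq[1:]))


def learn(examples):
    """Infer allowed root tags and, per element name, allowed child bigrams."""
    roots = set()
    model = defaultdict(set)
    for text in examples:
        root = ET.fromstring(text)
        roots.add(root.tag)
        for elem in root.iter():  # includes the root element itself
            model[elem.tag].update(child_bigrams(elem))
    return roots, dict(model)


def is_anomalous(text, roots, model):
    """Flag documents that are malformed or structurally unseen in training."""
    try:
        root = ET.fromstring(text)
    except ET.ParseError:
        return True  # not well-formed XML
    if root.tag not in roots:
        return True
    for elem in root.iter():
        allowed = model.get(elem.tag)
        if allowed is None:
            return True  # element name never observed
        if any(pair not in allowed for pair in child_bigrams(elem)):
            return True  # child order never observed for this element
    return False


examples = [
    "<order><id/><item/></order>",
    "<order><id/><item/><item/></order>",
]
roots, model = learn(examples)
```

With this model, `is_anomalous("<order><id/><item/><item/><item/></order>", roots, model)` is `False`, because the learned transitions generalize to longer item lists, while a document with an unseen nested element such as `<order><id/><item><script/></item></order>` is flagged. A bigram model is deliberately coarse; it only illustrates how a syntax model can be inferred when no schema is available.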